Malware Detection: Measuring Recall to Catch Them All

At Wordfence, we take performance seriously on all levels. While speed is one way to measure performance, there are other metrics that are equally important. Over the past year, our Threat Intelligence team has improved our malware scan by leaps and bounds. We wanted to share some of the metrics we use and what they mean for our customers. We’ll also take a brief look at the new Jetpack Scan and see how it compares.

Measuring Recall to catch them all

Wordfence currently has more than 1.5 million malware samples on file, ranging from backdoors and shells to SEO spam. At any given time we use several thousand “signatures” to detect these malicious files. A signature is a pattern that is used to algorithmically match malware. Our signatures use regular expressions that are optimized to be highly performant and compatible with a range of platforms, from PHP regex to server-based pattern engines.
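To make the idea of a signature concrete, here is a minimal sketch of regex-based matching in Python. The two patterns and their names are hypothetical examples written for this illustration. They are not Wordfence signatures, and a production scanner would use far more patterns and tuning.

```python
import re

# Hypothetical signatures for illustration only; not Wordfence's actual patterns.
SIGNATURES = {
    # eval() applied directly to base64-decoded data
    "php-eval-base64": re.compile(rb"eval\s*\(\s*base64_decode\s*\(", re.IGNORECASE),
    # calling a function whose name is taken from a request parameter
    "request-var-call": re.compile(rb"\$_(?:GET|POST|REQUEST)\[[^\]]+\]\s*\("),
}


def match_signatures(content: bytes) -> list[str]:
    """Return the names of all signatures whose pattern occurs in content."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(content)]


print(match_signatures(b"<?php eval(base64_decode('ZWNobyAxOw==')); ?>"))
print(match_signatures(b"<?php echo 'hello'; ?>"))
```

The patterns are compiled against bytes rather than text so that files with unknown or mixed encodings can be scanned without a decoding step.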

One of the most important metrics we use is Recall. Simply put, Recall is the percentage of known malicious files that are detected by our signatures.
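In standard classification terms, recall is true positives divided by the sum of true positives and false negatives. The sketch below shows that calculation; the function name and the example counts are assumptions chosen for illustration, not figures from our data set.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of known malicious files that were detected.

    true_positives: malicious files flagged by at least one signature.
    false_negatives: malicious files that no signature flagged.
    """
    total_malicious = true_positives + false_negatives
    if total_malicious == 0:
        raise ValueError("recall is undefined without any malicious samples")
    return true_positives / total_malicious


# Hypothetical counts: 985 of 1,000 known malicious files detected.
print(f"{recall(985, 15):.1%}")
```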

Over the past year we have continuously improved this recall rate, and currently our signatures detect over 98% of known malicious content. Our Threat Intelligence team is constantly adding to and improving on these signatures, in order to detect emerging threats.

Keeping False Positives low

It’s just as important to make sure that our scanner doesn’t mistake legitimate software for malware. This is a situation known as a False Positive. We have a number of ways to prevent our customers from experiencing False Positives, and very few customers will ever see one.

Preventing False Positives starts with the malware detection signatures themselves, and we test them against millions of examples of legitimate web application code, ranging from popular plugins to home-brewed contact forms.

Over the past year we have reduced the False Positive rate of our malware signatures from 1.1% to less than 0.03%. That is, our signatures now mistakenly flag fewer than 3 in 10,000 known good files.
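The False Positive rate is the share of known good files that get flagged. A minimal sketch of the calculation follows; the function name is hypothetical, and the counts are chosen only to reproduce the two percentages quoted above on a 10,000-file basis.

```python
def false_positive_rate(flagged_good_files: int, total_good_files: int) -> float:
    """Fraction of known good files that signatures mistakenly flag."""
    if total_good_files <= 0:
        raise ValueError("total_good_files must be positive")
    return flagged_good_files / total_good_files


# Illustrative counts per 10,000 known good files.
before = false_positive_rate(110, 10_000)  # corresponds to 1.1%
after = false_positive_rate(3, 10_000)     # corresponds to 0.03%
print(f"{before:.1%} -> {after:.2%}")
```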

Speed is still critical

We measure each malware detection signature to ensure it doesn’t slow down the Wordfence scanner, replacing slow signatures with faster ones that detect even more malware. Over the past year, the combined speed of our malware detection signatures has increased by over 70%, which means Wordfence can scan your site faster than ever before.
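One way to measure per-signature cost is to time each pattern against a fixed corpus and look for outliers. The sketch below assumes signatures are compiled Python regular expressions held in a dict. The function name, patterns, and corpus are hypothetical, and this is not the benchmark harness we use internally.

```python
import re
import time


def time_signatures(signatures, corpus, repeats=5):
    """Return the best-of-`repeats` wall-clock seconds each pattern needs to scan corpus."""
    timings = {}
    for name, pattern in signatures.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            for content in corpus:
                pattern.search(content)
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return timings


# Hypothetical signatures and a tiny stand-in corpus.
signatures = {
    "eval-call": re.compile(rb"eval\s*\("),
    "base64-decode": re.compile(rb"base64_decode"),
}
corpus = [b"<?php echo 'hello'; ?>" * 1000, b"<?php eval (base64_decode($x)); ?>"]

timings = time_signatures(signatures, corpus)
slowest = max(timings, key=timings.get)
print(f"slowest signature: {slowest} ({timings[slowest]:.6f}s)")
```

Taking the best of several runs reduces noise from other activity on the machine, which makes it easier to compare a candidate replacement signature against the one it would replace.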

How the competition measures up

We recently had an opportunity to test the performance of the new Jetpack Scan. Similar to the VaultPress scan that was previously available as a Jetpack upgrade, Jetpack Scan uploads a helper file to your site and downloads any files it finds to Jetpack’s servers in order to scan them for malware.

For each of our malware detection signatures, we placed a malicious file detected by that particular signature in the wp-content/uploads folder of a test server. This gave us a total of 2,982 malware samples, drawn from our collection of over 1.5 million. After running Jetpack Scan, we found that it detected 342 of the 2,982 samples, a recall rate of 11.5%.
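The recall figure follows directly from the two counts in the test. A short worked check, with variable names chosen for illustration:

```python
samples_total = 2982     # malware samples placed on the test server
samples_detected = 342   # samples flagged by Jetpack Scan in our test

recall_rate = samples_detected / samples_total
samples_missed = samples_total - samples_detected

print(f"recall: {recall_rate:.1%}")
print(f"missed: {samples_missed} of {samples_total}")
```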

It’s worth noting that our scan had a 100% recall rate on these samples, but only because we selected samples that our signatures already detect. This is selection bias, and it favors our scanner in this comparison. While our collection of 1.5 million malware samples is substantial, we do not know the size of Jetpack’s own malware collection or how their scanner’s recall rate compares against it. It is possible that their collection contains malware we don’t know about, and that their scanner has a very high detection rate for malware that is simply not on our radar.

Much of our own malware database comes from our site cleaning business, where we analyze sites that have been hacked, and where we collect indicators of compromise, including malware samples, malicious IP addresses, and malicious domains. Having a constant flow of customers that have encountered a real-world intrusion provides us with a continuous portrait of emerging threats.

We were unable to test Jetpack Scan’s False Positive rate since this would have required a much larger number of files to be scanned.

Conclusion

In this article, we have covered some of the ways we measure malware scan performance, how our performance has changed over time, and how Jetpack Scan compares. Knowing how to measure performance is important, but it is only the first step. The large and rapid improvements we have made in malware detection have been the result of a concerted and ongoing effort by the Wordfence team. We improved because we made it a priority, and it is a priority because we have a team that is dedicated to making WordPress safer.

This article was written by Ramuel Gall, a former Wordfence Senior Security Researcher.

Comments

  • Have you ever considered releasing a CLI variant of the malware scanner as a product? This would be interesting for us as a webhosting provider, so our support techs can analyze potentially hacked WordPress sites faster, without the need to change anything on the customers installation. Gravityscan used to fulfill that need for one-time assessments.

    • Hi Marc,

      We don't have anything like this on the roadmap as far as I'm aware, but we do try to keep track of feature requests. I'll let our developers know there's interest in this kind of functionality.

      Thanks!