WordPress Security: Remote Scanning vs Source Code Scanning

After chatting with old and new friends at WordCamp San Francisco over the weekend about WordPress security, I realized there’s some confusion about the real value of scanning your website source code vs remote scanning for infections on your website. So I’ve put together a quick post on some of the differences to help you improve your WordPress security as a whole.

Wordfence scans your website source code. We can do this because you install our WordPress security plugin on your WordPress website and we execute PHP code in your native hosting environment to do the scan. This service is completely free and the Wordfence plugin that does the scan is open source, just like WordPress itself. We also don’t try to “upsell” you during the process – we simply do the scan and present the data.

This allows us to take a deep look at every piece of source code on your website. We can even examine code that is not part of your WordPress site but is on your web server – for example if you enable the option in Wordfence to “Scan files outside your WordPress installation”.

But we go further than that – we can even treat image files as if they are PHP executable files and do a deep scan on those if you enable the option to “Scan image files as if they were executable”. This lets us catch those nasty infections that hide executable source code in files named to appear as image files.
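To make the idea concrete, here is a minimal Python sketch of checking image-named files for embedded PHP. It is not Wordfence's actual implementation (the plugin itself runs as PHP); the extension list, the `PHP_MARKER` pattern and the function name `find_suspicious_images` are assumptions chosen purely for illustration.

```python
import os
import re

# Assumption: illustrative extension list and pattern, not Wordfence's real rules.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}
PHP_MARKER = re.compile(rb"<\?php", re.IGNORECASE)


def find_suspicious_images(root):
    """Return sorted paths of image-named files whose bytes contain a PHP open tag."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            extension = os.path.splitext(name)[1].lower()
            if extension not in IMAGE_EXTENSIONS:
                continue  # only look at files that claim to be images
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                if PHP_MARKER.search(handle.read()):
                    hits.append(path)
    return sorted(hits)
```

A check like this needs file system access to the server, so it can open files that are never linked from any page.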

We also have full access to your database so we can scan database table structure and table contents. Some of our customer sites have over a quarter million approved comments in their WordPress database, so this lets us rapidly scan those comments for malicious code. Imagine doing that by accessing every page that contains those comments – the server would have to spend the resources to render an entire page for every request.
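As a rough illustration of why direct database access is cheaper than fetching rendered pages, the Python sketch below scans comment rows with a single query. It uses an SQLite connection as a stand-in for the MySQL database behind WordPress; the table and column names follow the standard `wp_comments` schema, while the `SUSPICIOUS` pattern and the function name `scan_comments` are assumptions for illustration, not what Wordfence actually runs.

```python
import re
import sqlite3

# Assumption: a deliberately tiny pattern; a real scanner uses far richer signatures.
SUSPICIOUS = re.compile(r"<script\b|<iframe\b|javascript:", re.IGNORECASE)


def scan_comments(conn):
    """Return IDs of approved comments whose content matches the pattern."""
    cursor = conn.execute(
        "SELECT comment_ID, comment_content FROM wp_comments "
        "WHERE comment_approved = '1' ORDER BY comment_ID"
    )
    return [cid for cid, content in cursor if SUSPICIOUS.search(content or "")]
```

One query walks every approved comment without rendering a single page.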

If you’ve watched the video on our home page, you’ve probably heard the narrator mention “remote scanners are better than nothing”. I asked our producer to put that in there because I’ve always taken issue with remote scanners.

Let’s use a metaphor: Imagine you ask someone to check your home for a rat infestation. They arrive at your house, but they don’t get out of their car. They’re parked on the other side of the street and they’re examining your front door, front garden, porch, the walls on the front of your home, parts of the basement windows that they can see. When they don’t find anything they honk the horn, shout out the car window “Yo, your home is clean” and drive off. Doesn’t sound very effective, does it?

Well, that’s the moral equivalent of what remote scanners are doing when they scan your website. They are only seeing what all other visitors see “from the street”. In other words, remote scanners are only able to see the final rendered version of your website, which appears as HTML, JavaScript and CSS. They have zero visibility into the source code that actually generates that website or the database that stores the data that is used to generate your website pages.

Remote scanners don’t even know what your internal directory structure looks like, so to find all the pages on your site they have to do a Googlebot-like crawl of your site. This generates a large amount of load on sites with many pages, and even after they’ve done the crawl there is no way for them to be sure they have scanned every page and URL on your site. They can’t be sure they have scanned every comment. They don’t know if they got every post. And they definitely didn’t take a look at any of your PHP or other executable server source code.
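The coverage problem can be shown with a toy model. The Python sketch below treats a site as a hypothetical dictionary mapping each page to the pages it links to, then performs the kind of breadth-first crawl a remote scanner has to rely on. The page names and the function name `crawl` are made up for illustration.

```python
from collections import deque


def crawl(links, start):
    """Return the set of pages reachable from `start` by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: the "/hidden/..." pages exist on the server
# but nothing reachable from the home page links to them.
site = {
    "/": ["/about", "/blog"],
    "/blog": ["/blog/post-1"],
    "/hidden/spam": ["/hidden/more-spam"],
}
reachable = crawl(site, "/")
```

Here `reachable` never includes the `/hidden/...` pages, because a crawler can only discover what is linked from something it has already seen.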

If a remote scanner does not generate a large number of page requests, then it’s probably only doing a very simplistic scan like taking a quick look at your home page and any included code for infections.

Because Wordfence is able to examine server source code we do some pretty cool stuff like compare your core, theme and plugin files to what exists in the official WordPress repository and tell you what has changed. Then we let you do a “diff” to actually see a syntactically highlighted visual of what the changes in each file are.
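A simplified version of that verify-then-diff idea might look like the Python sketch below. It assumes you already have reference hashes and the original text of each official file on hand; how Wordfence actually obtains and compares them is not described here, and the function names `sha256_of`, `changed_files` and `show_diff` are hypothetical.

```python
import difflib
import hashlib


def sha256_of(text):
    """Hex SHA-256 digest of a file's text content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_files(installed, reference_hashes):
    """Return sorted paths whose installed content differs from the reference hash.

    installed: dict mapping path -> file content on the server
    reference_hashes: dict mapping path -> hex digest of the official file
    """
    changed = []
    for path, content in installed.items():
        expected = reference_hashes.get(path)
        if expected is not None and sha256_of(content) != expected:
            changed.append(path)
    return sorted(changed)


def show_diff(original, installed, path):
    """Return a unified diff between the official and installed versions."""
    return "".join(
        difflib.unified_diff(
            original.splitlines(keepends=True),
            installed.splitlines(keepends=True),
            fromfile="official/" + path,
            tofile="installed/" + path,
        )
    )
```

Hashing finds which files changed cheaply; the diff is only produced for the files that fail the hash check.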

Then we go on to scan for malware, malicious URLs, known infection heuristics and much more.

Wordfence is the only service that is designed specifically for WordPress security and that does a complete scan on all your theme, plugin and core files and does a deep scan on all other files for malware and infections. We’ve been providing core, theme and plugin verification for over 3 years since our 1.1 release (Current version is 5.2.7) and we’ve learned a lot about efficient scanning and WordPress security and made many improvements to make our scan faster and more accurate.

I hope this short description of the differences between remote and source code scanning has helped you gain a better understanding of how to verify that your site does not have an infection and to improve your WordPress security.

Regards,

Mark Maunder – Wordfence Founder.

 


Comments

6 Comments
  • Good to point out these features, as I tend to overlook them. This has been and is still one of the best security plugins for WP. I have run all sorts of other security plugins and have still gotten hacked. Wordfence + Cloudflare keeps my sites safe. I will start using the on site scan.

    thanks

  • I actually feel safer and see more clearly the benefits to source code scanning and the value of WordFence - Thank you!

    I have one question about the scan option you mentioned for files outside the WordPress installation. What if I have several WordPress installations in a single hosting account, each with a copy of WordFence, and each set to scan files outside the WordPress installation? Would this mean that each instance of WordFence is scanning all of the other copies of WordPress? If so, is it your recommendation to disable this feature? It seems like it would be a considerable resource drain if, say, ten copies of WordFence were each scanning its own plus nine other copies of WordPress.

    Again, thanks, and I've enjoyed your community communication over the time I've had WordFence installed.

    • Hi Tom,

      Short answer: Yes disable it if you already have Wordfence running on the other sites.

      Longer answer: We don't do core, theme and plugin verification on those other directories. We are scanning them for malware, bad URLs, known virus/infection signatures and so on. So we treat them as if they are not WordPress installs, but simply more files in your existing installation. So the CPU usage is less, but if you have a large site it can be significant.

      Also we've run into issues where the scanner will fall down a rabbit hole following a circular symbolic link or will scan device driver directories which can be problematic. So unless you're sure you need that feature I'd leave it disabled. The only time we usually recommend it is if you're already infected and are trying to clean your site.

      Regards,

      Mark.

      • My main site (see above) runs from the /public_html/ root and I have 5 other sites running from sub_directories - would all six sites be effectively scanned by enabling 'scan outside wordpress folders'?

        At the moment I have a copy of Wordfence running in each with the 'scan outside' option disabled. Each of these sites is running as an individual domain and not as part of a WordPress multi-site installation. (I have often thought about using multi-site but, frankly, I am nervous about making the transition on sites that are live and stable - the don't fix what ain't broken adage!).

        Great product by the way!

  • I was able to understand what you just wrote. Too many other sites are too techie for me, but your English is understandable. Thanks for this explanation.

  • This post makes me wonder why remote scans are included as a Wordfence premium feature?

    I have made sort of a hobby out of hunting down and reporting pharma hacks. At first I was running into some webmasters who would email me telling me they ran an external scan and that their site was fine. I found I had to send them proof that the hack was there.

    I would bet that hackers test their hacks on external scanners to be sure they cannot be detected.

    I've tested hundreds of sites on external scanners. They catch less than 20% of the hacks. When they do catch a hack it is usually because the site was blacklisted or had been hacked more than once. All of the hacks that I have looked at have what is basically an independent site hidden within the real site, with no linking between it and the real site. Crawling the site will not expose the hidden site.

    For Black Hat SEO hacks like the various Pharma hacks, I doubt there is anything better than Google's "site" operator combined with spammy words to begin your search for hacks. After locating the URLs created by the hack in Google search results you can look at Google's cache of these pages, or you can use a browser whose user agent can be set to Googlebot to see the pages on the site.

    This still does not show you the code that redirects pages, the code that creates back doors, or the code that automatically recreates the hack after it is removed.

    I do wonder if your internal scanner checks the .htaccess file? Many of the hacks that I have seen have added code to the .htaccess file.