Removing Spam Pages From WordPress Sites
The WordPress Security Learning Center

3.3: Removing Spam Pages From WordPress Sites

Updated March 8, 2018

What is a Spam Page?

Spam pages are files added to your publicly available web site with the intent of manipulating search engine result pages. The more inbound links a target web site receives, the higher it tends to place in the search results, and inbound links from sites with a high reputation ranking are even more valuable. Sites with older domain names, sites on .edu top-level domains, and highly popular sites that are updated often are the most desirable hosts for these links. As such, attackers often use these sites to build complex link networks for highly competitive search phrases.

Determining if your site is infected

Site owners often do not know their site is infected with spam pages until search results for their site start showing odd entries that are unrelated to the site’s content. An examination of the site and the search engine result pages will show whether spam pages have been placed within the site files or the database. Spam pages can exist as standalone files or as posts or pages added to your database.

Finding and Removing Spam Pages

Before you change anything, create a backup of the site files and the database. Removing spam pages requires analysis of all publicly available files on the web server. Because spam pages can also be placed in your database, you will need to review both the files and the database to find and remove them.

Spam pages may be related to any number of highly competitive search niches including:

  • Pharmaceutical sales
  • Essay writing sites
  • Ringtones and music downloads
  • Movie downloads
  • Online casino or gambling
  • Fraudulent/replica designer sales
  • Weight loss supplements or products
  • Adult content

Stand Alone Spam Pages

Look for unfamiliar directories either outside of your content management system or hidden within subfolders of an administrative directory. Determine which files are not relevant to your site and remove them. Directory or folder names can appear to be functional, like “headers” or “stats,” or they can be nonsensical, like “a3051.” The files may be plain HTML files, or they may be obfuscated (intentionally obscured to make the code hard to read) and look like gibberish. There are usually thousands of files.
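As a starting point for that review, the following is a minimal Python sketch that lists top-level directories that are not part of a standard WordPress layout and directories that hold an unusually large number of HTML files. The directory names in KNOWN_TOP_LEVEL and the threshold of 100 files are assumptions for illustration, not fixed rules. Adjust both for your own site, and review every result by hand before deleting anything.

```python
import os
from collections import Counter

# Assumption: a standard WordPress layout. Add any directories you created.
KNOWN_TOP_LEVEL = {"wp-admin", "wp-content", "wp-includes"}
# Hypothetical cut-off: report any directory holding at least this many HTML files.
HTML_FILE_THRESHOLD = 100


def find_suspect_directories(site_root, threshold=HTML_FILE_THRESHOLD):
    """Return (unfamiliar top-level directories, directories crowded with HTML)."""
    # Top-level directories that are not in the known WordPress set.
    unfamiliar = sorted(
        entry.name
        for entry in os.scandir(site_root)
        if entry.is_dir() and entry.name not in KNOWN_TOP_LEVEL
    )

    # Count .html/.htm files per directory anywhere under the site root.
    html_counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(site_root):
        for name in filenames:
            if name.lower().endswith((".html", ".htm")):
                html_counts[dirpath] += 1

    crowded = sorted(
        (path, count) for path, count in html_counts.items() if count >= threshold
    )
    return unfamiliar, crowded
```

Calling find_suspect_directories with the path to your web root returns two lists to inspect. The sketch only reports candidates. It does not delete files, and it will not catch obfuscated spam stored under other file extensions.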

Htaccess file review

Most sites running on Apache have an .htaccess file that gives the server directives on which pages to serve to the site visitor. If you have spam pages on your site, there may be code inserted in the .htaccess file that directs site visitors to certain pages based on the contents of the query string. The query string is the part of a URL that follows the question mark, for example id=123 in index.php?id=123.

When such directives are in place, they send search engines to specific spam pages instead of directing the user to site content. Removing query string directive spam can be challenging, because you must remove the directives from the .htaccess file and then add code to .htaccess that tells the search engines that those pages are now gone. This can be complex, and if you are not comfortable working with regular expressions and .htaccess directives, we recommend getting help with this.

Here are some examples of spam within htaccess files.

RewriteRule . - [E=REWRITEBASE:/]
RewriteRule ^b(\d+)[-/].*[-/]p(\d+)-.*$ index\.php?id=$1-$2&%{QUERY_STRING} [L]
RewriteRule ^b(\d+)[-/]p(\d+)[-/].*$ index\.php?id=$1-$2&%{QUERY_STRING} [L]
RewriteRule ^p(\d+)[-/].*[-/]b(\d+)[-/].*$ index\.php?id=$2-$1&%{QUERY_STRING} [L]
RewriteRule ^p(\d+)[-/]b(\d+)[-/].*$ index\.php?id=$2-$1&%{QUERY_STRING} [L]
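To help spot rules like these in a long .htaccess file, here is a minimal Python sketch that flags RewriteRule lines forwarding captured values to index.php?id=, as the examples above do. The pattern is a heuristic based only on those examples and is an assumption, not a complete signature. Spam rules can take other forms, and a legitimate plugin could in principle add a similar rule, so treat the output as a list of lines to review.

```python
import re

# Heuristic (assumption): a RewriteRule whose target is index.php?id=... and
# which passes a captured group such as $1 into the query string.
SUSPECT_RULE = re.compile(
    r"^\s*RewriteRule\s+\S+\s+\S*index\\?\.php\?id=\S*\$\d", re.IGNORECASE
)


def flag_suspect_rules(htaccess_text):
    """Return (line_number, line) pairs that match the heuristic."""
    return [
        (number, line.strip())
        for number, line in enumerate(htaccess_text.splitlines(), start=1)
        if SUSPECT_RULE.search(line)
    ]
```

Read the contents of your .htaccess file into a string and pass it to flag_suspect_rules. The default WordPress rewrite rules do not pass captured groups to index.php?id=, so they are not flagged by this pattern.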

Spam Pages within the Database

Spam pages within the database are usually fairly easy to remove. They are added as either pages or posts in the content management system database. You can often see them as pages or posts that you did not add, so you can easily delete them. They are most often added by a compromised administrative account, and your content management system will often tell you which account was used to add the spam pages, and thus which account was compromised.
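If the site has many posts, a rough keyword filter can narrow the list before you review it by hand. The Python sketch below assumes you have already exported your posts into a list of dictionaries with id, title, and author keys. That shape, the function name, and the keyword list are all hypothetical and chosen for illustration. It groups suspicious post IDs by author, which can point to the account that may have been compromised.

```python
from collections import defaultdict

# Illustrative keywords drawn from the spam niches listed earlier; extend as needed.
SPAM_KEYWORDS = ("casino", "ringtone", "essay writing", "replica", "weight loss")


def group_suspect_posts_by_author(posts, keywords=SPAM_KEYWORDS):
    """Group IDs of posts whose titles contain a spam keyword, keyed by author.

    posts: iterable of dicts with 'id', 'title', and 'author' keys
    (a hypothetical export format, not a WordPress API).
    """
    suspects = defaultdict(list)
    for post in posts:
        title = post["title"].lower()
        if any(keyword in title for keyword in keywords):
            suspects[post["author"]].append(post["id"])
    return dict(suspects)
```

A title match is only a hint. A legitimate post can mention one of these words, and spam can use terms that are not in the list, so confirm each result before deleting it.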

Working with the search engines

Removing the pages from your site is not enough. The search engine result pages often lag behind your site’s content by days, or sometimes weeks, depending on how often the search engine bots come by to visit your site.

Clearing spam pages from the search engines quickly, notably Google, requires Google Search Console. After you have cleared all of the spam pages from your site, add your site to Google Search Console. Add a sitemap to your site, submit that sitemap via Search Console, and perform a fetch on the spam pages to ensure that they respond with a 404 (not found). You can then submit your site to be indexed. It is then a waiting game. The sitemap and updated content are most helpful in ensuring that your search results return to normal.
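You can also check the response codes yourself before waiting on the search engine. The following Python sketch, which uses only the standard library, requests each known spam URL and reports any that do not answer with 404 (not found) or 410 (gone). The function names are illustrative, and the use of HEAD requests is an assumption. Some servers handle HEAD differently from GET, so confirm any surprising result in a browser.

```python
import urllib.error
import urllib.request

# Status codes that tell a search engine the page no longer exists.
REMOVED_STATUSES = {404, 410}


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a HEAD request to url."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is what we want.
        return error.code


def urls_still_live(urls, get_status=fetch_status):
    """Return the URLs whose status is not in REMOVED_STATUSES."""
    return [url for url in urls if get_status(url) not in REMOVED_STATUSES]
```

Pass urls_still_live a list of the spam URLs you found in the search results. Because urllib follows redirects, a spam URL that redirects to your home page is reported as still live, which is useful to know since a redirect does not tell the search engine that the page is gone.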

Looking Beyond the Spam Page

Spam pages are placed on the site through exploitation of some vulnerability on the site, either through backdoors, unpatched site code, or compromised administrative, FTP, or other accounts.

If you find spam pages on your site, it is important to determine how those pages were placed. There may be other types of malware or security vulnerabilities on your site that allowed an attacker to gain access. A review of the entire site is important.

If, after reading this guide, you are unsure how to remove spam pages, want more answers as to how the spam pages were placed on your site, or need assistance ensuring that all spam results are removed from the search engine result pages, get help.
