Introduction to Writing Secure PHP Code
The WordPress Security Learning Center
2.1: Introduction to Writing Secure PHP Code

Advanced
Updated June 25, 2018

If you write enough code, you will accidentally write a vulnerability at some point in your career as a developer. This section of the Wordfence Learning Center is designed to help you, whether you are a beginner or an advanced developer, reduce the probability that you will release a vulnerability into production.

We will do this by starting with a conceptual overview of PHP security. We will then explore types of vulnerabilities with examples. In later sections we will provide you with a few tools to help you detect vulnerabilities in your code. We recommend you go through each article in order to build a solid foundational knowledge that you can build on as you progress. The time you invest here will benefit you and users of your web applications for years to come by helping you create more secure PHP applications.

Much of the advice we provide is applicable to PHP development in general, but we do include functions and examples in these articles that are specific to WordPress development.

Even if you are not writing WordPress applications, we encourage you to read these articles because they will provide you with an excellent foundation for writing secure PHP code. If you later decide to join the WordPress community and write your first plugin, you will be able to produce high quality secure code that will gain you friends in the community and make your code more valuable to employers and your customers.

This introductory article will provide you with a conceptual understanding of how vulnerabilities are introduced into PHP code in general and WordPress plugins in particular.

Where and Why Vulnerabilities Appear in WordPress

The WordPress publishing platform is built around WordPress core, the content management system itself. Core consists of PHP code, HTML, JavaScript code and CSS rules. WordPress core’s code is inspected often by the open source community. As Eric Raymond says in The Cathedral and the Bazaar,

“Given enough eyeballs, all bugs are shallow”

In other words, there are so many people looking at WordPress core these days that someone is likely to notice a bug or vulnerability, report it and take credit for it.

A WordPress site also includes a theme, which is mostly HTML, CSS and JavaScript with a small amount of low-complexity PHP code. Sites don’t change their themes often once they’ve installed them, so new vulnerabilities or bugs are not likely to be introduced into a site’s theme.

With just WordPress core and a theme, you have a basic site that provides the usual functionality that WordPress includes. So far you haven’t introduced much new PHP code into the system. By adding WordPress plugins, a website gains complex new features and behaviors. WordPress plugins may include:

  • Comment spam filters like Akismet
  • E-commerce platforms like WooCommerce
  • Auction systems to turn your site into a kind of eBay
  • Security plugins like Wordfence

There are over 50,000 plugins in the WordPress official plugin repository alone with over 1 billion total downloads. They provide a huge range of functionality to WordPress sites.

WordPress plugins include a large amount of PHP code – in fact they are mostly PHP. The level of complexity of the PHP code in a plugin is also high. This makes them more likely to contain vulnerabilities.

Most vulnerabilities are introduced into WordPress via plugins. You can see this in action if you view WPVulnDB.com, which maintains a database of WordPress vulnerabilities. The database shows a handful of vulnerabilities reported every year in WordPress core and in themes, while plugins produce a constant stream of vulnerability reports.

Because plugins are by far the biggest source of new vulnerabilities in WordPress, we will spend some time in our guides on writing secure PHP code focusing on WordPress plugins and how to improve their security.

Basic Principles of Writing Secure PHP Code

Never Trust User Input

If you can memorize the above line “Never Trust User Input” and incorporate it into your daily coding practices, you are already halfway to writing more secure PHP code. The majority of vulnerabilities in PHP code are caused by a developer who did not properly mistrust user input. In other words, the developer did not include code to correctly or sufficiently sanitize some form of user input.

In the Akismet vulnerability reported in October of 2015, Akismet was not correctly sanitizing user input via comments, which led to a Cross Site Scripting vulnerability (an XSS).

In August of 2015 a vulnerability in WordPress core was discovered where WordPress core was ‘trusting’ user input to provide a valid post ID, without verifying it. A ‘subscriber’ level user could use an invalid post ID (along with a race condition) to elevate their privilege level to a higher access level. If core had been correctly checking whether a post existed before checking if a user had the correct level of access, the vulnerability could have been avoided. We encourage you to read the vulnerability disclosure in the latter case because it will give you a good idea of how closely your code may be scrutinized by a researcher. [Hint: VERY closely. This was an extremely advanced vulnerability]

The most recent 7 plugin vulnerabilities at the time of writing this are all caused by incorrectly trusting user input. They are either XSS, CSRF, RFI or SQL injection vulnerabilities. All of them come down to the application trusting something a user sent: input that was not correctly sanitized before it was used in the application, output that was not sanitized before it was sent to the web browser or, in the case of CSRF, a request whose origin was never verified.

Remember this saying: “Sanitize input early, sanitize output late”

Our applications are not very useful without user input in the form of comments, blog posts, star ratings, forms that visitors fill out and so on. When data arrives in your web application from a site visitor’s browser, you must sanitize it as soon as it arrives, or at least before any other part of the application interacts with it.

Our web applications are also much more interesting when they can share user input with other site visitors by sending it back to a web browser. We might display comments, show published posts, share the results of a survey with other site visitors and so on. All this data is stored user input that is being output to the browser. Even if you’ve sanitized this data as it arrives, it needs to be re-sanitized when it is displayed to other site visitors.

When you sanitize output, you need to sanitize as late as possible. That way you can be sure that it is not modified after it is sanitized and you only sanitize it once: right before it is sent to the web browser. That is why we invented the saying above: Sanitize input early, sanitize output late. Use this as a reminder you need to sanitize user data once as soon as it arrives and again right before it leaves.
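Here is a minimal sketch of the saying in plain PHP, so that it runs without WordPress. The helper names sanitizeCommentInput() and renderComment() are hypothetical and exist only for this illustration. In a WordPress plugin you would more likely reach for core helpers such as sanitize_text_field() on input and esc_html() on output.

```php
<?php
// Hypothetical helper: clean a comment as soon as it arrives.
function sanitizeCommentInput(string $raw): string
{
    // Remove HTML tags, then trim surrounding whitespace.
    return trim(strip_tags($raw));
}

// Hypothetical helper: escape right before the data leaves for the browser.
function renderComment(string $stored): string
{
    return '<p>' . htmlspecialchars($stored, ENT_QUOTES, 'UTF-8') . '</p>';
}

// Input arrives: sanitize early.
$stored = sanitizeCommentInput("  Nice post! <script>alert(1)</script>  ");

// ... the application stores and works with $stored ...

// Output leaves: escape late, once, at the point of printing.
echo renderComment($stored) . "\n";
```

Note that strip_tags() removes the tags but keeps the text between them, which is one reason the output step is still needed.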

Sometimes you don’t control input

Sometimes you might be receiving user input data from an API or a data feed into your application. You might be relying on another application to sanitize the data for you. Ideally you would re-sanitize any data that arrives in your application, but this is not always feasible for performance reasons or because the data is large and complex and it would require a lot of code to sanitize it.

In this case you need to remember to, at the very least, sanitize the output. It’s easy to forget that data you are receiving from somewhere other than your own application might also be user input data, and might contain malicious code.
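As a rough sketch of that fallback, the example below treats every title in a third-party feed as untrusted and escapes it at output time. The JSON payload, its field names and the renderFeedTitles() helper are all assumptions made for this example, not part of any real feed.

```php
<?php
// Hypothetical payload, standing in for the body of a third-party feed response.
$feedJson = '{"items":[{"title":"Sale <b>today</b>"},{"title":"<script>alert(1)</script>"}]}';

// Hypothetical helper: turn feed items into an HTML list, escaping every title.
function renderFeedTitles(string $json): string
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['items']) || !is_array($data['items'])) {
        return ''; // Malformed feed: output nothing rather than guess.
    }

    $html = "<ul>\n";
    foreach ($data['items'] as $item) {
        // Skip entries that do not carry a string title.
        if (!is_array($item) || !isset($item['title']) || !is_string($item['title'])) {
            continue;
        }
        // Treat the title as untrusted user input and escape it late.
        $html .= '  <li>' . htmlspecialchars($item['title'], ENT_QUOTES, 'UTF-8') . "</li>\n";
    }
    return $html . "</ul>\n";
}

echo renderFeedTitles($feedJson);
```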

Sometimes you don’t control the output

Occasionally you might receive data from users on your website and send it to an external application via, for example, a REST API that you have published. Make sure that you sanitize all user input early, as it arrives in your application before you send it out via the API.

By doing this you help keep users of your data safe. You should not assume that developers using your API are sanitizing data on their end and that you can send them raw user input.

How to Sanitize, Validate and Escape Input

In the above discussion, we use the term ‘sanitize’ or ‘sanitization’ as a general term to describe the idea of making sure that data arriving at an application is safe for the application to interact with, and data leaving (being outputted) is safe for consumption.

There are three ways to make sure data is safe:

  • Validation: Validation makes sure that you have the right kind of data. For example, you might make sure that a field specifying a number of items in a cart is an integer by using PHP’s filter_var() function with the FILTER_VALIDATE_INT filter. (PHP’s is_numeric() function is a looser check: it also accepts decimals and exponent notation such as 1.5 or 1e3.) If the check fails then you send an error back to the browser asking them for a valid integer. When you test input for valid data and return error messages to the user, that is validation.
  • Sanitization: This removes any harmful data. You might strip out <script> tags from form data. Or you might remove quotes from an HTML attribute before sending it to the browser. This is all sanitization because it removes harmful data.
  • Escaping: This takes any harmful data and makes it harmless. For example, you might escape HTML tags on output. If someone includes a <script> tag which can result in an XSS vulnerability, you might output it as &lt;script&gt; where you have escaped the less-than and greater-than signs to make it harmless.

Validation routines are normally used in a conditional statement e.g.

if(filter_var($address, FILTER_VALIDATE_EMAIL)){ 
 echo "Email is valid."; 
} else { 
 echo "Not valid."; 
}
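The same conditional pattern works for integers. The sketch below validates a cart quantity with FILTER_VALIDATE_INT and a range. The parseQuantity() helper and the 1 to 99 limits are assumptions chosen for this example. Values such as '3.5' and '1e3' are included because is_numeric() would accept them, while an integer filter does not.

```php
<?php
// Hypothetical helper: return the quantity as an int, or null if it is invalid.
function parseQuantity($raw): ?int
{
    // FILTER_VALIDATE_INT returns the integer on success and false on failure.
    $quantity = filter_var($raw, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 1, 'max_range' => 99], // Limits chosen for this example.
    ]);
    return $quantity === false ? null : $quantity;
}

foreach (['3', '3.5', '1e3', '0', 'abc'] as $raw) {
    $quantity = parseQuantity($raw);
    echo $raw . ' => ' . ($quantity === null ? 'rejected' : 'accepted as ' . $quantity) . "\n";
}
```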

Sanitization takes some data and cleans it for you, returning the clean version. e.g.

// Remove all characters from the email except letters, digits and !#$%&'*+-=?^_`{|}~@.[]
echo filter_var($dirtyAddress, FILTER_SANITIZE_EMAIL);

Escaping routines make potentially harmful data safe. They are frequently used as follows:

<?php
//Do some stuff that makes sure it's time to write data to the browser
?>
Thanks for your order. Please visit us again. You ordered <?php echo esc_html($productName); ?>.

When to Sanitize, Validate and Escape

As we mentioned above, to ensure that your code and your application users are safe, you need to make sure that your data is safe when it arrives and when it leaves. That means you need to perform checks at input and output.

At input: Validate and Sanitize

As data arrives your first step should be to validate it. Make sure integers are in fact integers and that no unusual or disallowed data is arriving in your application. The next step at input is to sanitize it and strip out anything potentially harmful. You will rarely escape data at input because your application will most likely need to work with the raw data, and you have already made it safe by validating and sanitizing.

At output: Sanitize and Escape

As data leaves your application, you need to remove any potentially harmful data again through sanitization. The reason you sanitize again on output is because a hacker may have tricked your application into creating harmful data for output, so you need to re-check that your output data is safe.

Then you need to escape the data to make sure it is suitable for whatever medium it is being output to. You may need to turn HTML tags into HTML entities to make them safe for the web browser. Or you may need to remove single and double quotes if your output is going to be used as an HTML attribute.
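To illustrate how the right escaping depends on the medium, here is a small sketch in plain PHP that escapes one value for HTML and for a URL query string. The sample value and variable names are made up for this example. WordPress ships its own context-specific helpers such as esc_html(), esc_attr() and esc_url(), which you would normally prefer inside a plugin.

```php
<?php
// Hypothetical value that originally came from a user.
$name = 'Tom "T" <O\'Neil> & Co';

// HTML context: convert markup characters to entities. With ENT_QUOTES both
// quote styles are encoded too, so the result is also safe inside a quoted attribute.
$forHtml = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

// URL query context: percent-encode the value before placing it in a link.
$forUrl = rawurlencode($name);

echo '<p>Hello ' . $forHtml . "</p>\n";
echo '<input type="text" name="name" value="' . $forHtml . '">' . "\n";
echo '<a href="/search?q=' . $forUrl . '">Search</a>' . "\n";
```

Escaping the same value twice for the same context would show visitors the entities themselves, which is another reason to escape once, as late as possible.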

Output Vectors and Vulnerabilities

Most people think of output as writing from a PHP application back to the web browser. But there are different places data leaves your application and they are closely related to the kinds of vulnerabilities that your code can introduce into an application. We discuss the different kinds of output here. We have named these ‘output vectors’ because the phrase ‘attack vectors’ is used to describe different entry points into an application. ‘Output vectors’ we feel is an appropriate term because where you write your data is closely tied to the vulnerabilities or ‘attack vectors’ you introduce into your application.

The Visitor’s Browser

The most common place data is sent to by a PHP application is a site visitor’s web browser. This is trivial to do in PHP using ‘echo’, which is a language construct rather than a function. Because it is so commonly used and so easy to do, it also introduces the most common form of vulnerability in web applications: The Cross Site Scripting, or XSS vulnerability.

Here is how easy it is to write an XSS vulnerability:

<?php
echo "You visited my URL with the following parameter: " . $_GET['value'];

If you visit a web page with the above code, and use a URL as follows:

http://example.com/?value=<script>alert('1');</script>

You will see an alert box appear. This simply proves that you can execute javascript code fed to the application. To avoid this vulnerability in a WordPress plugin, you should have done the following:

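<?php
// Validate on input: the parameter must be present and numeric.
if (isset($_GET['value']) && is_numeric($_GET['value'])) {
    // Sanitize on output: intval() reduces the value to an integer.
    echo "You visited my URL with the following parameter: " . intval($_GET['value']);
} else {
    echo "Hey. Stop trying to hack me by sending non-number values!";
    exit();
}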

As you can see, we are first validating that we received a number as it arrives in the application. Then we sanitize before output by stripping out anything that isn’t a number before sending the data back to the web browser. We will go into more detail on XSS vulnerabilities in a later section.
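The numeric check above only fits parameters that are meant to be numbers. When the parameter is free text, one possible approach is to escape it on output instead. This is a sketch in plain PHP; describeParameter() is a hypothetical helper, and in a WordPress plugin esc_html() would play the role of htmlspecialchars().

```php
<?php
// Hypothetical helper: build the message with the parameter escaped for HTML.
function describeParameter(string $value): string
{
    return 'You visited my URL with the following parameter: '
        . htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// In a real request this would come from $_GET['value'].
$value = "<script>alert('1');</script>";

// The browser now displays the tag as text instead of running it.
echo describeParameter($value) . "\n";
```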

The Database

Another place data exits your application is into the database. A database is a fully functioning application in its own right that can respond to commands from your application. If you allow a visitor to your website to send anything they want to your database, they could persuade your database to give them all of your data, which would be a disaster for your site members’ privacy.

For this reason you need to make sure that any data sent to your database is safe. The most common attack on your database is a SQL injection attack. This is a way for an attacker to send arbitrary commands to your database to either add or update data in an unauthorized way, or read data they should not have access to, like passwords or member email addresses.
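A common defense is to keep user data out of the SQL text entirely by using a prepared statement. The sketch below uses PDO with an in-memory SQLite database so that it is self-contained. The members table, its columns and the findMemberByEmail() helper are assumptions for this example. Inside WordPress, $wpdb->prepare() serves the same purpose.

```php
<?php
// Hypothetical helper: look up a member by email with a prepared statement.
// The table and column names are invented for this example.
function findMemberByEmail(PDO $db, string $email): ?array
{
    // The ? placeholder keeps the value separate from the SQL text.
    $statement = $db->prepare('SELECT id, email FROM members WHERE email = ?');
    $statement->execute([$email]);
    $row = $statement->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}

// Demo against an in-memory SQLite database, if that driver is installed.
if (class_exists('PDO') && in_array('sqlite', PDO::getAvailableDrivers(), true)) {
    $db = new PDO('sqlite::memory:');
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $db->exec('CREATE TABLE members (id INTEGER PRIMARY KEY, email TEXT)');
    $db->exec("INSERT INTO members (email) VALUES ('alice@example.com')");

    // A classic injection attempt is treated as a literal string and matches nothing.
    var_dump(findMemberByEmail($db, "' OR '1'='1"));

    // A legitimate lookup still works.
    var_dump(findMemberByEmail($db, 'alice@example.com'));
}
```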

Files

Many PHP applications and WordPress plugins write data to files. If an attacker can trick an application into writing PHP code into a file with the correct name, they can then execute that file and gain full access to your website. For this reason it’s important to make sure that data being written to a file is safe, and the filename being used is safe too.

One of the most famous vulnerabilities in WordPress was in TimThumb, an image resizing script that fetched images from the web and stored them as files on a website. An attacker could trick it into fetching a PHP file instead and storing that on the filesystem of the website. The attacker then visited the PHP file and it would execute. Using this technique, the attacker could get the website to download malicious PHP code and then execute that code.

The problem with the TimThumb vulnerability was that the application never validated and sanitized the contents of the file it was fetching. It also never sanitized and escaped the data it was writing to the file when it saved it on the website’s filesystem. Furthermore, it never made sure that the filename being used was a non-executable filename. As you can see, if the developer had been validating, sanitizing and escaping correctly at input and output, they would have had several opportunities to catch this kind of attack.
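The filename check is the easiest of those opportunities to sketch. The example below is one possible approach, not a reconstruction of how TimThumb was eventually fixed. The safeImageFilename() helper and its list of allowed extensions are assumptions for this illustration, and checking the file contents is a separate step that is not shown here.

```php
<?php
// Hypothetical helper: return a safe filename for a fetched image, or null to reject it.
function safeImageFilename(string $requested): ?string
{
    // Drop any directory components such as "../../".
    $name = basename($requested);

    // Keep only a conservative set of characters.
    $name = preg_replace('/[^A-Za-z0-9._-]/', '', $name);

    // Require exactly one dot, and not as the first character. This rejects
    // names such as "shell.php.jpg" and hidden files such as ".jpg".
    if (substr_count($name, '.') !== 1 || $name[0] === '.') {
        return null;
    }

    // Allow only known image extensions (list chosen for this example).
    $extension = strtolower(pathinfo($name, PATHINFO_EXTENSION));
    if (!in_array($extension, ['jpg', 'jpeg', 'png', 'gif'], true)) {
        return null;
    }
    return $name;
}

foreach (['photo.JPG', '../../uploads/evil.php', 'shell.php.jpg', 'my cat.png'] as $requested) {
    echo $requested . ' => ' . var_export(safeImageFilename($requested), true) . "\n";
}
```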

Shell Commands

A shell command is another data output vector in your application. It is a place where you could potentially output user-data which may allow an attacker to trick your application into executing undesirable shell commands.

It is unusual to execute shell commands from a PHP web application and in general we recommend against it. Instead use built-in PHP functions to do things like directory listings, file manipulation, text searching in files and so on. Very occasionally, shell commands are unavoidable. If you are executing a shell command, we strongly recommend against including any user data or data that has arrived from an external source.

If you absolutely must execute a shell command in PHP that involves external data, you should use very strict validation, sanitization and escaping. Functions like ‘intval()’ that strip out everything except integers are useful for sanitization in this scenario.
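For string data that cannot be reduced to an integer, PHP provides escapeshellarg(). The sketch below only builds the command string and does not run it. The buildLineCountCommand() helper and the choice of the wc command are assumptions for this example.

```php
<?php
// Hypothetical helper: build a "wc -l" command for a file path that came from outside.
function buildLineCountCommand(string $path): string
{
    // escapeshellarg() wraps the value in quotes and escapes any quotes inside it,
    // so the shell sees it as a single argument. The "--" tells wc that no more
    // options follow, so a path starting with "-" is not read as a flag.
    return 'wc -l -- ' . escapeshellarg($path);
}

// The injected "; rm -rf /" stays inside the quoted argument.
echo buildLineCountCommand('notes.txt; rm -rf /') . "\n";
```

Escaping of this kind complements strict validation of the path itself, such as checking it against a list of files the application expects. It does not replace it.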

The End of The Beginning

This brings us to the conclusion of our introduction to PHP security. We haven’t looked at much code yet. This was a conceptual introduction to help you understand how vulnerabilities are introduced into an application, how they are avoided and to which areas of your application you should be paying attention. We go into more detail in the coming sections.
