Escaping data in PHP pages

What does it mean to “escape” data that is put into a web page?

By “escape” I mean formatting the data so that it doesn’t interfere with the code or markup of the web page. For example, let’s say a user typed their first name into a text input box like so:

<b>Mary</b>

If the web page redisplays this data after the form is submitted, and it doesn’t escape it, then it will be rendered something like

<p>First Name: <b>Mary</b></p>

which will display, with “Mary” in bold, as

First Name: Mary

instead of as

First Name: <b>Mary</b>

As a web developer, you want to display to the user exactly what they provided. To do that, escape any characters that could be interpreted as part of the web page markup, like so

First Name: &lt;b&gt;Mary&lt;/b&gt;

which will render as

First Name: <b>Mary</b>

Why is it important to escape data included in a web page?

There are two main problems that can arise when data isn’t escaped in a web page. We’ve already looked at one: characters that are part of the markup or programming language (as we’ll see with JavaScript later) can break the actual markup or program code. So for the sake of correctness, data included in the web page must be escaped.

The second problem is that failure to escape data in a web page can make possible various kinds of security attacks. Keeping with our example, suppose a user entered as their first name

Mary<script>alert("Hijacked!");</script>

Displaying this directly in a web page would result in that script being executed. And the JavaScript so injected could do lots of nefarious things, such as reading a user’s cookies.

What are the different ways of escaping data?

Data can be added to a web page in several different ways, and each requires its own technique to escape it properly. I’ll cover three cases in this post:

  • HTML data
  • JavaScript strings and JSON
  • URL components

For each of these I’ll show how to do the escaping in PHP (specifically when using Laravel), but the techniques apply to other languages and frameworks, most of which have similar utilities for escaping web page data.

How to escape HTML data

For the most part what we need to do is replace the following characters

  • < to &lt;
  • > to &gt;
  • " to &quot;
  • & to &amp;
  • etc.

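In a Laravel controller you can escape data to be put into a web page with the htmlspecialchars function. In a Blade template you can escape data by using triple curly braces, for example, {{{ $first_name }}}. (This is the Laravel 4 syntax, where double curly braces print the data unescaped; from Laravel 5 onward, {{ }} escapes and {!! !!} prints raw data.)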

To go along with the example we started with, here’s how to write the Blade template to display the first name:

<p>First Name: {{{ $first_name }}}</p>
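For code outside a Blade template, such as a controller, the same escaping can be done by calling htmlspecialchars directly. The snippet below is a minimal sketch in plain PHP; the helper name escape_html and the sample value are made up for illustration. Passing ENT_QUOTES also escapes single quotes, which matters when the value ends up inside a single-quoted HTML attribute.

```php
<?php
// Hypothetical helper (not part of Laravel): escape a value for HTML output.
function escape_html(string $value): string
{
    // ENT_QUOTES escapes both " and '; the character set is stated explicitly.
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

$first_name = '<b>Mary</b>';
echo '<p>First Name: ' . escape_html($first_name) . "</p>\n";
// prints: <p>First Name: &lt;b&gt;Mary&lt;/b&gt;</p>
```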

If you are dynamically generating a web page using JavaScript then you also need to be careful when creating DOM elements. Don’t use .innerHTML, because it parses its value as HTML and won’t escape any special characters. The simple way to put data that may contain special HTML characters into the DOM is to set the .textContent property of the DOM element, which treats the value as plain text.

var p = document.createElement('p');
p.textContent = '<b>Mary</b>';
// then append the p somewhere in the DOM

You can do the same in jQuery with the .text() function.

How to escape JavaScript strings and JSON data

With JavaScript strings you want to make sure that you escape quote characters in the string. If the user’s first name value is Mary" and you put this directly into a JavaScript string like so

<script>
var firstName = "{{ $first_name }}";
</script>

it will get rendered as

<script>
var firstName = "Mary"";
</script>

which is invalid JavaScript. The double quote in the $first_name variable prematurely terminates the JavaScript string.

In PHP, the way to deal with this is to use the json_encode function. Because json_encode adds the surrounding double quotes itself, don’t wrap its output in quotes of your own:

<script>
var firstName = {{ json_encode($first_name) }};
</script>

You can also use it to encode PHP data structures like arrays to JSON. The result is a JavaScript object or array literal, so it should not be wrapped in quotes:

<script>
var userData = {{ json_encode($userdata) }};
</script>
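As a sketch of what json_encode actually produces, the plain PHP below encodes a string and an array. The sample data is illustrative only. The JSON_HEX_* flags are optional; they additionally turn <, >, &, ' and " into \uXXXX escapes, which is a reasonable precaution when the result is printed inside a script element.

```php
<?php
$first_name = 'Mary"';
$userdata = ['name' => 'Mary', 'tags' => ['<b>', 'a&b']];

// Default encoding: the result already includes the surrounding double quotes.
echo json_encode($first_name), "\n";          // prints: "Mary\""

// Optional hardening flags for output placed inside a script element.
$flags = JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT;
echo json_encode($first_name, $flags), "\n";  // prints: "Mary\u0022"
echo json_encode($userdata, $flags), "\n";
// prints: {"name":"Mary","tags":["\u003Cb\u003E","a\u0026b"]}
```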

How to escape values placed in URLs

When placing data into a URL, care has to be taken to encode any characters that are special to URLs such as

  • forward slash (/)
  • ampersand (&)
  • question mark (?)
  • etc.

URL encoding is the process of converting these special characters using something called percent-encoding. For example / gets encoded as %2F. Notice that this encoding format is different from the HTML encoding above so you’ll need different functions and techniques to do percent-encoding.

Let’s say you have a $project_name variable with the value This&that project and you want to create a link with the project name in it:

<a href="/projects?name={{ $project_name }}">Go to {{{ $project_name }}}</a>

This renders as

<a href="/projects?name=This&that project">Go to This&amp;that project</a>

But this won’t work. When someone clicks on the link, the server will interpret the URL’s parameters like so:

  • name=This is the first parameter name and value
  • that project is another parameter without a value

Instead use urlencode to properly encode the URL data.

<a href="/projects?name={{ urlencode($project_name) }}">Go to {{{ $project_name }}}</a>

This renders as follows (urlencode encodes the space as +, which is equivalent to %20 in a query string):

<a href="/projects?name=This%26that+project">Go to This&amp;that project</a>

The server will interpret these URL parameters as

  • name=This&that project as the sole name-value pair
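To compare the options PHP offers, here is a small sketch in plain PHP. urlencode encodes a space as +, the form-style encoding used in query strings, while rawurlencode produces %20 and is the safer choice for path segments. http_build_query builds a whole query string from an array. The project name is the one used above; the page parameter is made up for illustration.

```php
<?php
$project_name = 'This&that project';

echo urlencode($project_name), "\n";     // prints: This%26that+project
echo rawurlencode($project_name), "\n";  // prints: This%26that%20project

// Build a full query string from an array of parameters.
$query = http_build_query(['name' => $project_name, 'page' => 2]);
echo '/projects?' . $query, "\n";        // prints: /projects?name=This%26that+project&page=2
```

Both encoded forms decode back to the original value on the server, so in a query string either one works.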

If you are constructing URLs in JavaScript code, use encodeURIComponent to do percent-encoding on URL data.
