Is it better to make user-input safe before saving it to the database or is it enough to make it safe for display using functions like htmlentities or strip_tags? For example, I might save what user inputs and then display it later using something like the following:
echo "<h2>" . htmlentities($select_page["menu_name"]). "</h2>";
echo "<p>" . strip_tags(nl2br($select_page["content"]), "<p><br><b><i><a>") . "</p>";
Should I be performing this kind of transformation before saving to a database?
(I know enough to be using mysql_real_escape_string before saving to mysql).
In a word, no.
Store the data as is in the database, then apply whatever transformations are required for display upon output. This allows the data to be used in other areas and in other ways throughout your application.
strip_tags is an appropriate sanitization function to run on any input field that you want to make sure does not contain any HTML. It should be run long before the data gets to the database.
htmlentities is an output routine for making sure that <, & and so on all display correctly when your plain text is displayed as part of a web page. It should be run just prior to displaying the web page, long after retrieving the data from the database.
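A minimal sketch of that ordering, assuming the row comes back from the database exactly as the user typed it. The array contents are made up for illustration; only the htmlentities call matters.

```php
<?php
// Hypothetical row, as it might come back from the database unmodified.
$select_page = ["menu_name" => 'Tom & Jerry <script>alert(1)</script>'];

// Escape at the last moment, and only for the HTML context.
$safe = htmlentities($select_page["menu_name"], ENT_QUOTES, "UTF-8");

// The browser now shows the angle brackets as text instead of running a script.
echo "<h2>" . $safe . "</h2>\n";
```

The stored value stays untouched, so the same row can still be sent somewhere that is not a web page.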
If you use prepare/bind with either PDO or mysqli then you will not need mysql_real_escape_string. Keeping the SQL and the data separate means the data is never parsed as SQL, so it also prevents those attacks that mysql_real_escape_string lets through.
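Here is a rough sketch of prepare/bind with PDO. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database purely so it runs on its own; for MySQL you would change the DSN. The pages table and its columns are hypothetical, borrowed from the question.

```php
<?php
// In-memory SQLite keeps the sketch self-contained (an assumption, not the poster's setup).
$pdo = new PDO("sqlite::memory:");
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE pages (menu_name TEXT, content TEXT)");

// Input that would break a query built by string concatenation.
$menu_name = "O'Reilly'); DROP TABLE pages; --";
$content   = "<b>Stored exactly as typed</b>";

// The SQL text contains only placeholders; the values travel separately.
$stmt = $pdo->prepare(
    "INSERT INTO pages (menu_name, content) VALUES (:menu_name, :content)"
);
$stmt->bindValue(":menu_name", $menu_name, PDO::PARAM_STR);
$stmt->bindValue(":content", $content, PDO::PARAM_STR);
$stmt->execute();

// Read the row back: the data is unchanged and the table still exists.
$row = $pdo->query("SELECT menu_name, content FROM pages")->fetch(PDO::FETCH_ASSOC);
```

Note that nothing here escapes HTML. The content column holds the raw text, and escaping for display happens later.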
Essentially you need to make sure that you Escape Output to protect the next recipient of that output.
If the next recipient is going to be a webpage, then htmlentities and that family of functions are the correct ones to use.
If the next recipient is going to be your database, then mysql_real_escape_string or, as mentioned, preferably prepare/bind is the correct method to use.
So, you might escape (prepare and protect) for your database and insert it, but then you have to remember to escape it correctly when you get it out of your database and display it on a webpage, or as xml, or csv, or pdf and so on.
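To make the "next recipient" idea concrete, here is a small sketch that takes one raw string and escapes it three different ways. The sample string is invented, and htmlspecialchars stands in for the htmlentities family.

```php
<?php
// One raw value, as it would sit in the database.
$raw = 'Fish & "Chips" <cheap>, today';

// Recipient: an HTML page.
$for_html = htmlspecialchars($raw, ENT_QUOTES, "UTF-8");

// Recipient: a CSV file. fputcsv applies the CSV quoting rules for us.
$fh = fopen("php://memory", "w+");
fputcsv($fh, [$raw], ",", '"', "\\");
rewind($fh);
$for_csv = rtrim(stream_get_contents($fh), "\n");
fclose($fh);

// Recipient: a JSON consumer.
$for_json = json_encode($raw);

echo $for_html, "\n", $for_csv, "\n", $for_json, "\n";
```

Each recipient has its own rules, which is why escaping once on the way in cannot cover all of them.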
Until I understood the FIEO concept (Filter Input, Escape Output) I recall that I was always trying to filter out everything upon receipt of data - and sometimes of course you can.
An example of this might be a phone number.
If your phone numbers should only ever contain the digits 0-9 and be between 8 and 10 digits long, then, yes, you could quite easily sanitize and cleanse (Filter) that data before putting it in your database.
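A filter for that phone number rule might look something like this. The function name filter_phone is made up for the example, and rejecting bad input (rather than trying to repair it) is my assumption about what you would want.

```php
<?php
// Hypothetical helper: returns the number if it passes the filter, or null if not.
function filter_phone(string $input): ?string
{
    $input = trim($input);
    // Digits 0-9 only, 8 to 10 of them; the D modifier stops $ matching before a trailing newline.
    return preg_match('/^[0-9]{8,10}$/D', $input) === 1 ? $input : null;
}

var_dump(filter_phone("0123456789")); // accepted
var_dump(filter_phone("12345"));      // rejected: too short
var_dump(filter_phone("01234<b>"));   // rejected: not all digits
```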
The problem is that further down the line you may not remember just how zealously you Filtered the text going into a field, so you should always Escape your Output to be on the safe side.
When it comes to a free-text input field you cannot hope to Filter out every bad thing that a malicious person could enter into a textarea box (take a look at the scary xss cheat sheet). Frankly if you have to strip_tags() on that input then something is amiss. If you have told users they cannot enter html tags and they do, then the mere presence of a < may well cause you to abort the operation.
If you have not told users that html tags are forbidden, then why unexpectedly strip them out?
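As a sketch of aborting on the mere presence of a <, assuming the form has told the user that HTML is not allowed. The function name and the error message are hypothetical.

```php
<?php
// Hypothetical check: plain-text fields are refused outright if they contain "<".
function accept_plain_text(string $input): bool
{
    return strpos($input, "<") === false;
}

$submitted = "Hello <script>alert(1)</script>";
$error = null;

if (!accept_plain_text($submitted)) {
    // Reject and tell the user, rather than silently altering what they typed.
    $error = "HTML is not allowed in this field.";
}

echo $error ?? "Accepted", "\n";
```

Anything that passes this check still needs escaping on output.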
This is a lot of great information. It sounds like I should look into using PDO, although I don't know much about it.
When you (Cups) say that "if you have to strip_tags() on [a textarea box] then something is amiss," what do you mean? What would be the right way to handle a textarea box?
What I mean is that if your GUI tells the user that they can use html, then you should not be stripping it out - well not for security reasons anyhow.
Now, and here we go, if you say that the user CAN use SOME html tags, then you need a tool like HTML Purifier, which has a whitelist of the tags that you say you do allow and kills the rest.
Even then, as stated, you will need to then go on and escape your data in readiness for the next recipient.
Yes, please do look at PDO, and come back if you have any questions about it. I found it frightfully hard to grasp, I will admit, but compared to what we had to do prior to PHP5, it is a breeze.