Htmlspecialchars and mysql_real_escape_string

stevecc · September 22, 2011, 4:29am

Is this the proper way to process user data into and from a database?

JeffWalden · September 22, 2011, 6:24am

Is your concern preventing XSS or to properly escape the data so it doesn’t blow up your SQL queries?

There was a great article written on this a little while ago: http://wonko.com/post/html-escaping

felgall · September 22, 2011, 10:29am

htmlspecialchars is an output function used for escaping problem characters when writing the data into a web page. It has nothing whatever to do with databases or validation.

mysql_real_escape_string was the 20th century output function for escaping data to be written to a database. It became obsolete when PDO and mysqli introduced prepare/bind so as to keep the query and the data completely separate. It also has nothing to do with validation.

Validation is an input function where you validate that what is in a given field is valid for that field to contain. If the field is supposed to only contain numbers then that’s what you validate it for. If it is allowed to contain symbols that could be misinterpreted as HTML then you allow those through the validation and into the database and only ever escape them if writing them into a web page.

Cups · September 22, 2011, 1:12pm

yes

stevecc · September 22, 2011, 9:31pm

felgall, thanks for setting me straight. don’t know where my head was… yes obviously binded params is the right way. can i ask about the appropriate usage of htmlentities / htmlspecialchars and urlencode in a typical php web app?

felgall · September 22, 2011, 9:55pm

Both htmlspecialchars and htmlentities do basically the same thing exccept that htmlentities converts a lot more characters to their equivalent HTML entity codes.

htmlspecialchars covers all the conversions you’d normally need as it is only really < and & that need to be converted to entity codes when displaying thext in an HTML page.

You would call htmlspecialchars at the point where you are outputting the text into the web pageand so would usually have:

echo htmlentities($fieldname);

Lemon_Juice · September 23, 2011, 9:23am

I wouldn’t worry about not using bound params, if you like them use them, if not you don’t have to. Prepared statements were not designed to prevent sql injections, it’s only a nice by-product. Personally, I found that preparing statements requires more typing on my part and introduces other inconveniences (like not having complete queries for debugging) so after some time of using them I dumped them. Using the right function for escaping is not a difficult thing to remember.

Therefore, I disagree with fellgall that escaping is the 20th century way of doing things… unless you use the old mysql extension instead of mysqli. If you use mysqli_real_escape_string() you are doing it perfectly OK and the diagram you posted is correct. Just remember to set your character set after connection with mysqli_set_charset() - this way you make mysqli_escape_string() completely safe against some rare exploits that are possible with injecting data in different character sets.

Just experiment with both mysqli and PDO and decide which one you like better. In either case, I think it’s good to create your own wrapper class around native PDO or Mysqli because using only built-in php functions and methods can be lengthly and inconvenient for frequent use. For example, my wrapper class allows me to do the following:


// fetch one row of the result set into associative array:
$row = $db->fetch_one_row("SELECT name, surname FROM users WHERE status=" .$db->quote($status));

Here, my own method quote() is short to write and handles all escaping, it detects the data type and NULL values and uses mysqli->real_escape_string() when necessary. Errors are handled by my wrapper class and this way accessing db is a one-line code in many cases.

felgall · September 23, 2011, 7:50pm

Except that you are still exposed to possible injections since you still have a jumbled mess of query and data that allows the possiblility of injection being able to occur. Using prepare/bind removes any possibility of injection.

It all depends on how secure you are trying to be. Injection is still theoretically possible if the data and query are mixed together regardless of how many layers of protection you build in.

Why allow a one in a billion chance of a security breach when an alternative exists that makes it impossible.

Cups · September 23, 2011, 8:59pm

Using the right function for escaping is not a difficult thing to remember.

And is by definition not a difficult thing to forget either… and is one of the key reasons for using prepared statements and mysqli or PDO, because we forget to do things.

Lemon_Juice · September 24, 2011, 7:55am

Can you explain how injections are possible when sending queries with data properly escaped?

(BTW, the jumbled mess, as you call it, is what SQL is. Much more meaningful than a query with mysterious placeholders. But that’s just my opinion.)

Well, you still have to remember to bind variables to the statement. How is that different from having to remember to escape values? In both cases you have to remember about something.

Cups · September 24, 2011, 10:52am

Because I have to make the absolute decision not to use bound parameters, bound params are the default.

You mention creating a wrapper class over PDO, well I do similar, as I am sure many others do.

This is typical of what I end up doing:


// where I have removed from $_POST the unwanted vars
// and satisfied myself that the remainder are within
// my tolerances
$this->BusinessesManager->addNew( $_POST );

Thanks to the POST vars containing both the name of the database field and the value, then it makes a lot of sense to keep them together and feed them to something that just inserts or updates the database.

I hardly write sql anymore.

I never worry about correctly escaping values going into my database because the default uses this PDO wrapper and its bound params, and as such sleep at night very well.

Perhaps you should review your situation.

Lemon_Juice · September 25, 2011, 11:59am

What do you mean by “bound param are the default”? When you construct an sql query it is you, the programmer, who writes it and decides to use bound parameters. Even when you use PDO and prepare a statement there is nothing that forces you to bind a param - you are still free to inject plain unescaped values into sql. It’s you who has to remember to put a placeholder in your query and then bind a value to it. If you don’t, your code is vulnerable to injections despite using PDO. Considering this, I can’t see much difference between remembering to bind params or escape values.

You mention creating a wrapper class over PDO, well I do similar, as I am sure many others do.

This is typical of what I end up doing:


// where I have removed from $_POST the unwanted vars
// and satisfied myself that the remainder are within
// my tolerances
$this->BusinessesManager->addNew( $_POST );

Thanks to the POST vars containing both the name of the database field and the value, then it makes a lot of sense to keep them together and feed them to something that just inserts or updates the database.

Now you are going a bit off-topic here. Yes, I mentioned using a wrapper class but as a general rule to make coding more convenient, it has nothing to do with using prepared statements vs mysqli_escape_string(). Even if inserting from $_POST looks as simple as you have presented above, somewhere deeper in your code your wrapper class has to prepare the statement and bind values. Who wrote the wrapper class? If you did then you had to consciously choose to bind values in the right place to avoid sql injections. It’s still the same as choosing to use mysqli_escape_string() - in both cases it’s not done by default, the programmer has to code it, even if it’s only in one place somewhere in a wrapper method for all sql queries.

I hardly write sql anymore.

I never worry about correctly escaping values going into my database because the default uses this PDO wrapper and its bound params, and as such sleep at night very well.

If you hardly write sql anymore then the difference between mysqli_escape_string() and binding params matters even less, because the protection against injection is handled automatically somewhere in your wrapper class and you don’t have to think about it in most cases. If you had used mysqli_escape_string() as a protection, then your code above would not have changed at all.

My point is simple: PDO doesn’t provide any protection “by default”, you still have to remember to bind params just as you had to remember to use mysqli_escape_string(). And I don’t see any difference in protection strength between these two (Felgall says otherwise, I’m still open to learn what’s vulnerable about sending sql with the use of mysqli_escape_string).