PDO API concerns

The database class of my framework is undergoing some expansion and revision. During testing I, on a lark, added these two methods to the main database object.


public function result( $sql, $params = null ) {
	return $this->prepare($sql)->parse($params)->result();
}

public function results( $sql, $params = null ) {
	return $this->prepare($sql)->parse($params)->results();
}

The custom methods being called on the statement object are…


public function parse( $params = null ) {
	if (is_array($params) && !hasAllNumericKeys($params)) { 
		$params = $this->prepareArray($params);
	} else if (!is_null($params) && !is_array($params)) {
		$params = array($params);
	}
		
	$this->execute( $params );
		
	return $this;
}

public function result( $params = null ) {
	if (!is_null($params)) {
		$this->parse($params);
	}
		
	return $this->columnCount() == 1 ?
		$this->fetchColumn() :
		$this->fetch( \\PDO::FETCH_ASSOC );
}	
	
public function results( $params = null ) {
	if (!is_null($params)) {
		$this->parse($params);
	}
		
	return $this->columnCount() == 1 ?
		$this->fetchAll( \\PDO::FETCH_COLUMN ) :
		$this->fetchAll(\\PDO::FETCH_GROUP|\\PDO::FETCH_UNIQUE|(  $this->columnCount() == 2 ? \\PDO::FETCH_COLUMN : \\PDO::FETCH_ASSOC)); 
}

So, for the most part, this is a shortcut meant to smooth out some wrinkles. Consider this PDO only statement

$pdo->query("SELECT `name` FROM `states` WHERE `id` = 16")->fetchRow(PDO::FETCH_ASSOC));

While that gets you a result, it’s wrapped in an array. Also, the class constant is something to memorize. The above allows…

$pdo->result("SELECT `name` FROM `states` WHERE `id` = 16")

Which might be too easy to be honest. If query parameter is coming from the outside world at all this is better.

$pdo->result("SELECT `name` FROM `states` WHERE `id` = ?", $_GET['param'])

The difference is very slight - but the second statement resists most attempts at SQL injection. I’m worried about encouraging vulnerabilities. Am I being paranoid, or is this a legitimate concern.

Note - the classes this comes from extend PDO, but native PDO functions are unchanged and just as available as ever for the edge cases where they will be useful.

I did something similar to that with an API I built a while back for a specific client. I ran it through all the scenarios I could think of to try and produce a SQL Injection with zero success. Granted, the client must continue to pass the $params instead of hard coding it in the SQL, but what can you do?

Well, this is a nice shortcut method but to be 100% correct you would also need to have a way to tell PDO what type your parameter is - in many cases you can get away with sending everything as string but there can be cases when if you send a number as string then you will experience performance penalty. So in fact you should be doing something like this:

Probably you will also want to pass more than one parameter so this may grow even more:

The data type could be inferred with an is_int() call on the variable. Or better, if you’re needing that level of precision why the hell are you using the shortcut method???

This won’t work if you have a BIGINT value that exceeds PHP’s allowed integer range and this won’t work if you have a large DECIMAL number whose size or precision exceeds PHP’s float range.

Well, in this case it’s not me but you who wants to use the shortcut method. I’m just pointing out some weaknesses of it that might show up in some rare cases. It’s up to you to decide how much level of precision you need.

So… unless I’m mistaken… since PHP numbers may not be able to represent a bigint or a float, then we would, in fact, have to pass those values as strings, correct? And we’d let the DB do the type conversion.

I’m also a bit dubious that having the DB rather than PHP do the type conversion would cause a “performance penalty.” Do we have a better source for this information besides some guy on SO? Does the DB documentation call this out as a performance issue? Or is there a repeatable benchmark that demonstrates it?

Emphasis mine. Good frameworks aren’t supposed to be written to cover the last 10% of use cases. That’s the programmer’s job. Ruby, Django, they don’t cover that last 10%. Cake, Symfony et al try - all the major PHP frameworks try - and that is why they fail. It makes them untenable bloated messes.

Partly yes, PHP has to store the number as string but passing to the db should be done as int using PARAM_INT setting. In this way the db won’t have to do any conversion - at least in theory only, unfortunately - I’ve explained it below.

You can do your own benchmark. My benchmark shows this:


SELECT count(*) FROM table WHERE price>19696552.00;
SELECT count(*) FROM table WHERE price>'19696552.00';

The first query completes in 0.25 seconds, the second in 0.3 seconds and this result is repeatable. price is indexed DECIMAL(12,2) column and I have filled the table with 800,000 records of random price values. The table is innodb and also has one primary auto-increment integer key and one foreign integer key. Without the index on price the result is similar but proportionately takes longer.

I don’t have any other examples because it would take some time to find the types of queries that are affected, possibly I could find ones where the difference is much bigger but I don’t have time nor intention to search for them. One thing is certain - wrong datatype can affect performance, however it doesn’t happen often. To be on the safe side, I never send numbers as strings.

Another issue is - how to send the number properly via prepared statements (native mysql prepared statements to be clear)? Let’s take the same query:

$st = $pdo->prepare("SELECT count(*) FROM table WHERE price>:p");

Now this gets tricky and interesting, because my tests reveals the bugginess of PDO. See the comparison between various kinds of binding values in PDO for the query above with execution time below each one:

$st->bindValue(':p', 19696552, PDO::PARAM_INT);

0.25 s. - certainly the value is sent as a number like an unquoted parameter. Good.

$st->bindValue(':p', 19696552, PDO::PARAM_STR);

0.3 s. - the number is sent as string and it takes longer.

$st->bindValue(':p', 19696552);

0.3 s. - no type means string even if the value is integer in PHP.

$st->bindValue(':p', '19696552', PDO::PARAM_INT);

0.3 s. - something’s wrong here, I explicitely told PDO this should be INT but it send the value as string just because the value is string in PHP.

$st->bindValue(':p', 19696552.00, PDO::PARAM_INT);

0.3 s. - this also doesn’t look right, the number is sent as string.

Now let’s suppose I want to use a floating point number, how do I bind it properly?

$st->bindValue(':p', 19696552.01, PDO::PARAM_INT);

0.3 s. - sent as string (BTW, did you know that despite PARAM_INT the value is actually sent with the fractional part? It is not converted to INT at all!)

$st->bindValue(':p', '19696552.01', PDO::PARAM_INT);

0.3 s. - sent as string

$st->bindValue(':p', '19696552.01', PDO::PARAM_STR);

0.3 s. - sent as string

$st->bindValue(':p', 19696552.01, PDO::PARAM_STR);

0.3 s. - sent as string

Something is wrong here as this proves there is no way to send a fractional number value as a numer via PDO prepared statements. There is no way to make the statement run as fast as the plain query where the parameter is included in the query string without quotes. The only solution with prepared statements is to fall back to string type for fractional numbers. And for BIGINT numbers as well.

Now anyone says prepared statements are faster? :). I agree that they should be. I know this problem will not happen always so it’s not an argument against prepared statements but this is why I’ve never been fond of PDO since it provides nice interface but does all kinds of weird stuff under the hood and is clearly buggy.

By not covering the 10% they simply hide the problem instead of giving the programmer the tools to overcome it. If they hide the problem then many programmers will think there is no problem at all and they can send values to PS without worrying about their type. Yes, in most cases they can and this makes coding faster but when a performance problem arises it’s very hard to figure out what is causing it.

Still, mysql seems to be the most flexible databse in accepting incompatible data types in queries without losing much performance. But there are other databases where this may mean failing to use an existing index or even rejecting the query. I don’t want to say more about it because I don’t have enough experience here but from what I’ve read from other db experts the correct data type matters more in other databases.

It seems you’re right. I would have expected MySQL to reuse a casted value, but based on the benchmark results, that doesn’t seem to be the case.

For what it’s worth, it seems that if we explicitly cast the value, then we can avoid this issue.

SELECT count(*) FROM t1 WHERE price > CAST(:p AS DECIMAL(12,2));

If we do this, then it doesn’t seem to matter what kind of value we pass through PDO.

No, that’s not the case and if combined with PDO’s inability to send a value as a decimal/float this can create unnecessary overhead for reasons least expected by anyone. I thought I could trust PDO to do the right thing but that’s not the case. BTW, I wonder how a similar benchmark would turn out with other databases like PosgreSQL, Oracle, SQL Server. According to some other info I have found it could be quite different.

What a clever trick! By benchmarking I can confirm that this is the way to perform the query fast, the conversion seems to be done only once in this case and does not noticeably affect performance. But isn’t this an ugly hack we have to resort to? It shouldn’t be necessary.

By not covering the 10% in my own framework (I won’t speak for others) I am emphatically not hiding anything. The existing PDO methods remain exposed for the programmer to use when they are required. A framework should be an aid to using PHP to get work done quickly, not a replacement for knowing PHP itself and being able to invoke the native methods when their extra precision is required.

Quite possibly, yes. :slight_smile:

I’m trying to think of the practical benefits and drawbacks, and evaluate it that way. Let’s pretend for a moment that PDO lets us set the binded type like we would want it to. Even then, would it matter whether we declare the value’s type in PDO bind vs SQL cast? I can’t think of any benefits or drawbacks either way. Except that there are at least a couple cases where we would have no choice but to define the value’s type as a SQL cast, such as the ones you pointed out: bigint and possibly float.

Clarity and simplicity. IMO, PDO bind is the place to declare the type. It doesn’t make a lot of sense to send a value as string only convert it back to number with SQL cast - unless there is no other choice. It’s like writing

$a = 10 + (int) '5';

Besides, who would even know that the bound value is not sent properly if it’s not documented anywhere, or to be more precise, it works against the documentation?

Interestingly, mysqli prepared statements suffer from the same problem. They seem to have a way to declare the bound parameter as ‘double’ but it’s still sent as string. It must be buggy implementation in the PHP libraries or the PHP mysql driver. This problem does not exist when using prepared statements directly in SQL.