How often do you use identical comparison (===) as opposed to the regular (equals) one (==)?
Like the defensive 'value' == $variable check (rather than having the operands the other way around), it’s an easily implementable safeguard against slipups, as this Reddit thread notes.
In short, in situations like these:
var_dump(md5('240610708') == md5('QNKCDZO'));
var_dump(md5('aabg7XSs') == md5('aabC9RqS'));
var_dump(sha1('aaroZmOk') == sha1('aaK1STfY'));
var_dump(sha1('aaO8zKZF') == sha1('aa3OFF9m'));
var_dump('0010e2' == '1e3');
var_dump('0x1234Ab' == '1193131');
var_dump('0xABCdef' == ' 0xABCdef');
All the dumps will yield true in the most recent version of stable PHP and HHVM. Interestingly, the last line yields false in PHP 5.2 and lower, while in the most recent builds of PHP 7, the last two are false. Why is that?
Let’s leave aside the fact that we should use strcmp and hash_equals for string and hash comparisons and play around with regular operators…
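For reference, here is a minimal sketch of how the stricter alternatives behave on the first pair of hashes from the list above. The variable names are only illustrative, and hash_equals assumes PHP 5.6 or later.

```php
<?php
// Two different inputs whose md5 hashes both have the form 0e<digits>.
$a = md5('240610708'); // 0e462097431906509019562988736854
$b = md5('QNKCDZO');   // 0e830400451993494058024219903391

$loose  = ($a == $b);             // loose comparison: both strings are read as numbers
$strict = ($a === $b);            // identical comparison: plain string check, no coercion
$cmp    = (strcmp($a, $b) === 0); // byte-wise string comparison
$safe   = hash_equals($a, $b);    // timing-safe string comparison (PHP 5.6+)

var_dump($loose, $strict, $cmp, $safe);
// bool(true), bool(false), bool(false), bool(false)
```

Only the loose operator reports the two hashes as equal; the other three compare them as the strings they are.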
It turns out numeric comparison takes priority over string comparisons in PHP. In other words, if a string can be interpreted as a number, it will be. The manual clearly states:
If you compare a number with a string or the comparison involves numerical strings, then each string is converted to a number and the comparison performed numerically.
The hashes (both md5 and sha1 pairs) yield true because each one has the form 0eX (where X is a string of digits), which translates to 0 * 10^X when read as a number. As any number multiplied by zero is zero, both sides evaluate to zero and the comparison yields true.
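The coercion can be checked directly. This is a small sketch using the first md5 hash from the list; the variable names are made up for illustration.

```php
<?php
// md5('240610708'), written out as a literal
$hash = '0e462097431906509019562988736854';

$isNumeric  = is_numeric($hash); // the string qualifies as a number in exponent notation
$asFloat    = (float) $hash;     // 0 * 10^462097431906509019562988736854
$equalsZero = ($hash == '0');    // so it loosely equals any other numeric zero

var_dump($isNumeric, $asFloat, $equalsZero);
// bool(true), float(0), bool(true)
```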
But what’s up with the rest? The fifth case, '0010e2' == '1e3', says true, but that can’t be right - can it? Let’s see.
When coercing into a number, PHP first strips the leading zeros, so 0010e2 becomes 10e2. As noted above, e means "times ten to the power of". Thus, we have:
10e2 == 1e3
10 * 10^2 == 1 * 10^3
1000 == 1000
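The same arithmetic can be reproduced with explicit casts. This is a short sketch with illustrative variable names:

```php
<?php
$left  = (float) '0010e2'; // leading zeros dropped: 10 * 10^2
$right = (float) '1e3';    // 1 * 10^3

$loose  = ('0010e2' == '1e3');  // both sides are numeric strings, so compared as numbers
$strict = ('0010e2' === '1e3'); // compared as strings, which clearly differ

var_dump($left, $right, $loose, $strict);
// float(1000), float(1000), bool(true), bool(false)
```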
OK, so what about the last two?
0x1234Ab is actually a numeric string, albeit a hexadecimal one. In fact, simply converting the number into decimal produces 1193131. But… why does the newer PHP7 say this is false?
Similarly, there’s '0xABCdef' == ' 0xABCdef'.
Obviously, these two strings are not equal when compared as strings, but as numbers, the whitespace is stripped, making them equal. Still, why does PHP7 again say the comparison’s result is false? In fact, the only PHP versions that say this is false are either very old or very new, as evident here.
Interestingly, if we strip the whitespace out manually, super-old versions of PHP do say the strings are equal, while new PHP still seems to compare types as well.
So what gives? Did they add type checking into comparison operators in PHP7? Obviously not, as this does not happen with non-hex numbers.
[…]
Edit: thanks to @TomB, a relatively unknown RFC was identified as the culprit.
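If hexadecimal strings do need to be compared by value, one way to stay independent of the PHP version is to convert them explicitly instead of relying on ==. The sketch below assumes the strings carry a "0x" prefix, as in the examples above; the variable names are hypothetical.

```php
<?php
$hex     = '0x1234Ab';
$decimal = '1193131';

// Drop the assumed "0x" prefix and convert the remaining hex digits explicitly.
$value      = hexdec(substr($hex, 2));
$sameNumber = ($value === (int) $decimal);

// For the whitespace case, normalise first, then compare as strings.
$sameString = (trim(' 0xABCdef') === '0xABCdef');

var_dump($value, $sameNumber, $sameString);
// int(1193131), bool(true), bool(true)
```

Making the conversion explicit means the result no longer depends on whether a given PHP version treats hexadecimal strings as numeric.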