"Fingerprint" a credit card number

Storing a credit card number directly is dangerous and incurs severe liability, of which I’m well aware. I’m wondering, though: could one fingerprint a card using its md5 hash?

The odds of two credit cards having the same md5 are low.

How low are the odds that two different credit card numbers:

  • Share the same last 4 digits.
  • Share the same md5()
  • Share the same md5( strrev( $number ) )

I’m thinking that the odds of this are so ridiculously low that it would make an effective fingerprint for detecting when a card has been used before at your site, without actually storing the card’s number. Thoughts?
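For concreteness, the three-part fingerprint listed above could be sketched roughly like this. The function name card_fingerprint and the step that strips spaces and dashes are assumptions added for illustration; they are not part of the original question.

```php
<?php
// Hypothetical helper: builds the three-part fingerprint from the list above.
function card_fingerprint(string $number): array
{
    // Assumption: keep digits only, so spaces or dashes do not change the result.
    $digits = preg_replace('/\D/', '', $number);

    return [
        'last4'        => substr($digits, -4),
        'md5'          => md5($digits),
        'md5_reversed' => md5(strrev($digits)),
    ];
}

// 4111111111111111 is a widely published test number, not a real card.
$a = card_fingerprint('4111 1111 1111 1111');
$b = card_fingerprint('4111-1111-1111-1111');
var_dump($a === $b); // same digits, so the fingerprints are identical
```

Two fingerprints match only if all three parts match, which is the comparison the question is asking about.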

And how long would it take a hacker to work their way backwards to the original card number if they had those two md5 hashes of the number? I don’t understand the algorithm; for all I know, that information would make it ridiculously easy.
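One way to get a feel for this is to count candidates rather than study the algorithm. A 16-digit number has at most 10^16 possible values. If the last four digits are stored next to the hash and the attacker can guess the six-digit issuer prefix, only six digits remain unknown, which is 10^6 candidates. The sketch below assumes exactly that situation (unsalted md5, known prefix, known last four) and uses a widely published test number, so treat it as an illustration rather than a measurement.

```php
<?php
// Illustration only: 4111111111111111 is a widely published test number.
$target = md5('4111111111111111'); // the stored, unsalted fingerprint
$prefix = '411111';                // assumed known: issuer prefix
$last4  = '1111';                  // assumed known: stored alongside the hash

$found = null;
for ($i = 0; $i < 1000000; $i++) {
    // Try every possible value of the six unknown middle digits.
    $candidate = $prefix . str_pad((string) $i, 6, '0', STR_PAD_LEFT) . $last4;
    if (md5($candidate) === $target) {
        $found = $candidate;
        break;
    }
}

echo $found, "\n"; // prints 4111111111111111
```

Hashing the reversed number as well does not change the picture in this scenario, because each candidate can simply be checked against both hashes.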

This is more of a thought exercise than something with serious application.

Taking the md5 of the whole number is a good idea. Just make sure you add a salt, that is, a secret word that you only use for this particular application:

$salt = 'h^yjn(k,';
$store = md5($credit_card_number . $salt);

The reason you add salt is so that people can’t compare md5 hashes and try to get the number that way.

Note that there is no reason for a salt to be secret in order for it to work.

The reason for using a salt is that while there are an infinite number of values that will produce a given MD5 hash, only a small subset of those will include the specified salt. This makes it much harder to find a value that will generate a given hash. If you don’t use a salt, precomputed rainbow tables exist that can be used to look up a value that generates a given hash.

As you are using it specifically for credit card numbers, you should also check that the value entered is a number containing between 13 and 16 digits inclusive. This too would reduce the number of possible values that will produce a given hash.
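A minimal sketch of that length check, assuming the input has already had any spaces or dashes removed. The function name looks_like_card_number is hypothetical.

```php
<?php
// Hypothetical helper: true only for a string of 13 to 16 ASCII digits.
function looks_like_card_number(string $input): bool
{
    // The D modifier stops "$" from matching before a trailing newline.
    return preg_match('/^[0-9]{13,16}$/D', $input) === 1;
}

var_dump(looks_like_card_number('4111111111111111'));    // bool(true)
var_dump(looks_like_card_number('411111111111'));        // bool(false), only 12 digits
var_dump(looks_like_card_number('4111-1111-1111-1111')); // bool(false), contains dashes
```

This only checks the shape of the input; it does not say whether the number belongs to a real card.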

You might consider using sha1 instead of md5 - it will lessen the chances of two numbers producing the same hash and will also make it a lot harder to find a value that will produce a specific hash…

I would actually recommend the hash functions, with at least SHA-2 or even Whirlpool.
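Assuming "the hash functions" here means PHP’s hash() function, switching algorithms is a small change to the salted example earlier in the thread. The salt and card number below are placeholders for illustration.

```php
<?php
// Placeholders for illustration: reuse the salt from the earlier post
// and a widely published test number.
$salt = 'h^yjn(k,';
$credit_card_number = '4111111111111111';

// hash() takes the algorithm name as its first argument.
$sha256    = hash('sha256', $credit_card_number . $salt);
$whirlpool = hash('whirlpool', $credit_card_number . $salt);

echo strlen($sha256), "\n";    // 64 hex characters (256 bits)
echo strlen($whirlpool), "\n"; // 128 hex characters (512 bits)
```

The list returned by hash_algos() shows which algorithm names a given PHP build supports.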