Secure Password Storage Repository

logic_earth · August 8, 2012, 11:06am

Let's create a repository of techniques for securely hashing and storing passwords, something many people overlook.

One of the most important factors when handling passwords, I believe, is having no restrictions. Don’t force a certain character set or force complexity, do not limit size (a minimum size is okay), etc. In other words, let users enter whatever they please. If I can enter a long, complicated SQL string into the password field and not have any issues, you are doing it right…maybe?

When I get a password I barely touch it; I throw it into a hashing function as quickly as possible. The moment it goes into that hashing function it is completely sanitized and devoid of any nasties that may harm the system. It won’t matter what the user enters or how big it might be: the hash digests it down to a fixed-length output. Thus you don’t have to filter, validate or escape the password.
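To illustrate the fixed-length point, here is a minimal sketch. The sample inputs are made up for illustration; the only claim is that SHA-512 always yields 128 hexadecimal characters, whatever goes in.

```php
<?php
// Sample inputs are hypothetical; any string works.
$inputs = [
    'short',
    "Robert'); DROP TABLE users;--",
    str_repeat('x', 100000),
];

$digests = [];
foreach ($inputs as $input) {
    // hash() returns lowercase hex unless raw output is requested.
    $digests[] = hash('sha512', $input);
}

foreach ($digests as $digest) {
    // Every digest has the same length and contains only hex characters.
    echo strlen($digest), ' ', ctype_xdigit($digest) ? 'hex' : 'not hex', "\n";
}
```

Each line printed should read `128 hex`, which is why the digest itself needs no further escaping before it is stored.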

Another technique I use is salting and peppering (these might not be official terms). I create a user-specific salt for each user, and every site has a key for peppering. If an attacker manages to break into the database and get all the hashed passwords and salts, they can still create a rainbow table; it will just take a little longer (GPUs speed this up greatly). However, with the additional pepper, they need not only the database but also the source code for the site. Getting both is not impossible, but it is much more difficult.

For the sake of example, below I generate both the site key and the user key dynamically each time the script is run. The examples use PHP; if you have GOOD examples in other languages you would like to share, please post them.


$example_sitekey = mcrypt_create_iv( 4096, MCRYPT_DEV_RANDOM ); // Static in Production
$example_userkey = mcrypt_create_iv( 4096, MCRYPT_DEV_RANDOM ); // Generated during registration

You may be asking yourself, “Why am I using mcrypt? Is that not for encryption?” Indeed it is: mcrypt_create_iv is for creating initialization vectors for the various ciphers in mcrypt. However, it can also be used as a Cryptographically Secure PseudoRandom Number Generator (CSPRNG, https://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator) on both Linux/Unix and Windows systems since PHP 5.3. (Different versions of Linux/Unix can have different implementations and results, which affects security; be aware of that.)

Moving onto the next line.


$userid = '5ccd875c-e10f-11e1-8b47-d818ed2c1c51'; // Gen UUID at registration
$username  = 'Logic';
$password  = 'MyNotSoSecurePassword!!11!!eleven';

$password = "auth://{$username}:{$password}@auth.example.com/{$userid}/";

This just happens to be some of the data I have for users: every user has a unique name and a UUID associated with their account. Now the password…quite odd putting it into a URI format…why? Simply to make it more unique and to pad its length, in case a user entered a short password. The longer the password, the better it can be digested. Plus I like the URI format (https://en.wikipedia.org/wiki/URI), so sue me.


function hmac ( $data, $key, $raw = true )
{
  // SHA-512 block size is 1024 bits = 128 bytes; strlen() counts bytes
  $blockSize = 128;

  $len = strlen( $key );
  if ( $len < $blockSize ) {
    $key = str_pad( $key, $blockSize, "\0" );
    $len = $blockSize;
  }

  $opad = str_repeat( chr( 0x5C ), $len );
  $ipad = str_repeat( chr( 0x36 ), $len );

  for ( $i = 0; $i < $len; $i++ ) {
    $opad[ $i ] = $opad[ $i ] ^ $key[ $i ];
    $ipad[ $i ] = $ipad[ $i ] ^ $key[ $i ];
  }

  return hash( 'sha512', $opad .
    hash( 'sha512', $ipad . $data, true ), $raw );
}

Now, I know PHP already has hash_hmac available. However, I have one problem with the default spec implementation of HMAC (https://en.wikipedia.org/wiki/HMAC), and that is the key truncation for long keys (which I use). I wanted to retain as much of the key’s length as possible. As defined: “K be a secret key padded to the right with extra zeros to the input block size of the hash function, or the hash of the original key if it’s longer than that block size.” That is pretty much the only change.


function stretch ( $data, $key, $iteration = 64 )
{
  $keyLength = ceil( $iteration / 2 );
  $output = '';

  for ( $block = 1; $block <= $keyLength; $block++ ) {
    $last = $xor = hmac( $data, $key . pack( 'N', $block ) );

    for ( $i = 1; $i < $iteration; $i++ )
      $xor ^= ( $last = hmac( $data, $last ) );

    $output .= $xor;
  }

  return hmac( $output, $key );
}

Key stretching based on PBKDF2 (https://en.wikipedia.org/wiki/PBKDF2), though not strictly, is fairly common these days: it increases the computation time per hash, slowing down brute-force attacks and rainbow tables. You don’t need a high iteration count to get good results…too high and you might kill your server. Having stronger keys when hashing is far better, but we can still use key stretching to make a weak key stronger.

Alright now, that is all the major code, simplified greatly of course. Here is how it might all go together: ($db might be how it could be stored)


$password = stretch( $password, $example_userkey );

// Prepare for storing:
$output = hmac( $password, $example_userkey );
$output = hmac( $output, $example_sitekey );

$db = base64_encode( $output . "\0\0\1\0\0" . $example_userkey );
// nul-nul-soh-nul-nul pattern separates the salt

Any questions or improvements?

KyleWolfe · August 8, 2012, 4:44pm

So I’m guilty of not reading the whole post, but something caught my eye…

$db = base64_encode( $output . "\0\0\1\0\0" . $example_userkey )

Are you really storing passwords as salted base 64 strings?

logic_earth · August 8, 2012, 4:49pm

No. Look above at the $output variables.

Jeff_Mott · August 8, 2012, 9:06pm

Any questions or improvements?

Just to warn you upfront… I have more criticisms than praises.

when handling passwords … Don’t force a certain character set to be used … do not limit size

I wholeheartedly agree with this. There’s no good reason to restrict what characters can be used in a password.

Another technique I use is salting and peppering

Peppering is an interesting idea. It could provide extra protection if the worst-case scenario happens and your entire database is exposed. But I think it’s worth noting for the readers that peppering is a new and non-standard technique. It’s so unknown that there isn’t even a Wikipedia page for it.

If an attacker manages to break into the database and get all the hashed passwords and salts they can still create a rainbow table…

Just a small technicality. An attacker could try a brute force search of each password. Rainbow tables, however, would still be impractical because the salt would still be different for each user.
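To make the distinction concrete, here is a minimal sketch, not taken from the code above: the password, the variable names and the use of hash_hmac as the salted hash are all assumptions for illustration, and random_bytes() needs PHP 7 or later. Two users with the same password end up with different hashes, so one precomputed table cannot cover both.

```php
<?php
// Hypothetical data: two users who chose the same password.
$password = 'letmein';
$saltA = random_bytes(16); // per-user salt for user A
$saltB = random_bytes(16); // per-user salt for user B

// Salted hashes, using the salt as the HMAC key for illustration.
$hashA = hash_hmac('sha512', $password, $saltA);
$hashB = hash_hmac('sha512', $password, $saltB);

// What a precomputed (unsalted) table would contain for this password.
$tableEntry = hash('sha512', $password);

var_dump($hashA === $hashB);      // same password, different stored hashes
var_dump($hashA === $tableEntry); // the precomputed entry does not match
```

Both comparisons should come out false. The attacker can still brute force each user individually, but the work has to be repeated per salt.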

Now the password…quite odd putting it into a URI format…

This is beyond non-standard. If your goal here is to write a guide for other users, then I think you should leave out the non-standard stuff until it’s been vetted by both the web and crypto communities.

URI format… why? Well simply to make it more unique and to pad its length.

You’re padding every password in exactly the same way with a non-secret and non-random string. That doesn’t make any password more unique, nor will it increase security.

The longer the password the better it can be digested.

I’m not sure that’s true. In fact, crypto hashing algorithms are designed specifically so that this won’t be true. Any input string has an equal chance of producing any output hash.

Plus I like the URI format, so sue me.

You’ll be hearing from my lawyers.

Now I know PHP already has hash_hmac available. However, I have one problem with the default spec implementation of HMAC and that is the key truncating for long keys (which I use).

This is another case where I think you’d be better off teaching only standard techniques until and if your custom techniques are vetted. To say the key is truncated, I think, leaves the wrong impression. Actually the key is hashed, which means 128 bits (or more, depending on the hasher) of the key’s entropy is retained, and 128 bits is plenty sufficient to be secure. In theory, your change could be better, but in practice, it isn’t any more secure. I think we’d be better off staying standard and using the hmac algorithm as it’s defined to work.
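A quick way to see what the standard does with long keys is the sketch below. The key and message are made-up values; the check relies only on PHP's built-in hash_hmac and hash functions.

```php
<?php
$data = 'message';
// Hypothetical key, far longer than the 128-byte SHA-512 block.
$longKey = str_repeat('k', 4096);

// Standard HMAC with the long key as-is.
$withLongKey = hash_hmac('sha512', $data, $longKey);

// Standard HMAC with the key replaced by its raw SHA-512 digest.
$withHashedKey = hash_hmac('sha512', $data, hash('sha512', $longKey, true));

var_dump($withLongKey === $withHashedKey); // bool(true)
```

The two results match because HMAC first hashes any key longer than the block size. With SHA-512 that leaves a 512-bit key, so nothing is cut down to a handful of bits.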

Key stretching based on PBKDF2

Why not just use the standard and widely tested PBKDF2? The way your stretch function works seems a bit arbitrary. Why generate “iterations / 2” blocks? If I tell your stretch function to use 100 iterations, then it will generate 50 blocks at 100 iterations for each block, for a grand total of 5000 iterations. And, again, if you’re writing a guide for other users, then I think you should stick to the standard algorithms and practices. I don’t think new and custom solutions belong in this kind of write-up.
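The arithmetic can be checked with a small counter. This is only a sketch: countStretchHmacCalls is a hypothetical helper that copies the loop bounds of the stretch function posted earlier and counts hmac() calls instead of making them.

```php
<?php
// Hypothetical helper: mirrors the loop bounds of stretch() and counts hmac() calls.
function countStretchHmacCalls(int $iteration): int
{
    $blocks = (int) ceil($iteration / 2);
    $calls = 0;

    for ($block = 1; $block <= $blocks; $block++) {
        $calls++; // first hmac() call of the block
        for ($i = 1; $i < $iteration; $i++) {
            $calls++; // chained hmac() calls inside the block
        }
    }

    return $calls + 1; // final hmac() over the concatenated blocks
}

echo countStretchHmacCalls(100), "\n"; // 5001
echo countStretchHmacCalls(64), "\n";  // 2049
```

So 100 iterations cost 50 × 100 calls plus the final one, and the default of 64 costs 32 × 64 plus one. The work grows roughly with the square of the iteration parameter.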

$output . "\0\0\1\0\0" . $example_userkey

The characters in that separator are all valid characters for both the output and the user key, which makes it not a very reliable separator. Probably the best way to store those two pieces of data is to give each its own column in your database: one column for the hash and one column for the salt. It’s straightforward, easy, and reliable.
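Here is a sketch of the failure mode, using a contrived hash value that happens to contain the separator bytes. All values are hypothetical. It also shows that a fixed-length split works if the two values must share one field, although separate columns remain the simpler option.

```php
<?php
$separator = "\0\0\1\0\0";

// Contrived worst case: a 64-byte "hash" that contains the separator bytes.
$hash = str_repeat('A', 10) . $separator . str_repeat('B', 49); // 10 + 5 + 49 = 64 bytes
$salt = 'per-user-salt';

$stored  = base64_encode($hash . $separator . $salt);
$decoded = base64_decode($stored);

// Splitting on the separator cuts the hash in the wrong place.
$parts = explode($separator, $decoded, 2);
var_dump($parts[0] === $hash); // bool(false)

// A raw SHA-512 digest is always 64 bytes, so splitting by length is unambiguous.
$hashOut = substr($decoded, 0, 64);
$saltOut = substr($decoded, 64 + strlen($separator));
var_dump($hashOut === $hash && $saltOut === $salt); // bool(true)
```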

logic_earth · August 9, 2012, 2:43am

Well, clearly, as mentioned, peppering has no official name; it is still considered salting. I personally call it salting and peppering to distinguish it from regular single salting. But it is still salting: not new and not non-standard, just a different name for the same thing.

This is beyond non-standard. If your goal here is to write a guide for other users, then I think you should leave out the non-standard stuff until it’s been vetted by both the web and crypto communities.

…

You’re padding every password in exactly the same way with a non-secret and non-random string. That doesn’t make any password more unique, nor will it increase security.

…

I’m not sure that’s true. In fact, crypto hashing algorithms are designed specifically so that this won’t be true. Any input string has an equal chance of producing any output hash.

This does not need to be vetted; throwing the password into a URI format (or any other format) before it is hashed is perfectly sensible. Nor is the point to increase security, but it is one more thing the attacker will have to get right in their brute force. And of course hashing algorithms can handle strings smaller than their block size. But the more data you feed them the better they work; otherwise they get null padded.

This is another case where I think you’d be better off teaching only standard techniques until and if your custom techniques are vetted. To say the key is truncated, I think, leaves the wrong impression. Actually the key is hashed, which means 128 bits (or more, depending on the hasher) of the key’s entropy is retained, and 128 bits is plenty sufficient to be secure. In theory, your change could be better, but in practice, it isn’t any more secure. I think we’d be better off staying standard and using the hmac algorithm as it’s defined to work.

I would prefer to retain the entropy of my keys, which are much larger than 128 bits. For the sake of this example I used much simpler key generation, but those keys still have much more entropy than what would have been retained. Truncating or hashing the key down is not what I want it to do, otherwise I would have done it myself. It follows the HMAC spec as closely as possible minus that single part.

Why not just use the standard and widely tested PBKDF2? The way your stretch function works seems a bit arbitrary. Why generate “iterations / 2” blocks? If I tell your stretch function to use 100 iterations, then it will generate 50 blocks at 100 iterations for each block, for a grand total of 5000 iterations. And, again, if you’re writing a guide for other users, then I think you should stick to the standard algorithms and practices. I don’t think new and custom solutions belong in this kind of write-up.

It is the standard PBKDF2, only the parameters are changed like dkLen which is $keyLength. And instead of truncating at the end, where they ignore most of the computations by cutting (substr) it to dkLen, the entirety of the result is hashed together so all those computations actually mattered in the end. ( If you ignore the computations like that, an attacker can use a smaller iteration and achieve the same result. ) It satisfies the PBKDF2 spec that actually matters.

The characters in that separator are all valid characters for both the output and the user key, which makes it not a very reliable separator. Probably the best way to store those two pieces of data is to give each its own column in your database: one column for the hash and one column for the salt. It’s straightforward, easy, and reliable.

Well sure they can be contained within each, that is why you need to form a pattern that is least likely to be present in the data. But again it is just an example of how it might be done.

Jeff_Mott · August 9, 2012, 3:58am

I don’t agree. They have different requirements and solve different problems. A salt can be public and protects against rainbow tables. A pepper must be private and protects against brute force searches.

The URI format is harmless, but nonetheless useless. And if you were to get the opinion of the crypto community, I think they would tell you the same thing. Which is also why I think these ideas do need to be vetted before we include them in any learning guides.

I still don’t think that’s correct, and I would need to see some research and consensus from the crypto community before I believed that. Do you have any references to back up this statement?

If it’s supposed to work the same as PBKDF2, then, frankly, you have bugs in your code. It should only need to generate one block. Instead it generates 32 blocks. Also, a key length of half the number of iterations is still arbitrary. What made you pick that? Why not pick a constant key length such as 128, 256, or 512 bits?

logic_earth · August 9, 2012, 4:28am

It is nonetheless a salt. One is static to the site and one is static per user. The requirements are irrelevant, as a salt is defined as a series of random bits forming one of the inputs to a one-way function (hash). What you use that salt for is not important; it is still defined as a salt. The definition of a salt does not stipulate that it is public or that it cannot be private; there is no such requirement. Again, I use the term peppering to define the second salt.

The URI format is harmless, but nonetheless useless. And if you were to get the opinion of the crypto community, I think they would tell you the same thing. Which is also why I think these ideas do need to be vetted before we include them in any learning guides.

If it is harmless what is to be vetted?

I still don’t think that’s correct, and I would need to see some research and consensus from the crypto community before I believed that. Do you have any references to back up this statement?

It’s been a while, and my original source is lost to the sands of time. But it does come down to the way block ciphers work, on fixed chunks of data.

If it’s supposed to work the same as PBKDF2, then, frankly, you have bugs in your code. It should only need to generate one block.

Then everyone else got it wrong. I have looked over countless sources, all in different languages. They all do the same thing. All have a for loop nested inside another for loop. The outer loop iterates over dkLen, the inner loop over the iteration count. They all DO THIS. If I am wrong, then all other implementations are wrong.

Instead it generates 32 blocks. Also, a key length of half the number of iterations is still arbitrary. What made you pick that? Why not pick a constant key length such as 128, 256, or 512 bits?

I wanted it to be dynamic to increase as the computation power increases. Acting as a multiplier for the iteration.

Here is PBKDF2 implemented in JavaScript:

github.com/bitwiseshiftleft/sjcl/blob/version-0.8/core/pbkdf2.js


/** @fileOverview Password-based key-derivation function, version 2.0.
 *
 * @author Emily Stark
 * @author Mike Hamburg
 * @author Dan Boneh
 */

/** Password-Based Key-Derivation Function, version 2.0.
 *
 * Generate keys from passwords using PBKDF2-HMAC-SHA256.
 *
 * This is the method specified by RSA's PKCS #5 standard.
 *
 * @param {bitArray|String} password  The password.
 * @param {bitArray} salt The salt.  Should have lots of entropy.
 * @param {Number} [count=1000] The number of iterations.  Higher numbers make the function slower but more secure.
 * @param {Number} [length] The length of the derived key.  Defaults to the
                            output size of the hash function.
 * @param {Object} [Prff=sjcl.misc.hmac] The pseudorandom function family.
 * @return {bitArray} the derived key.
 */
sjcl.misc.pbkdf2 = function (password, salt, count, length, Prff) {
  count = count || 1000;

  if (length < 0 || count < 0) {
    throw sjcl.exception.invalid("invalid params to pbkdf2");
  }

  if (typeof password === "string") {
    password = sjcl.codec.utf8String.toBits(password);
  }

  Prff = Prff || sjcl.misc.hmac;

  var prf = new Prff(password),
      u, ui, i, j, k, out = [], b = sjcl.bitArray;

  for (k = 1; 32 * out.length < (length || 1); k++) {
    u = ui = prf.encrypt(b.concat(salt,[k]));

    for (i=1; i<count; i++) {
      ui = prf.encrypt(ui);
      for (j=0; j<ui.length; j++) {
        u[j] ^= ui[j];
      }
    }

    out = out.concat(u);
  }

  if (length) { out = b.clamp(out, length); }

  return out;
};

Jeff_Mott · August 9, 2012, 4:48am

Other implementations will generate blocks up until they have at least keyLength bytes of output. That means your for loop should look like this:

for ( $block = 1; strlen($output) < $keyLength; $block++ ) {
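For comparison, here is a minimal sketch of that block loop written against PHP's built-in hash_hmac. The function name pbkdf2Sketch and its parameter names are my own, and the check at the end assumes hash_pbkdf2 is available (PHP 5.5 or later).

```php
<?php
// Sketch of the PBKDF2 block loop; function and parameter names are hypothetical.
function pbkdf2Sketch(string $algo, string $password, string $salt, int $iterations, int $keyLength): string
{
    $output = '';

    // Generate blocks only until there are at least $keyLength bytes.
    for ($block = 1; strlen($output) < $keyLength; $block++) {
        // U_1 = HMAC(password, salt || 32-bit big-endian block index)
        $last = $xor = hash_hmac($algo, $salt . pack('N', $block), $password, true);

        for ($i = 1; $i < $iterations; $i++) {
            // U_i = HMAC(password, U_{i-1}); the block is the XOR of all U_i
            $xor ^= ($last = hash_hmac($algo, $last, $password, true));
        }

        $output .= $xor;
    }

    // Truncate to the requested derived-key length.
    return substr($output, 0, $keyLength);
}

$derived = pbkdf2Sketch('sha512', 'password', 'salt', 1000, 64);
var_dump($derived === hash_pbkdf2('sha512', 'password', 'salt', 1000, 64, true)); // bool(true)
```

With a 64-byte key length and SHA-512, the outer loop runs exactly once; a second block is only generated when more than 64 bytes are requested.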

logic_earth · August 9, 2012, 4:52am

It is being used as a multiplier in this case. There is no bug in my code.

I know you are trying to help, but maybe less grasping at straws that don’t actually affect the security these functions provide. None of my changes affects the integrity of these functions.

2ndmouse · October 1, 2012, 8:25am

Not sure if I fully understand logic_earth’s methodology and maybe I’m being too simplistic here (I like ‘simple’). I use the following code to store and retrieve passwords - I believe it’s secure - hope it’s of some help:


//encryption - then stored in db
$key = 'NiceLongStringValue';
$pass = base64_encode(mcrypt_encrypt(MCRYPT_RIJNDAEL_256, md5($key), $string, MCRYPT_MODE_CBC, md5(md5($key))));

//decryption - after retrieving from db
$key = 'NiceLongStringValue';
$decrypt = rtrim(mcrypt_decrypt(MCRYPT_RIJNDAEL_256, md5($key), base64_decode($db_pass), MCRYPT_MODE_CBC, md5(md5($key))), "\0");
$pass = trim($decrypt);

If not of help or is not secure - constructive criticism welcome

Jeff_Mott · October 1, 2012, 8:46am

Secure-ish.

Standard practice is to hash salted user passwords. Hashes are one way. No one can discover the original passwords, not even you. But you’re not hashing, you’re encrypting (and not salting either). If an attacker can access your code as well as your database, then he’ll have the key, and he can decrypt all the passwords. That’s a worst case scenario, obviously, but it happens.

2ndmouse · October 1, 2012, 8:51am

Damn - thank you anyway

2ndmouse · October 1, 2012, 9:48am

There again, if they’ve gained access to the code and the db - you’re in big trouble anyway

Jeff_Mott · October 1, 2012, 10:20am

Absolutely. But if your users use the same password on multiple sites, which is unfortunately common, then a break-in on your site could avalanche into a break-in on multiple sites. For larger organizations, you also have to worry about internal data theft by employees who have easy access to both the code and the database. Salting and hashing is the catch-all solution. No one can access the plaintext password, because it’s never stored.
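As a minimal sketch of salting and hashing with nothing custom, the snippet below uses PHP's built-in password_hash and password_verify. These ship with PHP 5.5 and later, so they postdate this thread; the sample password is the one from the first post.

```php
<?php
// password_hash() generates a random salt and embeds it, with the cost, in the result.
$hash = password_hash('MyNotSoSecurePassword!!11!!eleven', PASSWORD_DEFAULT);

// Only $hash is stored; verification re-derives the hash from the submitted password.
var_dump(password_verify('MyNotSoSecurePassword!!11!!eleven', $hash)); // bool(true)
var_dump(password_verify('wrong guess', $hash));                       // bool(false)
```

The stored string cannot be turned back into the password, which is the property the encryption approach above lacks.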