How to convert from sha1 passwords to sha1+salt?

I have a web site which stores passwords as sha1 hashes in the database and I’d like to increase the security by adding salts. Is there a way to convert existing passwords in sha1 to sha1 with a salt? My guess is not but I want to make sure. My current plan is to do conversion individually at the moment a user logs in - he has to provide the password so then the system can hash it again with a salt and store it. Are there any better options? Ideally I’d like to convert all passwords in one go.

There is actually a way you can do this without having to do it on a per-user basis.

Presuming your current password hashing scheme is:


password_hash = sha1(raw_password)

You would change it to:


password_hash = hmac_sha1(sha1(raw_password),salt)

This would allow you to salt all existing passwords without knowing the original.
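A minimal Python sketch of that one-off conversion, assuming the database holds the hex SHA-1 digest and that a random salt is stored next to each user. The names `wrap_legacy_hash` and `verify_wrapped` are made up for illustration and are not from the post above.

```python
import hashlib
import hmac
import os


def wrap_legacy_hash(legacy_sha1_hex: str, salt: bytes) -> str:
    """Salt an existing SHA-1 hex digest without knowing the password."""
    # The salt is used as the HMAC key; the old hash is the message.
    return hmac.new(salt, legacy_sha1_hex.encode("ascii"), hashlib.sha1).hexdigest()


def verify_wrapped(raw_password: str, salt: bytes, stored_hex: str) -> bool:
    """Recompute sha1(password), wrap it with the salt, and compare."""
    legacy = hashlib.sha1(raw_password.encode("utf-8")).hexdigest()
    candidate = wrap_legacy_hash(legacy, salt)
    return hmac.compare_digest(candidate, stored_hex)


# One-off conversion for a single row: only the old hash is needed.
old_hash = hashlib.sha1(b"hunter2").hexdigest()  # what the database holds today
salt = os.urandom(16)                            # assumed: one salt per user
new_hash = wrap_legacy_hash(old_hash, salt)
```

At login the raw password is hashed with plain SHA-1 first and then wrapped, so the check still works for every converted row.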

Thanks, this looks good, indeed the solution is very simple! I also didn’t know about the hmac functions - is hmac_sha1 with a salt as the key more secure than standard sha1 with a salt appended to the password?

I can’t see any way in which that would prevent any of the alternate passwords that hash to the same value from continuing to work. The purpose in using a salt is to reduce the number of password values that will match the hash and so you need to know the actual password in order to recreate the hash including the salt in order to limit the acceptable values to that one password and to disallow any other input that creates the same hash (since adding the salt means that the hash values for a list of passwords that would have produced the same hash will now all produce different hash values).

All hmac_sha1 does is to create a keyed hash where you need both the key and the original value to be passed separately in order to get a match with the hash. Since the function applies the key to the already hashed value it does nothing to block using any value for the password that matches the hash. It would prevent being able to use a rainbow table to convert the hashes back into values that will work as passwords if the database is compromised and someone manages to steal the content which would make it more secure than not using it. Because any value that produces the same hash will still work, though, it is nowhere near as secure as using a salt.

To be able to apply a proper salt to a password you need the original password and not the hash and since there is no way to get the original password from the hash the only way of adding a salt or changing a salt is to do it when each person next logs in to the database. You’d use the password without the salt to match the existing hash and then apply the salt to create a new hash. Best way to handle it would probably be to add a new field to capture the new salted hash and only get rid of the old hash field once you have given everyone enough time to have logged in and generated their new salted hash.

Indeed, that is a good point, using the proposed hashing scheme each password would still have the same possible alternate passwords whereas appending a salt to the raw password completely changes the alternatives.

The purpose in using a salt is to reduce the number of password values that will match the hash

I’m not sure this is entirely true because regardless of whether a salt is used or not each hash will have an infinite number of alternate password values, isn’t this true? But I think I know what you mean - by using a salt it becomes much more difficult to use rainbow tables to find those alternate values so in practice it’s like reducing their number.

and so you need to know the actual password in order to recreate the hash including the salt in order to limit the acceptable values to that one password and to disallow any other input that creates the same hash

As I said, I don’t think it’s possible to disallow any other input using a hash. It’s possible to find many other possible passwords for the same hash even when the salt is used, it’s just much more difficult, because an attacker would need to produce a separate rainbow table for each salt. I think completely disallowing any other input is possible only using two-way encryption.
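To illustrate the per-salt point, here is a small Python sketch, assuming the salt is appended to the raw password before hashing. The helper `salted_sha1` and the sample password are hypothetical.

```python
import hashlib
import os


def salted_sha1(raw_password: str, salt: bytes) -> str:
    # Salt appended to the raw password, then hashed once.
    return hashlib.sha1(raw_password.encode("utf-8") + salt).hexdigest()


salt_a, salt_b = os.urandom(16), os.urandom(16)

unsalted = hashlib.sha1(b"letmein").hexdigest()
hash_a = salted_sha1("letmein", salt_a)
hash_b = salted_sha1("letmein", salt_b)
```

The same password gives three different stored values here, so a table precomputed for unsalted SHA-1 (or for `salt_a`) is of no use against the row that uses `salt_b`.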

All hmac_sha1 does is to create a keyed hash where you need both the key and the original value to be passed separately in order to get a match with the hash. Since the function applies the key to the already hashed value it does nothing to block using any value for the password that matches the hash. It would prevent being able to use a rainbow table to convert the hashes back into values that will work as passwords if the database is compromised and someone manages to steal the content which would make it more secure than not using it.

That’s what I was thinking about - to protect users’ passwords from being discovered when someone steals the database.

Because any value that produces the same hash will still work, though, it is nowhere near as secure as using a salt.

But when using a salt there can still be possible alternate values. I don’t yet fully understand how that would change anything. I suppose it could make the system more secure against brute force attacks because the alternate passwords would be very difficult to guess.

Best way to handle it would probably be to add a new field to capture the new salted hash and only get rid of the old hash field once you have given everyone enough time to have logged in and generated their new salted hash.

I’m still thinking about the best way to go with it. The problem is there may never come a time when all users have logged in and changed their password hashes to the new salted version. Because HarryR’s solution is better than what I have now, I can change all users’ hashes to the hmac_sha1 algorithm (salting the hash) and, on top of that, when a user logs in, recreate their hash with a proper salt appended to the plain password. At least all passwords would be immediately protected against database theft.
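A rough Python sketch of that two-stage plan, assuming each row carries a marker saying which scheme its hash uses. The `User` class, the `scheme` field and its two values are hypothetical names for illustration only.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass


@dataclass
class User:
    salt: bytes
    password_hash: str
    scheme: str  # "wrapped" = hmac(salt, sha1(pw)); "salted" = sha1(pw + salt)


def wrapped_hash(raw_password: str, salt: bytes) -> str:
    inner = hashlib.sha1(raw_password.encode("utf-8")).hexdigest()
    return hmac.new(salt, inner.encode("ascii"), hashlib.sha1).hexdigest()


def salted_hash(raw_password: str, salt: bytes) -> str:
    return hashlib.sha1(raw_password.encode("utf-8") + salt).hexdigest()


def login(user: User, raw_password: str) -> bool:
    """Verify against whichever scheme the row uses; upgrade wrapped rows."""
    compute = wrapped_hash if user.scheme == "wrapped" else salted_hash
    if not hmac.compare_digest(compute(raw_password, user.salt), user.password_hash):
        return False
    if user.scheme == "wrapped":
        # The plain password is available now, so store the properly salted hash.
        user.salt = os.urandom(16)
        user.password_hash = salted_hash(raw_password, user.salt)
        user.scheme = "salted"
    return True
```

Users who never log in again simply stay on the wrapped scheme, which is still better than the bare SHA-1 they have today.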

I have to concur that the best way I could think of would be as follows:

  1. capture the plaintext password as the user logs in
  2. check it against the database without the salt
  3. if it’s successful then rehash with the salt and save to database. increment a counter or use some other method to keep track of how many users have had their password converted

that’s the gist. there is a Drupal mod that does something similar and which I’ve used as a basis for just such an algorithm

But only a very small fraction of those will contain the salt and since the salt is added after the password is input it would need to be a value that with the salt added produces the appropriate hash. Assuming that you also limit the length of a password to a reasonable value (say not more than a few thousand characters) then there will probably be only the one value that contains the salt and which will hash to the specific value.

The method I described is exactly the same as what the HMAC-SHA1 algorithm does when the key length is longer than 64 bytes. Hashing a uniquely user-identifiable piece of information using their password as the key is as strong as it gets.

BS because of aforementioned reasons. If you don’t use a unique salt for every user then that’s the problem.

No - the problem is if you don’t add the salt before you hash the password. Adding a salt after hashing the password just means that any value that creates that hash will work as the password, whereas the whole point in using a salt is to eliminate all of the values that will generate that specific hash that do not contain the salt.

The difference between using a single salt on all passwords and using a separate salt for each is far less in terms of security than the difference between adding a salt after hashing and adding one before. Basically from most secure to least secure you have:

  1. separate salt added before hashing
  2. single salt added before hashing
    .
    .
    .
    .
    .
    .
    .
    .
  3. no salt added before hashing but added after
  4. no salt added at all

Another consideration is that hashing was never intended for security in the first place. A basic hash is intended to be used for tamper detection. You hash a message or program and supply the hash along with the message or program. The person receiving the message or program can then produce the hash again and compare that to the one supplied in order to confirm that the message or program hasn’t changed since the original hash was produced. hmac_sha1 is intended for use in authentication: the key gets given to the intended recipient by the sender and both use that same key with all the hashes on the messages they pass back and forth so as to confirm both that the message hasn’t been tampered with and that it actually comes from the other person who knows the key. Neither of these uses has anything at all to do with security of passwords.
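For reference, the two uses described above look roughly like this in Python. The key, the message and the helper names `sign` and `is_authentic` are placeholders chosen for the example.

```python
import hashlib
import hmac

shared_key = b"key agreed between sender and recipient"  # hypothetical value


def sign(message: bytes, key: bytes) -> str:
    # Keyed hash: only someone holding the key can produce a matching tag.
    return hmac.new(key, message, hashlib.sha1).hexdigest()


def is_authentic(message: bytes, tag: str, key: bytes) -> bool:
    return hmac.compare_digest(sign(message, key), tag)


message = b"transfer 10 credits to alice"
plain_digest = hashlib.sha1(message).hexdigest()  # tamper detection only
tag = sign(message, shared_key)                   # tamper detection plus origin
```

Anyone can recompute `plain_digest`, but `tag` only verifies for a recipient who holds the same key as the sender.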

I think there must be a hole in this reasoning because you seem to be expecting the impossible: an (almost) 1-to-1 mapping of all possible short hash values to all possible long password values. Let’s say you store passwords as sha1 40-character hexadecimal hashes (0-9, a-f) and let’s assume users use only 0-9 and a-f for passwords and their passwords are 400 characters long. It’s impossible you won’t run into many collisions because you are squeezing 400 characters into 40 characters and salts don’t change this simple fact. You will never have enough 40-character combinations to represent all possible 400-character combinations, therefore for each 40-character hash you will have many possible 400-character passwords. Now add the fact that passwords can contain all other letters, lower and upper case, special characters, can be of variable length (up to thousands of characters!) and the difference is magnified even more.

The salt will not reduce collisions, it will only change them to different ones and make them much harder to guess.
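The counting argument can be checked with a few lines of Python, using the numbers from the example above (400 hex characters in, 40 hex characters out). The variable names are just for illustration.

```python
# Hex passwords of length 400 versus 40-hex-digit SHA-1 output.
possible_passwords = 16 ** 400
possible_hashes = 16 ** 40  # 160 bits

# Average number of 400-character passwords per hash value. By the pigeonhole
# principle at least one hash value must have at least this many preimages.
average_preimages = possible_passwords // possible_hashes

print(average_preimages == 16 ** 360)  # True
```

Appending a fixed salt to every candidate does not change either count, which is why the number of collisions stays the same and only their identity changes.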

You are correct that a salt will not reduce the number of total possible values that get mapped to a given hash. What a salt does is to reduce the number of those that are valid since only those that contain the salt are now acceptable. That effectively eliminates 99.99999999999%+ of the possible values that map to the hash but do not contain the salt. That’s the whole point in using a salt in the first place since not just any value that produces a given hash can be used - only the small subset that contain the salt (typically only one if the password is a realistic length) will work because all of the other possible values within that length for producing the hash don’t contain the salt.