Consider this a sort of part two to another post I started last year: http://www.sitepoint.com/forums/showthread.php?904383-Let-s-talk-security
There was a portion of the thread where people believed that brute forcing, rather than using a rainbow table, just wasn't practical. With my recent experience writing and researching multi-threaded applications, and my early research into cluster computing, I can see that this is untrue, and more recently this popped up in my news feeds: http://www.dailymail.co.uk/sciencetech/article-2331984/Think-strong-password-Hackers-crack-16-character-passwords-hour.html
So there was some talk about running MD5 more than once on the string (which only helps if the source code has not been revealed to the attacker), and about ensuring that you have a unique salt per row (this, I can tell you, will help, but it won't slow down anyone who knows what they are doing). But it seems to me that the only way to stay ahead of cluster computing is security by obscurity: masking the way that you prepare your hash.
Let the discussion begin... again... What are your thoughts and suggestions?
Predicting the future of technological innovation is at best a guess, but in the interim we can expect people to be forced into using longer passwords. Who knows, maybe we'll come full circle and once a month the bank will send a courier over with something like a cheque book of one-time passwords.
Of course what's really worrying (and the article fails to mention it) is that the brute-forcing setup used by Ars Technica is neither new nor unique to them. I could name the freely available software used, but it would probably be deleted by an admin. Stories have been popping up for over five years now about home-brew cluster servers, using everything from PlayStations to high-end graphics cards, built by amateurs interested in cryptography. Some have even demonstrated how 'cloud' services from the likes of Amazon and Google can be configured to crack passwords even faster than described in that article.
Simply locking the account for a period after each wrong password attempt will defeat any brute-force attack over the network. If a password will only be accepted when at least 5 or 10 seconds have passed since the last login attempt, then it doesn't matter how many processors you throw at it: testing passwords faster than the lock period will simply keep the account locked and result in the actual password being rejected along with all the invalid guesses, unless the attacker gets lucky and guesses the password on the very first try.
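That lockout rule can be sketched in a few lines. Everything here is illustrative (the names, the in-memory table, and the 5-second window are assumptions, not anyone's production code):

```python
import time

# Per-account lockout sketch: reject any attempt made less than
# LOCK_SECONDS after the previous one, regardless of whether the
# password is correct. Hammering the account only keeps it locked.
LOCK_SECONDS = 5
last_attempt = {}  # username -> unix time of last login attempt


def may_attempt(username, now=None):
    now = time.time() if now is None else now
    previous = last_attempt.get(username)
    last_attempt[username] = now  # every attempt, good or bad, resets the clock
    return previous is None or (now - previous) >= LOCK_SECONDS
```

Note that because every attempt resets the timer, guessing faster than the lock period can never succeed, which is exactly the behaviour described above.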
Rainbow tables only work if you have access to the hash of the password and know how the hash was generated.
I'm afraid my advice isn't going to be much different than what I gave in the other thread.
1) Unique salt. This neutralizes rainbow tables, and it ensures the attacker has to brute-force each password individually.
2) Key strengthening (aka, iterations). This uses computing power to counter computing power.
The password list from the article that was so easily hacked did neither of these things.
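Both measures above are available off the shelf. A minimal sketch using Python's standard-library PBKDF2; the parameter choices (SHA-256, a 16-byte salt, 5000 iterations) are illustrative, not a recommendation from this thread:

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=5000):
    # A fresh random salt per user defeats rainbow tables; the iteration
    # count multiplies the attacker's per-guess cost.
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest


def verify_password(password, salt, expected, iterations=5000):
    _, digest = hash_password(password, salt, iterations)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(digest, expected)
```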
I very much disagree! Security by obscurity is never the answer. I don't understand why people keep falling into this trap. The problem with security through obscurity is that the schemes we come up with are mostly quite trivial (ROT13 and/or Base64, anyone?) and very easy to reverse engineer with a bit of brute forcing.
Take CSS for example (the DVD encryption, not Cascading Style Sheets), which is proprietary, and got hacked without much effort. Here in the Netherlands we got a new system to pay for public transport, with proprietary encryption, and it was hacked within a few weeks (everyone with a card reader of ~$30 could "check in" and "check out" at home, effectively travelling for free).
The common misconception is that if the encryption/decryption algorithms are known, it's easy to decrypt something, but this is absolutely not true. If anything, it is a good thing that the algorithms are out in the open so anyone can verify that they work and aren't flawed. Plus, if you know how decryption works, you can calculate how long it takes to brute force, and give up before you even try because it's just impossible.
The trick to stay ahead of cluster computing is, and will always be, strong keys.
I'm referring to obscuring the way that you hash. I'm not at all suggesting abandoning the usual salt techniques. What I am saying is that if I have a nice cluster set up, and I have accessed your users table along with the way that you hash your passwords, I AM GOING TO BRUTE FORCE IT. All of them.
Sure, felgall, brute forcing over a network can be stopped. Everyone knows that.
Jeff, I've gone down this road with you before, I'm afraid. Iterating a key will not help if I know how you created your hash (how many iterations, and when and how you implemented your salt).
As a hacker who has downloaded a nice hunk of usernames and hashed passwords, I will set them up in an indexed table (index on the hashed value), ensure that they are stored in memory, and then begin a concurrent brute force, each hash will check to see if it exists in the table, and on match, spit out the password used and the username that went with it to a results table.
I'm sorry, guys, but my point with this thread is that, with how quickly concurrent computing is coming along, there is no countering computing power with computing power. Hand me a table of users, uniquely salted or not, along with how the hash was built, and I'll show you an example on EC2.
My point here, folks, is that we should start looking at ways to hide the way in which we hash our passwords.
Of course it helps. (It's standard practice for a reason.) It increases the time it takes to perform a brute-force search. So, for example, if you iterate the hash 5000 times, then a brute-force search will take 5000x longer to complete. This one factor alone would turn a 1-hour brute-force search into a 7-month search. That's a huge difference.
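The arithmetic behind those figures checks out as a quick back-of-the-envelope calculation:

```python
# Each guess now costs 5000 hashes instead of 1, so a 1-hour
# brute-force search stretches to 5000 hours.
hours = 1 * 5000
days = hours / 24      # about 208 days
months = days / 30.4   # about 6.9 months, i.e. roughly 7 months
```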
Does what I said make sense? Or do you still want a table of users?
You're going to hash your passwords 5,000 times on the fly every time someone wants to log in?
Yup. The Ars Technica article you referenced mentioned that 5000 is the number of iterations Mac OS X uses. And computers are fast enough that 5000 hash iterations won't hinder normal response times.
It's not the user's computer that you have to think about. It's your server that is handling multiple users all at the same time. While 5000 iterations is fine for a personal computer handling only one user, on a server handling hundreds of users every hour it takes a huge toll.
For the record, I use about 256 iterations with salting and peppering, or rather two salts: one unique to the user, the other a site key. I also use very large salts. On the first pass, I put the password through 128 iterations with the user's generated key, then a second pass of 128 iterations with the site key. So an attacker would need two points of data: the DB records and the source code. But they would also need my implementation, as I have made subtle changes to the behavior (which do not affect security) of the hashing and iteration methods.
Example of how I get my salts:
$example_sitekey = mcrypt_create_iv( 4096, MCRYPT_DEV_RANDOM ); // Static in Production
$example_userkey = mcrypt_create_iv( 4096, MCRYPT_DEV_RANDOM ); // Generated during registration
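A rough sketch of the two-pass scheme as described above. This is NOT the poster's actual implementation (which includes private modifications and uses Whirlpool); SHA-512 is substituted here purely for portability:

```python
import hashlib


def two_pass_hash(password, user_key, site_key, rounds=128):
    digest = password.encode()
    for _ in range(rounds):  # pass 1: iterate with the per-user salt
        digest = hashlib.sha512(digest + user_key).digest()
    for _ in range(rounds):  # pass 2: iterate with the site-wide key
        digest = hashlib.sha512(digest + site_key).digest()
    return digest
```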
Holy hell. You weren't kidding. A normal recommended salt size is 64 bits. Or, if we want to be ultra secure, we could pick a salt the same size as the hash. That might be, for example, 128 or 256 bits. Anything beyond the size of the hash is redundant.
Your salts are 32,768 bits........
Yup, large and overly redundant; it makes no difference just as long as it's not small. The more data a digest can chew on the better.
The more data a digest can chew on the better.
I know you didn't take my word for it last time, but I'm still hoping you'll run your ideas by the crypto community. If I'm full of crap, then they'll praise you for your strong security. But if I'm right, and you're tinkering with crypto algorithms based on some misconceptions of how they work, then it's important to find out sooner than later.
It's not a misconception. Algorithms based on block ciphers work on blocks; the more blocks of data you have, the more rounds it goes through to compress that data into the digest, with less padding required.
The Merkle–Damgård hash function first applies an MD-compliant padding function to create an output whose size is a multiple of a fixed number (e.g. 512 or 1024) — this is because compression functions cannot handle inputs of arbitrary size. The hash function then breaks the result into blocks of fixed size, and processes them one at a time with the compression function, each time combining a block of the input with the output of the previous round....
And the hashing function I use, Whirlpool, is a Merkle–Damgård construction based on AES.
Do I need such large salts? No. Does it hurt anything? No.
If I wanted I could throw my salts into a hashing function and make them smaller if it will make you feel better.
Algorithms will pad the data to a multiple of the block size, but less padding will not increase the security of the hash. Nor will adding redundant blocks of data. But don't argue the point with me. Talk to the crypto community.
I re-read this sentence, and even to me, this came out sounding like, "I'm right. Don't argue." That's not how I meant it, so let me apologize in advance. What I meant was, neither of us are cryptographers, and we should instead talk with people who have a greater depth of knowledge on this subject, especially if we're going to tinker with the standardized algorithms that are known to be secure.
Except there isn't any tinkering with the standardized algorithms going on here. Feeding in a large salt does not tinker with them. But we already agreed that it DOES NOT MATTER.
However, padding can have an impact on security. See: https://en.wikipedia.org/wiki/Length_extension_attack
The presence of padding is not what makes a length extension attack possible. I don't see how the wiki article would even give you that impression. It's mentioned only as one more thing an attacker would have to reconstruct. And in any case, the HMAC algorithm already protects against length extension attacks. Trying to control the amount of padding gets you zero added security.
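For reference, HMAC is a one-liner in most standard libraries, and it is the keyed, nested construction, not any control over padding, that blocks length extension. A minimal Python sketch:

```python
import hashlib
import hmac


def make_tag(key, message):
    # HMAC runs two nested keyed hash passes; without the key an
    # attacker cannot extend the message, padding notwithstanding.
    return hmac.new(key, message, hashlib.sha256).hexdigest()
```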
And I'm not even asking you to take my word for it. Please, please talk to the crypto community. If what you're doing is good and correct, then peer review can only give you added confidence.
That's what this discussion has been assuming.