Thursday, November 03, 2011

How to calculate Password Strength...

When I visit websites that want me to create an account before doing something, I typically enter in bogus information and occasionally I see a "password meter" that determines that I've entered a "weak" password.  At least it is considered "weak" by some systems and "average/strong" in others.  Being the curious sort of person, I've been trying to come up with a good, consistent strategy for calculating password strength and then something useful to do with it.  I assume most developers only want to write password strength code one time, do something useful with it, and then move onto the next task.

What constitutes a strong password?  An excellent question and something the industry seems to have difficulties figuring out at the moment.  NIST, the National Institute of Standards and Technology, has a few words to say on the topic.  Basically, password strength boils down to the number of bits of entropy that a password has.

So the next question is:  How does one calculate the number of bits of entropy of a password?  NIST has proposed the following rules:

  • The first byte counts as 4 bits.
  • The next 7 bytes count as 2 bits each.
  • The next 12 bytes count as 1.5 bits each.
  • Anything beyond that counts as 1 bit each.
  • Mixed case + non-alphanumeric = up to 6 extra bits.

After thinking about this for a while and crunching some numbers, the NIST proposal seemed decent enough.  Plus, implementing the algorithm in software is easy.  Unlike a lot of programmers, I don't take algorithms like this at face value and just implement them blindly.  I like to test their claims first and sometimes introduce my own ideas.

During my research phase, I also ran across this xkcd web comic:

While I don't generally care for xkcd comics, this one actually made me think.  Is "correcthorsebatterystaple" actually 44 bits of entropy?  I immediately said to myself "no" but, being a sucker for self-inflicted punishment, I went to check.

Modern security is best done by not "handing out" bits of entropy.  Basically, you only want to say that something has "at most 'x' bits of entropy" if that something passes a number of tests.

My initial calculations of "correcthorsebatterystaple" against the NIST rules result in a maximum of 41 bits of entropy.  So, xkcd handed out three undeserved bits.  Three bits doesn't sound like a lot, but it is the difference between 550 years and 69 years.  Also, this is a phrase using words from the dictionary and the letters of the phrase have a lot of repetition, so I modified the NIST algorithm to only count a multiplier of 75% of each repeat character (e.g. the first 'r' counts for 100% of the bits, the next 'r' counts only for 75% of the bits, etc).  This drops the number of bits of entropy to 34.2.  My own algorithm runs some more tests but the result still comes out to 34 bits.  This is 10 bits fewer than the xkcd comic's claim for that strategy to selecting a password.  Or, more simply put, roughly half a year at 1000 guesses/sec.  Not 550 years.

For the "Tr0ub4dor&3" part of the comic, my same base algorithm gives me 22.1 bits, not 28 bits.  Or roughly 1.2 hours at 1000 guesses/sec.  However, the extra tests in my algorithm actually calculate it as having about 14 bits of entropy.

Of course, the actual time it takes to guess the password is going to be somewhere between my theoretical time and xkcd's theoretical time.  So, while on a surface level, xkcd seemed way off, the approach to selecting a good password with four random words isn't actually half bad and is a considerable improvement over today's passwords that users select.  And we are all for anything that improves the passwords that users select...right?

If the computer generates dictionary-based passwords for the user, then that will eliminate the human equation.  A human, following xkcd's rules, would likely select words of things that they can see, hear, and touch in their immediate vicinity - making it easier for a hacker to build a small attack dictionary.  A quick glance around me and I get "computerkeyboardvideogamenetflixmoviewikipedia" - a great password from the algorithm's perspective except for the fact that I'm sitting at a computer with a keyboard, I love playing video games, I have a Netflix DVD movie in front of me, and I visited Wikipedia recently.  So, even though I'm sure it has terrific entropy, it is a terrible password.  Hence the need for a computer to generate dictionary-based passwords.

One thing I've noted with so-called "password strength meters" is that they are generally useless.  The server-side doesn't enforce password strength requirements.  Part of the problem is that no one seems to have come up with a solid algorithm (other than NIST) and no one has actually said, "here is how you should use this password strength meter..."  This combination is useless to the average server-side programmer, so the only thing I've seen done is display the meter to the user and hope they make a strong password.  Ha!  As if that will ever happen.

What should actually happen:  Calculate the number of bits of entropy in a password using a good algorithm and then apply a minimum threshold.  (And get rid of the silly password meter.)  If the password exceeds the threshold, then the password is strong enough, otherwise the user has to try again.

This approach allows a web forum owner to choose a minimum threshold of 18 bits of entropy and an online bank to choose a minimum threshold of 40 bits of entropy (or more).  Each industry requires different password strengths.  A password strength meter, if you really need one, could then become useful for prevalidation against the threshold.  After all, the user only cares to know the answer to the question, "Will my password be strong enough so I won't have to resubmit this form?"

I'm still nitpicking at my algorithm, which is one of the reasons why it isn't being published (yet).  So, don't get your knickers in a twist over my "hand-waving" claiming to have a solution and not playing nice in the kiddie pool and sharing said solution.  The algorithm will be released in due time.

Another example:  Google recently published an online safety guide on their homepage.  This guide contains a YouTube video and other similar textual instructions on creating so-called strong passwords.  The YouTube video shows "Garden1ngF@n" as a "strong password".  For that password, my base algorithm gives 22.8 bits of entropy, but the extra tests portion of my algorithm calculate it as having around 16 bits of entropy.  Yikes.  Seems really broken to me.

So it looks like I'm onto something here.  My algorithm is blasting apart even Google's recommendations for strong passwords.  Woot!

Read the next part in this three part series: Calculating Password Strength: Part II