strong passwords

People are beginning to hear about the idea of using words and spaces, instead of crazy characters, to make strong passwords. Cases in point: “fluffy is puffy” is more secure than “J4fS<2”, and “correct horse battery staple” is more secure than “Tr0ub4dor&3”. In both cases the more secure password is also the easier one to remember.

When people see examples like these, they tend to make a few mistakes because they don’t really understand the principles behind them. The word method is useless if your password ends up being short (fewer than 12 characters), if you use a well-known phrase (“happy go lucky”), or if you draw from a limited set of words (“five three two nine six four eight ten three two”). Using only common words is bad too.

Here is one way to think about why: the security of your password can be measured by the number of possibilities a guesser has to work through. Traditionally, this has been measured character by character. So in a lowercase-letters-only password, there are 26 possibilities per slot. A six-slot password has 308,915,776 possibilities (which is not very secure).

“hsufbe” = 6 chars, 6 slots
26 ^ 6 = 308,915,776

The problem is that this is only true if password guessers work like this: aaaaaa, aaaaab, aaaaac, aaaaad… and so on. But if your password is “happy!”, it’s going to get guessed by a dictionary attack much, much sooner.
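To make the contrast concrete, here is a rough Python sketch. It assumes a lowercase-only brute forcer that counts up from “aaaaaa”, and a dictionary guesser that tries a handful of words with a few common endings. The word list, the endings, and the function names are all made up for illustration; real cracking dictionaries are vastly larger.

```python
import string

ALPHABET = string.ascii_lowercase  # 26 values per slot

def brute_force_position(password):
    """1-based position of a lowercase password in aaaaaa, aaaaab, ... order."""
    index = 0
    for ch in password:
        index = index * len(ALPHABET) + ALPHABET.index(ch)
    return index + 1

# Hypothetical, tiny stand-ins for a real cracking dictionary.
WORDS = ["love", "happy", "monkey", "dragon"]
SUFFIXES = ["", "1", "!", "123"]

def dictionary_position(password):
    """1-based position among word+suffix guesses, or None if never guessed."""
    guesses = [word + suffix for word in WORDS for suffix in SUFFIXES]
    if password in guesses:
        return guesses.index(password) + 1
    return None

print(f"{brute_force_position('hsufbe'):,} guesses by brute force")
print(f"{dictionary_position('happy!')} guesses by dictionary")
```

The random-looking “hsufbe” sits tens of millions of guesses deep in the brute force ordering, while “happy!” falls out of this toy dictionary within the first handful of tries.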

Therefore, we need to make a more general rule:

(values in category) ^ (slots) = possibilities
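As a quick sanity check, the rule can be written as a couple of lines of Python. The function names are my own, and the “bits” version is just the same number on a log scale, included as a convenience for comparing very large results.

```python
import math

def possibilities(values_in_category, slots):
    """(values in category) ^ (slots)."""
    return values_in_category ** slots

def bits(values_in_category, slots):
    """The same quantity expressed as a power of two (log base 2)."""
    return slots * math.log2(values_in_category)

print(f"{possibilities(26, 6):,}")   # 308,915,776
print(f"{possibilities(95, 6):,}")   # 735,091,890,625
print(f"{bits(26, 6):.1f} bits")
```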

If you’re working with characters on the keyboard (e.g., these: abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ 0123456789 `~!@#$%^&*()_+-={}[]\|;':",./<>? plus the space) then you have 95 values in the category. A six-character password then has 735,091,890,625 possibilities.

“h&'8,}” = 6 chars, 6 slots
95 ^ 6 = 735,091,890,625

While this is better, it’s still not fantastic. But here’s where English words really come in handy. There are between 300,000 and a million English words, depending on your dictionary and how you define words. Let’s use the lower range and assume 300,000 possible values per slot (the slots are the WORDS now, not the characters).

“fluffy is puffy” = 15 chars, 3 slots
300,000 ^ 3 = 27,000,000,000,000,000

“correct horse battery staple” = 28 chars, 4 slots
300,000 ^ 4 = 8,100,000,000,000,000,000,000
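One way to get a feel for these numbers is to ask how long a truly random keyboard password would have to be to match them. The sketch below does that, assuming the same 95-character keyboard and 300,000-word dictionary as above; the function name is my own.

```python
def equivalent_random_chars(values, slots, charset_size=95):
    """Smallest count of random characters with at least values ** slots possibilities."""
    target = values ** slots
    length = 0
    while charset_size ** length < target:
        length += 1
    return length

for phrase, slots in [("fluffy is puffy", 3), ("correct horse battery staple", 4)]:
    chars = equivalent_random_chars(300_000, slots)
    print(f"{phrase!r}: {300_000 ** slots:,} possibilities, "
          f"matched by {chars} random keyboard characters")
```

Under these assumptions, four words hold up about as well as something between 11 and 12 completely random keyboard characters, which almost nobody can memorize.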

The important thing to notice here is that we’re not calculating by characters anymore. A brute force cracker going character by character would have an impossibly hard time with passwords this long, so a dictionary attack is the cracker’s best shot, and that’s the attack we’re measuring against.

Even though you’re using words with 5 and 6 characters, you don’t get to count each character as a slot: the characters get chunked into one slot. Similarly, if you use a phrase, even though you’re using multiple words, you don’t get to count each word anymore: the words get chunked into a single phrase slot. I have no idea how many common phrases there are, but I’m sure there are password programs that take sentence fragments from the internet and try them as passwords. What is the probability that such a program will hit upon the phrase “jimmy crack corn”? Hard to say. If it’s drawing text from a transcript of Pinky and the Brain, then your odds might be pretty bad. But the big point is that your slots have now been reduced to 1. Let’s assume that the cracker is drawing from a trillion phrases.

“jimmy crack corn and i dont care” = 32 chars, 1 slot
1,000,000,000,000 ^ 1 = 1,000,000,000,000

Not very good. That’s barely better than a six-character keyboard password (735,091,890,625), even though it’s 32 characters and 7 words long.
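Here is a small sketch of the chunking idea. The phrase set is a hypothetical stand-in for whatever a cracker has scraped from the internet, and the sizes are the same guesses used above (a trillion phrases, 300,000 words).

```python
# Hypothetical stand-in for a phrase list scraped from the internet.
KNOWN_PHRASES = {"jimmy crack corn and i dont care", "happy go lucky"}
PHRASE_LIST_SIZE = 1_000_000_000_000  # assumed: a trillion phrases
WORD_LIST_SIZE = 300_000              # assumed: words in the dictionary

def possibilities(password):
    """A known phrase is one slot; otherwise each word is its own slot."""
    if password in KNOWN_PHRASES:
        return PHRASE_LIST_SIZE ** 1
    return WORD_LIST_SIZE ** len(password.split())

phrase = possibilities("jimmy crack corn and i dont care")
six_chars = 95 ** 6
print(f"phrase:  {phrase:,}")
print(f"6 chars: {six_chars:,}")
print(f"ratio:   {phrase / six_chars:.2f}")
```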

So: things to keep in mind. Draw your “chunks” or slots from categories with a very large number of values. The more the better: drawing common English words is ok if you use a lot of them. Drawing from a larger range of English words (e.g. include scientific words, place names, proper nouns, stuff that would get you disqualified from Scrabble) means you can get away with using fewer slots. If you also use other languages, you’re even better off.
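If you would rather let a computer do the drawing, something like the following works as a minimal sketch. The ten-word list is only a placeholder to keep the example self-contained (a real list should have many thousands of entries), and the function name is my own. It uses Python’s standard secrets module so that each slot is picked independently and unpredictably.

```python
import secrets

# Placeholder list for illustration only; far too small for real use.
WORDS = [
    "fjord", "quark", "zugzwang", "saudade", "xylem",
    "mitochondria", "Ljubljana", "bodhisattva", "petrichor", "oblast",
]

def make_passphrase(words, slots):
    """Pick each slot independently and uniformly from the word list."""
    return " ".join(secrets.choice(words) for _ in range(slots))

phrase = make_passphrase(WORDS, 5)
print(phrase)
print(f"{len(WORDS) ** 5:,} possibilities with this tiny list")
```

The arithmetic is the same as before: the strength comes from the size of the list and the number of slots, not from how odd any individual word looks.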

But also remember that you’re always limited by the less sophisticated password cracking algorithms too. The following are all extremely uncommon words drawn from various languages and technical vocabularies: xi af ju. Let’s assume that for some strange reason you’re somewhat familiar with these words and so they’re easy for you to remember. So you might think:

“xi af ju” = 3 slots
(~6 million) ^ 3 = (absurdly large number)

but in fact

“xi af ju” = 8 chars, 8 slots
27 ^ 8 = 282,429,536,481 (26 lowercase letters plus the space)
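Put another way, a password is only as strong as the cheapest attack against it. A rough sketch, assuming the roughly 6 million word multilingual list and the 27-character set (lowercase letters plus space) from this example; the function names are my own:

```python
def word_model(password, word_list_size):
    """Dictionary attack: every word is one slot."""
    return word_list_size ** len(password.split())

def char_model(password, charset_size):
    """Brute force attack: every character is one slot."""
    return charset_size ** len(password)

def effective_possibilities(password, word_list_size, charset_size):
    """The attacker picks whichever model needs fewer guesses."""
    return min(word_model(password, word_list_size),
               char_model(password, charset_size))

password = "xi af ju"
print(f"word model: {word_model(password, 6_000_000):,}")
print(f"char model: {char_model(password, 27):,}")
print(f"effective:  {effective_possibilities(password, 6_000_000, 27):,}")
```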

So you’ll beat the sophisticated dictionary attack but lose to a persistent brute force attack. Likewise, you may have several words that are all part of the same narrow category (e.g. numbers, as in the example above), in which case you now only have ten values per slot, even though each slot is multiple characters long. It’s a similar story if you happen to choose only words from the 1,000 most common words, because the dictionary program may be using just those 1,000 words, reducing your values per slot from 300k to 1k.
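The same arithmetic shows how much a narrow category costs you. This sketch assumes the cracker notices when every word comes from the ten number words and otherwise falls back to the full 300,000-word dictionary; the names are hypothetical.

```python
NUMBER_WORDS = {"one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine", "ten"}
FULL_DICTIONARY_SIZE = 300_000  # assumed size of the general word list

def values_per_slot(words):
    """Use the narrow category when every word belongs to it."""
    if all(word in NUMBER_WORDS for word in words):
        return len(NUMBER_WORDS)
    return FULL_DICTIONARY_SIZE

def possibilities(password):
    words = password.split()
    return values_per_slot(words) ** len(words)

numbers = "five three two nine six four eight ten three two"
print(f"{len(numbers)} chars, {possibilities(numbers):,} possibilities")

# Four words from the 1,000 most common vs. four from the full dictionary.
print(f"{1_000 ** 4:,} vs {300_000 ** 4:,}")
```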

Lastly, and this should be obvious: once a random assortment of words or characters goes on the internet or becomes famous, it is effectively the same as a single word or phrase. If you see an example of a secure password on the internet (here you go: “D&hjd6G44@#46″;}{neh*(Jeheg$#@EfTGTgSYhs”), it automatically ceases to be secure, because some programs build their dictionaries from the internet. So you can’t use “fluffy is puffy” or “correct horse battery staple” anymore. And really, you can’t use any password that turns up any results when you Google it (in quotes).

This is just one small part of password security, especially compared to problems like people reusing passwords on more than one site. But if it’s learned correctly, it can help with that problem too, by making passwords easier to remember and so encouraging people to create a unique password for each site they visit.