Passwords Are Terrible (Surprising No One)

This is the result of a security audit:

More than a fifth of the passwords protecting network accounts at the US Department of the Interior—including Password1234, Password1234!, and ChangeItN0w!—were weak enough to be cracked using standard methods, a recently published security audit of the agency found.

[…]

The results weren’t encouraging. In all, the auditors cracked 18,174 (or 21 percent) of the 85,944 cryptographic hashes they tested; 288 of the affected accounts had elevated privileges, and 362 of them belonged to senior government employees. In the first 90 minutes of testing, auditors cracked the hashes for 16 percent of the department’s user accounts.

The audit uncovered another security weakness: the failure to consistently implement multi-factor authentication (MFA). The failure extended to 25 (or 89 percent) of 28 high-value assets (HVAs), which, when breached, have the potential to severely impact agency operations.

Original story:

To make their point, the watchdog spent less than $15,000 on building a password-cracking rig (a setup of a high-performance computer or several chained together) with the computing power designed to take on complex mathematical tasks, like recovering hashed passwords. Within the first 90 minutes, the watchdog was able to recover nearly 14,000 employee passwords, or about 16% of all department accounts, including passwords like ‘Polar_bear65’ and ‘Nationalparks2014!’.

Posted on February 1, 2023 at 7:08 AM • 85 Comments

Comments

Ted February 1, 2023 8:02 AM

People Are Terrible (Surprising Bruce Schneier)

Passwords, combined with 2FA, are great as long as you actually choose strong passwords. People will always be the weakest link in security, which is why phishing and other social engineering attacks are so successful. It doesn’t matter if you replace passwords with something else. Need to click a notification on your phone to login? People will simply be tricked into clicking the little notification thereby giving the hacker access to their accounts.

jbmartin6 February 1, 2023 8:06 AM

16% crack rate on dumped NTLM hashes is fairly normal. According to the article, the point of the exercise was merely to counter claims by the DOI that it would take 100+ years to crack any of the hashes because of their password policy.

bert February 1, 2023 8:15 AM

@Ted (the first one):

Need to click a notification on your phone to login? People will simply be tricked into clicking the little notification thereby giving the hacker access to their accounts.

This is not true. Such an attack would be exponentially more difficult because today’s phone OSes are heavily sandboxed and you can’t just send a “notification to log in” if the user’s using passkeys.

Anonymous February 1, 2023 8:17 AM

Worrying if they can break Polar_bear65
That’s 12 characters with an underscore.
Either the stringing dictionary words attack approach is more flexible and effective than I would expect or they have a very big hash table.

Alan Kaminsky February 1, 2023 8:22 AM

In all, the auditors cracked 18,174 (or 21 percent) of the 85,944 cryptographic hashes they tested

In other words, the auditors failed to crack 67,770—or 79 percent—of the hashes they tested. Over three-quarters of the passwords were too difficult to crack with a dictionary of “over 1.5 billion words”. A goodly percentage of the accounts would appear to be using strong passwords.

Bob Easton February 1, 2023 8:27 AM

Hmmmm… an easily found “Password Strength Meter” says it will take 5 years to break ‘Polar_bear65’ while this article states less than 90 minutes. Something doesn’t compute.

Ted February 1, 2023 8:33 AM

@bert

This is not true. Such an attack would be exponentially more difficult because today’s phone OSes are heavily sandboxed and you can’t just send a “notification to log in” if the user’s using passkeys.

I was referring to so called “passwordless authentication”. I’ve seen a small number of websites that you can login to simply by clicking a notification on another device, or by typing in a short code you received via text or email. In other cases it may require a biometric factor, such as scanning your face or fingerprint. Point is, people will still be tricked into giving access to hackers, and it won’t require a password.

PHP February 1, 2023 8:49 AM

A standard old Nvidia 1080 graphics card can bruteforce NTLM hashes at a rate that allows me to try all possible upper/lower/numeric in less than an hour. Newer cards are way faster.

Now, when you have tried that, hashcat supports rulesets, dictionaries, etc. Using that with a few dictionaries and rules describing different word separators, numeric postfixes, etc., you can quickly get a bit further. And keep adding the found passwords to the new wordlist.
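A minimal sketch of the dictionary-plus-rules idea described above, to show why a password like ‘Polar_bear65’ falls quickly. This is not hashcat and does not use its rule syntax. The function names (`fingerprint`, `candidates`, `crack`), the toy wordlist, and the use of SHA-256 instead of NTLM are all assumptions made so that the example runs on a stock Python install.

```python
import hashlib
from itertools import product


def fingerprint(password):
    # Stand-in for the target hash. SHA-256 is an assumption for portability;
    # the audit discussed in the thread involved NTLM hashes.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def candidates(words, separators=("", "_", "-", " "), suffix_digits=2):
    """Yield guesses shaped like: word + separator + word + numeric postfix."""
    # Rule 1: try each word in lower case and with an initial capital.
    cased = [w for word in words for w in (word.lower(), word.capitalize())]
    # Rule 2: append every zero-padded number with the given digit count.
    suffixes = [str(n).zfill(suffix_digits) for n in range(10 ** suffix_digits)]
    for first, sep, second, suffix in product(cased, separators, cased, suffixes):
        yield f"{first}{sep}{second}{suffix}"


def crack(target_hash, words):
    """Return (password, guesses_tried); password is None if nothing matched."""
    tried = 0
    for guess in candidates(words):
        tried += 1
        if fingerprint(guess) == target_hash:
            return guess, tried
    return None, tried


if __name__ == "__main__":
    wordlist = ["polar", "bear", "national", "parks"]  # toy dictionary
    found, tried = crack(fingerprint("Polar_bear65"), wordlist)
    print(found, "after", tried, "guesses")
```

The search space here is (2 x words)^2 x separators x 100, so even a dictionary of many thousands of words stays tiny compared with brute-forcing 12 arbitrary characters.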

MFA works to some degree. Yubikeys are fine for admins – they ensure end to end validation.

But SMS, Authenticator App, Passwordless etc is pretty bad. Here a real-time Monkey in the middle phisher can do a login on the backside, and get access and renewal tokens that are valid for maybe 30 days – maybe even longer. And the user will not know.

If the hacker is smart and targeted, he runs this in a datacenter close to the victim, so he might not trigger impossible travel, unusual sign-on properties, etc. Thus it will be very hard to detect.

The current phishing runs are redirecting people to some O365 stuff after the phishing, where many users will be signed in with SSO, and thus they never know they visited a hacker page. Currently Microsoft Planner docs are the thing – Phishing and redirect to tasks.microsoft.com.

bert February 1, 2023 8:52 AM

@Ted
I know what you were referring to, that’s why I mentioned passkeys.
You probably know how they work because you’re a commenter on this blog.
Do you really think protecting an account with a password and 2FA is as secure as using passkeys? Passkeys are designed to eliminate the “human factor” in authentication as much as possible, which you (rightfully) claim is the weakest link!

Kent England February 1, 2023 10:08 AM

Steve Gibson made some good points on passwords recently:
1) slow roll credential-stuffing attacks and limit attempts to a reasonable number
2) blacklist cloud networks and bad actor networks
3) use MFA sparingly. use the persistent cookie and limit MFA to once a month or longer

If websites did these things, password cracking and reuse wouldn’t be such an issue.
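A rough sketch of point 1 (slow-rolling and capping login attempts), under stated assumptions: the class name `LoginThrottle`, its parameters, and the doubling delay are illustrative choices, not anything Steve Gibson or a particular product specifies. State is kept in memory per account; a real service would also need per-source limits and shared storage.

```python
import time
from collections import defaultdict


class LoginThrottle:
    """Per-account attempt limiter with a growing delay between tries."""

    def __init__(self, max_attempts=5, window_seconds=900.0, base_delay=1.0):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.base_delay = base_delay
        self._failures = defaultdict(list)  # account -> failure timestamps

    def _recent(self, account, now):
        # Forget failures that have aged out of the window.
        cutoff = now - self.window_seconds
        recent = [t for t in self._failures[account] if t > cutoff]
        self._failures[account] = recent
        return recent

    def seconds_until_allowed(self, account, now=None):
        """Return 0.0 if an attempt may proceed now, otherwise the wait time."""
        now = time.monotonic() if now is None else now
        recent = self._recent(account, now)
        if not recent:
            return 0.0
        if len(recent) >= self.max_attempts:
            # Hard cap: wait until the oldest failure leaves the window.
            return max(0.0, recent[0] + self.window_seconds - now)
        # Slow roll: the required pause doubles with each recent failure.
        delay = self.base_delay * 2 ** (len(recent) - 1)
        return max(0.0, recent[-1] + delay - now)

    def record_failure(self, account, now=None):
        now = time.monotonic() if now is None else now
        self._failures[account].append(now)

    def record_success(self, account):
        self._failures.pop(account, None)
```

A login handler would call `seconds_until_allowed` before checking the password and reject or delay the request when the result is non-zero. Note that a pure per-account cap lets an attacker lock out a legitimate user, which is one reason sites pair it with network-level limits.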

AlanS February 1, 2023 10:15 AM

Aren’t the Feds requiring agencies and contractors to use FIDO2 hardware security keys to mitigate against weak passwords and weak 2FA? See Zero Trust. And Google, Microsoft and Apple appear to have started to push use of FIDO2 passkeys for the same reason.

Clive Robinson February 1, 2023 10:36 AM

@ Bruce, ALL,

Am I the only one to recognise that we have a “passwords are bad” story at least once a year if not more frequently?…

I think it’s safe to say that the problem is not the technology –although it is mostly bad– but the usual “Nut behind the Wheel” of a human, be they a user or administrator.

There is a reason why bank PINs are only four digits… And yup people still forget them…

There are times when I think the XKCD $5 wrench cartoon[1] should be re-done… And people have their password tattooed in the scalp by repeated application of the wrench 😉

[1] I went to DuckDuckGo to search for the XKCD link and used the usual,

[XKCD $5 wrench attack]

As the search string. This time, rather than cough up the required link, it pulled up lots and lots of stories from “crypto-bros” who have been grabbed and “$5 Wrenched” to get their crypto-wallet password…

Ted February 1, 2023 10:36 AM

@AlanS

Aren’t the Feds requiring agencies and contractors to use FIDO2 hardware security keys to mitigate against weak passwords and weak 2FA?

I’m trying to figure out how prevalent PIV cards are within the DOI. This was from the IG report (still reading):

The most common MFA method the Department has implemented within the AD is a PIV card issued to all employees, which combines a digital certificate contained on the card and a PIN by each employee. When this MFA method is properly implemented, employees do not need to use their AD password.

Winter February 1, 2023 10:37 AM

@Bob Easton

an easily found “Password Strength Meter” says it will take 5 years to break ‘Polar_bear65’ while this article states less than 90 minutes.

Looked up a password strength meter and it said ‘Polar_bear65’ corresponds to 6 x 10^14 guesses (weak) ~ 50 bit strength. That is weak.

Tom February 1, 2023 10:57 AM

I’ve always wanted to test my master password but have never trusted the “password strength test” sites to not be harvesting. Am I wrong?

yet another bruce February 1, 2023 11:39 AM

Variability from department to department and between staff and management was interesting. Kudos to OIG. Interior Business Center seems like a squishy target with lots of potential exposure for client departments. I wonder why it is so notably bad.

AlanS February 1, 2023 11:49 AM

@Ted

I have no idea how prevalent their use in federal agencies is. They also appear to have some issues. See Krebs. I suspect the Feds requiring a security control and its implementation may be eons apart.

To add to the earlier post, passkeys appear to be a quicker route to more widespread adoption of WebAuthn in the consumer space, but because they can be backed up and copied they appear to have some obvious vulnerabilities compared to hardware keys, where the private key remains on the key.

JonKnowsNothing February 1, 2023 11:52 AM

@Clive, All

It’s more than a problem of “bad password” selection, it’s inconsistencies in applications across everything on the internet.

Many sites will no longer take 4 digits but they will take 6.

Some require 2FA and send a msg to “select your destination”.

Some require an Authenticator. Good, bad, or indifferent, if you want to access that site you gotta use the Authenticator they designate.

As Clive pointed out, the more complex the password, the less ability people have in remembering them, so they opt for a Password Manager to remember for them. This password manager is a target so juicy, every LEA and Crook on the planet is phishing for them. Once they get access it’s Game Over.

Then there is a teeny problem with LEAs Everywhere. There are now mandatory forms that people have to fill out giving: Name, Rank, Serial Number, Account, Password, Security Token ID, Alias Names, among other bits of information. The failure to cough up these details and every detail demanded results in Unpleasant Things Happening To You and Your Family. The reason LEAs demand you unlock your phone, when they scream their demand to do so, is that they know your auto-login credentials are On The Phone. Secure Passwords, until your fingers are smashed or you are tossed unclothed into a freezer or denied food, water and toilet. Phones in themselves are password managers, along with the apps on them.

If your social media account password is really complex enough that you cannot remember it, you aren’t going to have any working flexible digits left.

  • SWAG: People (USA) probably have upwards of 100 services requiring password access.

Some services get discontinued and restarted, others may remain constantly active. Some are gatekeeper passwords that open other password protected systems (game hubs, AppStore). So it might be a much larger count of all the passwords people are managing.

People often limit the discussion of passwords to the internet interface, but if you need to Start-Stop-Change Utility Services you may need a verbal “identifier code” (aka secret question) when talking over a phone line.

A Low Strength Password is better than Strong Strength Password if you want to keep your fingers intact.

Gilberto February 1, 2023 11:58 AM

@Ted,

I was referring to so called “passwordless authentication”. I’ve seen a small number of websites that you can login to simply by clicking a notification on another device, or by typing in a short code you received via text or email.

It’s actually a large number of sites, because the “forgot my password” button often does exactly that. I worked with someone who, every time they were presenting a meeting via Webex, would go to that site, click the “forgot” link, and log in via the resulting email.

a digital certificate contained on the card and a PIN by each employee. When this MFA method is properly implemented, employees do not need to use their AD password.

Yeah, I thought the Feds were using smartcards, but it seems the Common Access Card is just for the military and a select few agencies (including the Public Health Service and the NOAA).

The last sentence about the AD password worries me a bit. Why make users have a password if they’re not gonna use it? A password that’s needed, but only rarely, is going to be weak, forgotten, written down, or reused from elsewhere.

Winter February 1, 2023 12:00 PM

@Tom

I’ve always wanted to test my master password but have never trusted the “password strength test” sites to not be harvesting.

There are 10 types of “password strength test” sites, those that work in the cloud, and those that work in the page (JavaScript).

The former will harvest passwords, at least to feed their password strength algorithm. The latter claim they do not send any passwords over the internet. You can check that by downloading the page and running it without an internet connection.

An easier way to test your master password securely is to create an equally long password with the same structure and comparable words. That should give you a good estimate of your password strength without divulging your real password.

EvilKiru February 1, 2023 12:37 PM

@Tom: I think it’s safe to assume that any password you check will end up in a rainbow table used for password cracking.

@yet another bruce: It’s probably not notably worse than anywhere else where humans are forced to come up with their own passwords.

lurker February 1, 2023 12:42 PM

@Ted (whichever)
“People will simply be tricked into clicking the little notification…”

I’ve often wondered just what the dickens does happen with all those GDPR cookie notices, how easy would it be to spoof one, and how come too many don’t [Save my Preferences].

@Kent England
“slow roll credential-stuffing attacks and limit attempts to a reasonable number”

That is so easy to do there can be only one reason why it’s not done: sysadmins don’t want the hassle of dealing with the doofuses who can’t read the stickit under their keyboard.

Winter February 1, 2023 12:50 PM

@EvilKiru

I think it’s safe to assume that any password you check will end up in a rainbow table used for password cracking.

There are many JavaScript only password checkers [1], even with code available.

And if a known university assures us their password checker runs on the client computer without internet transactions [2], I trust that. The fallout of someone finding out they lied would be way too serious.

[1] ‘https://www.jqueryscript.net/blog/best-password-strength-checker.html
‘https://www.section.io/engineering-education/password-strength-checker-javascript/

[2] ‘https://www.uic.edu/apps/strong-password/

mark February 1, 2023 12:51 PM

Just one issue: how did such passwords get put in place? Everything I use, from my hosting provider to my partner’s Win 10 box, checks new passwords, and refuses them if they consider them weak. Certainly my Linux box does….

Jordan Brown February 1, 2023 12:56 PM

Password strength meters and their estimates have almost no value, because they assume a particular model for password structure.

Is your password in an existing password leak? You’re dead, no matter how “good” it is.

Is your password built out of enough bits of randomness? You’re probably OK, no matter what characters it has in it.

How strong is “Polar_bear65”? A naive analysis might say that it’s got 12 characters from the four food groups, so it’s got 78 bits, which is really good. But almost nobody really uses a random 12-character full-character-set password, because they are impossible to remember and painful to type. (Password managers excepted.)

So the attacker tries other patterns, like combinations of words.

If the attacker tries a pattern

word punctuation word digit digit

where the words might or might not have an initial capital, then Polar_bear64 has maybe 40 bits of randomness – about 13 for the word, one for initial cap, 5 for the punctuation, 13 for the word, one for the initial cap, 7 for the digits.

Maybe they try a hundred such patterns, so that’s another 6-7 bits, for a total of ~47 bits… enormously less than the naive analysis says.
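The arithmetic above can be rechecked with a short sketch. The per-slot bit counts are the ones given in the comment; the 95-symbol alphabet for the naive estimate (printable ASCII including space) and the names `naive_bits` and `pattern_bits` are assumptions for illustration.

```python
import math

# Per-slot estimates (in bits) taken from the comment, for the pattern
#   word punctuation word digit digit
PATTERN_BITS = {
    "first word": 13,
    "first word initial cap": 1,
    "punctuation": 5,
    "second word": 13,
    "second word initial cap": 1,
    "two digits": 7,
}


def naive_bits(length, alphabet_size=95):
    """Bits if every character were drawn uniformly from the alphabet."""
    return length * math.log2(alphabet_size)


def pattern_bits(slots, patterns_tried=1):
    """Bits for one pattern, plus the cost of not knowing which pattern applies."""
    return sum(slots.values()) + math.log2(patterns_tried)


if __name__ == "__main__":
    print(f"naive, 12 random characters: {naive_bits(12):.1f} bits")
    print(f"one known pattern:           {pattern_bits(PATTERN_BITS):.1f} bits")
    print(f"one of 100 patterns:         {pattern_bits(PATTERN_BITS, 100):.1f} bits")
```

The gap between the first line and the other two is the whole point: the attacker pays for the structure the user actually chose, not for the characters that could have been chosen.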

Then you have the naive analyses that say that “caps spin gust lend vary wins duel” is awful because it doesn’t have the four food groups… when it really has 70 bits of randomness.

Create your passwords randomly and ensure that they have enough bits of randomness. (Is 50 enough? Against a national security agency that wants to get you in particular? No. Against anybody else? Maybe. Also: will it be cheaper for them to break into your house and install a keylogger?)

Clive Robinson February 1, 2023 1:15 PM

@ Tom, ALL,

Re : Password Checkers

“I’ve always wanted to test my master password but have never trusted the “password strength test” sites to not be harvesting.”

Even if they don’t test they will give you a falsely high value of its strength.

lurker February 1, 2023 2:21 PM

@Jeff M

So instead of merely cracking the SHA-1 to get your password, the bad guy has to crack 475 SHA-1s and run them through his credential stuffer. I guess that’s an improvement.

Clive Robinson February 1, 2023 2:31 PM

@ Jordan Brown, ALL,

Re : Admin Rules reduce strength.

“… is awful because it doesn’t have the four food groups”

Each group reduces the “potential” password strength. I know this surprises many people but it’s true and I’ve had a number of arguments over the years.

I’m glad to see others can look straight at the chalk board 😉

@ ALL,

Let’s take a simple eight-character password, where the entire character set is

Punctuation 30
Uppercase 26
Lowercase 26
Digits 10

For a total of 92

So 92^8 = 5.1322 x 10^15 or ~52 bits

Now, let’s put in a “must contain a digit” rule.

So 92^7 x 10^1 = 5.5785 x 10^14 or ~49 bits

You can keep going by insisting on those other groups, but the maximum strength the password can have goes down.

So why have the rules?

Because your average human will use one five or six letter word and a couple or three digits.

Which looks like,

So 26^6 x 10^2 = 3.0892 x 10^10 or ~35 bits

Or 26^5 x 10^3 = 1.1881 x 10^10 or ~33 bits

But that’s random letters, not words. So ask the question how many five-letter words are there in the average human’s “in their head dictionary”, and the answer is going to be less than five hundred or so on average.

So 500 x 10^3 = 5.0 x 10^5 or under 19 bits.

With one other twist: when you have to use a punctuation character, most people will go for a hyphen or an underscore…

So Polar_bear64 is actually very weak. As it’s actually a single word in most people’s heads, so one of around 8000; also 64 is a square number, of which there are only 10 in zero to one hundred. And the underscore only adds a bit,

So 8000 x 2 x 10 = 160000 or a little over 17 bits…

Which is not very much entropy at all, which is why such passwords get broken easily. But they are about as far as the average human can remember…
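For anyone who wants to recheck the search-space figures in this comment, a small sketch follows. The counts are the ones stated above; the labels and the helper name `bits` are mine, and the bit values are simply log2 of each count.

```python
import math


def bits(search_space):
    """log2 of the number of equally likely candidates."""
    return math.log2(search_space)


# Search spaces as described in the comment above.
CASES = {
    "8 chars from all 92 symbols": 92 ** 8,
    "7 from 92, then one digit in a fixed slot": 92 ** 7 * 10,
    "6 random lowercase letters + 2 digits": 26 ** 6 * 10 ** 2,
    "5 random lowercase letters + 3 digits": 26 ** 5 * 10 ** 3,
    "1 of 500 known words + 3 digits": 500 * 10 ** 3,
    "1 of 8000 phrases, 2 separators, 10 squares": 8000 * 2 * 10,
}

if __name__ == "__main__":
    for label, space in CASES.items():
        print(f"{label:<45} {space:>22,} {bits(space):6.1f} bits")
```

Each extra bit doubles the attacker's work, so the drop from the first row to the last is a factor of tens of billions.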

Oh and remember Admins are lazy, the username often contains the user’s initials and two or three of the first letters of the surname. If the user, as so many non-technical users do, uses their own name in their password…

Kind of like “shooting fish in a barrel” but without any water in the barrel…

Jordan Brown February 1, 2023 3:05 PM

So 92^7 x 10^1 = 5.5785 x 10^14 or ~49 bits

That’s if the digit is forced to be at the end (or any other fixed position).

You get three more bits because it can be in any of eight positions.

But yes.
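The fixed-position versus any-position point can be checked by counting exactly. This sketch assumes the same 92-symbol alphabet and eight-character length as the quoted calculation; the variable names are illustrative. The exact count of passwords containing at least one digit is all passwords minus those with no digit at all.

```python
import math

ALPHABET = 92  # total symbols, as in the calculation being quoted
DIGITS = 10
LENGTH = 8

unrestricted = ALPHABET ** LENGTH
# Rule read as "the last character must be a digit".
digit_in_fixed_slot = ALPHABET ** (LENGTH - 1) * DIGITS
# Rule read as "at least one digit somewhere": everything, minus the
# passwords built only from the 82 non-digit symbols.
at_least_one_digit = unrestricted - (ALPHABET - DIGITS) ** LENGTH

if __name__ == "__main__":
    for label, count in [
        ("no rule", unrestricted),
        ("digit in a fixed slot", digit_in_fixed_slot),
        ("at least one digit anywhere", at_least_one_digit),
    ]:
        print(f"{label:<28} {math.log2(count):5.2f} bits")
```

Under the "anywhere" reading the rule removes well under one bit from the theoretical maximum, so the real cost of composition rules comes from how predictably people satisfy them, which is the argument made elsewhere in the thread.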

Ted February 1, 2023 3:55 PM

@AlanS

I have no idea how prevalent their use in federal agencies is… I suspect the Feds requiring a security control and its implementation may be eons apart.

Apparently the OCIO wasn’t aware what specific systems were enforcing MFA. They had relied on the different bureaus and offices to self-report. Nobody captured the individual systems.

Per the IG report, using single-factor authentication runs against 18 years of mandates from NIST, DHS, EO’s and the dept’s own policies. (p11)

The IG had 10 recommendations.

The first recommendation was to implement PIV or other approved MFA methods, starting with HVA’s. For this, the DOI provided a target implementation date of December 30, 2024. (p25)

I’m really curious what the hold up has been. Money?

David McClain February 1, 2023 4:58 PM

I don’t understand the continuing use of passwords. The Signal X3DH protocol provides for password-free connections among all parties. All that is needed is a public key from each participant. If these leak out, there are no undesirable consequences. (I hate the tower of passwords we have now, surprising nobody…)

Clive Robinson February 1, 2023 7:18 PM

@ JonKnowsNothing, ALL,

Re : The $5 approach.

“If your social media account password is really complex enough that you cannot remember it, you aren’t going to have any working flexible digits left.”

Back when @NickP was still around, he and I used to discuss things like this.

One thing we came up with was a short life self destructive key that you provably did not know. The real key or password was generated from apparently random data sent to you by three or more parties each in their own jurisdiction outside of the jurisdiction you are in.

The idea was that after a relatively short period the key would be automatically deleted. However you still had to use an unlock key to use it. It sounds complicated but actually it’s relatively simple to draw out.

Since then I’ve thought further about authentication factors. In general we’ve been taught there are three,

1, Something you are (biometric)
2, Something you have (token)
3, Something you know (passphrase)

The first, “standard biometrics” (DNA, fingerprints, other external body shapes, etc.), has always been a very bad idea. But biometrics are loved by a certain authoritarian mindset whose holders have neither the morals nor the ethics to be involved with humans, security, or privacy; they are frequently the baser form of guard labour and see an identity number, not an individual. Put simply, biometrics turn a part of you into a token you have no control over, and that lack of control is what makes biometrics both useless and extraordinarily dangerous for an individual.

The second, “tokens”, comes in two forms, where “control” is the key differentiator. A badge, card key, or car key lacks control: it will work for anyone who holds it. Most are now realising that this is not the best way to go about things, luxury car keys being a very real problem in this respect. Yes, they are convenient, but convenient for whom is the question of importance. To limit such usage, some tokens come with control, the simplest being a panic or deactivate button that, once pressed, stops the token from functioning until properly reset. Others have keypads or similar on which you type unlock or other usage codes.

Some tokens are of a level of sophistication that they become in effect a remote interface for the third type of authentication.

Something you know is in effect the only method of authentication that is deniable and, more interestingly, can be augmented in ways that make it quite secure via the likes of duress codes, retry lockouts, timeouts and similar. Once a system is deactivated, if designed properly, it becomes useless.

However the main assumption with “something you know” is that it is effectively a pass phrase or password, and works at any time.

After thinking on this I realised that something you know does not have to be some complicated string of characters. With today’s equipment it can include a time and a place.

That is, the control function, such as say “unlock”, can only be done at a certain time, place, direction, altitude, or some combination thereof. These can also change with time, from say just behind your front door in New York to being at the top of the Eiffel Tower in Paris facing south. Further, you can chain them such that you have to visit places known only to you in France, Belgium, Holland, etc. in say less than twelve hours without going above 1000ft at any time.

Thus you can “force a stalemate” against any “authoritarian” “Might is right” type.

Clive Robinson February 1, 2023 7:37 PM

@ Jordan Brown, ALL,

Re : positions of characters.

“You get three more bits because it can be in any of eight positions.”

In theory yes, for the first rule but each one goes down so the first is 1:8 the second 1:7 the third 1:6 and so on.

But double digits are not often spread apart unless it’s simple to remember, like ‘6Polar_bear4’. You also need to remember that humans, being what they are, are not likely to even do ‘Po64lar_bear’ by choice. Overwhelmingly they would do ‘64Polar_bear’ or ‘Polar_bear64’, bringing it down to just a 1-bit choice.

We have to accept that it does not matter if you call it laziness or convenience, most humans will do what is easiest for them, which carves bits of security “Faster than a mad clothes peg whittler on piece rates”.

JonKnowsNothing February 1, 2023 10:09 PM

@All

While considering the problems with the passwords, pass codes, a small dimple or pimple is the HIDE_ME feature on the password input line which comes in a few variations.

  • There is Open_EYE, which will let you verify the entire line before you hit ENTER-SUBMIT and kill one of 3 attempts before your account gets a lockout.
  • There is the ‘*’ replacement, which hides each typed character just moments after you type it. There is no Open_EYE to show the entire line, which gets replaced by ‘***’ as you go.
  • Some sites do not let you Paste into the password box, so you can’t enter it from a good source. Plus there is the problem of a hidden EOL char that can get carried along.

Not only do we want people to be able to remember gibberish as a password, we want them to be perfect typists too.

Clive Robinson February 1, 2023 11:56 PM

@ pd, All,

“Hardware keys.”

Are in most ways as bad if not worse than bio-metrics.

Hardware keys get used with accounts as,

“One ring to rule them all and in the darkness bind them”

Effectively once LE’s or IC grab your token and force you via various means legal, psychological, or physical to unlock it, every account you own, then belongs to them, to do with as they please (and people wonder why I don’t do social-media).

As I indicated above we need to do a lot further thinking on the three forms of authentication. Especially how “what you know” can be used to protect yourself from the “Might is Right” coercion. Via “Guard Labour” and worse, using various means legal, psychological, or physical to persuade you.

Oh and don’t be surprised if another William Barr type steps up and demands all “Hardware Keys” and physical tokens must have a Law Enforcement back door.

Some of us are old enough to have seen similar behaviour before from the likes of Louis Freeh when Director of the FBI (arguably worse than J. Edgar Hoover when it came to desire for the surveillance of everybody).

Anonymous2 February 2, 2023 4:07 AM

@Anonymous
Dictionary attacks are very powerful, especially if you combine them with popular methods of creating simple word-based passwords (for example one or two words and a number) and make the result into a hash table. The password “Polar_bear65” is only 2 words (“Polar”, “bear”), a number (“65”) and a special character as a space substitute (“_”). There are only about 170,000 English words in use today (you can further refine the list by removing uncommon words). If you use words in a password, think of each whole word as if it were a single character of a very long alphabet (340,000 characters long, containing words with and without a first capital letter). For comparison, a 4-character, lower-case alphabetical password has 456,976 possible combinations.

Winter February 2, 2023 9:55 AM

@Anonymous2

If you use words in a password, think of each whole word as if it were a single character of a very long alphabet

tl;dr: Go for long passphrases, not complex passwords.

Indeed. If you convert the 170000 words into bits to guess, you get ~17 bit per word. Capitalization adds 1 bit. So, think of a passphrase of words as giving you 18 guessable bits per word. Adding 2 digits before or after adds another 8 bits. So, “Polar_bear65” = 18+18+8 = 44 bits. There are 3 options to combine these 3 components: “”, ” “, “_” on 2 positions ~ 3 bit extra. All in all you end up with less than 50 bits to guess on the assumption it is two words and a two digit number.

Working this way, you would add 20 bits per word (17 (word) + 1 (capitalization) + 2 (“space”)). To get a 90-bit password you would need 4 words + a two-digit number.

The above is overly optimistic as you will be unable to make a good selection from 170000 words. More likely, you will be hard pressed to even select from 4000 words ~ 12 bits. So, in that case, you will need ~5 words and a two digit number to reach 90 bits password strength.

To summarize, make a long passphrase, e.g. 8+ words.
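The estimate in this comment can be turned into a small calculator. The per-word extras (1 bit for capitalization, 2 for the separator) and the 8 bits for a two-digit number are the rounded figures used above; the function name `passphrase_bits` and its parameters are illustrative assumptions.

```python
import math


def passphrase_bits(words, dictionary_size, extra_bits_per_word=3.0, suffix_bits=8.0):
    """Estimated guessable bits for `words` words plus a two-digit number.

    extra_bits_per_word covers capitalization (1) and separator (2), and
    suffix_bits covers the two-digit number, as rounded in the comment above.
    """
    per_word = math.log2(dictionary_size) + extra_bits_per_word
    return words * per_word + suffix_bits


if __name__ == "__main__":
    for size in (170_000, 4_000):
        print(f"dictionary of {size:,} words")
        for n in range(3, 9):
            print(f"  {n} words: {passphrase_bits(n, size):5.1f} bits")
```

With the smaller, more realistic vocabulary each word contributes only about 15 bits, so the totals are sensitive to rounding and an extra word is cheap insurance.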

Emoya February 2, 2023 12:12 PM

@Winter, Anonymous2

Don’t forget that bit security assumes drawing randomly from a set. Polar and bear are very closely related in their everyday use because combined they refer to a single object/idea, and as such, are more likely to be used together. If it were me, I would almost consider it as one word.

Winter February 2, 2023 12:39 PM

@Emoya

If it were me, I would almost consider it as one word.

I think it is one word. You are right about the rest too.

To get realistic estimates, a language model is needed too, that gives probabilities of word sequences. ChatGPT is such a model. But I did not want to complicate things too much.

In reality, it boils down to an arms race. To me, the solution is “In case of doubt, add a word to your passphrase”.

Emoya February 2, 2023 1:01 PM

@Clive, pd, All

Misuse of authority aside, another thing that scares me about hardware keys is account recovery in the event of loss/theft. IIRC, a few months back Bruce posted a thought experiment in which all avenues of authentication were lost in a house fire, leaving no recovery options.

Also, as Clive pointed out, biometrics are essentially tokens that cannot, under normal circumstances, be lost. However, “under normal circumstances” does not mean impossible, improbable, or even unlikely. Some people are exposed daily to conditions that could cause them to lose or alter one or more biometric attributes.

Anything that physically exists could potentially be lost, stolen, or replicated. The only option safe from these vulnerabilities is something that does not exist physically.

What it boils down to is cheese, swiss to be specific. Every identified form of authentication has its strengths and weaknesses, covering holes left by the others. Eliminating any single factor from a layered system weakens the overall system. The problem with many so-called MFA solutions today is that they are not truly layered, but rather expose more vulnerabilities by offering more points of compromise.

Ideally, forms of authentication should exist only within the control of the owner, not the service. This forces every attack to be against a single individual, minimizing its scope while maximizing its cost.

This can be achieved for passwords, etc. through zero-knowledge proofs.

FIDO/passkeys have done a decent job of realizing this for devices/tokens, but the assumption that any user biometric authentication to the device is legit also requires assuming that the user acted of their own volition and that the device itself is not compromised.

It’s practically impossible for biometrics, as they are inherently a public trait. We leave and/or expose them everywhere we go.

Jordan Brown February 2, 2023 1:23 PM

Dictionary attacks are very powerful.

Yes and no. For small numbers of words, yes. For larger numbers, no, they are no more powerful than knowing that there are 95 printable ASCII characters – a dictionary attack on ASCII.

My preferred passwords these days are pass phrases picked randomly from a collection of 1024 common four-letter words. That’s a nice tidy ten bits per word. I usually separate the words by a space if the service will let me. (Sigh, some won’t.)

bone dogs only show acts base

is easy to type and not awful to remember, and has 60 bits of entropy.
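A minimal sketch of that procedure, assuming a 1024-entry list and independent, uniform picks. The `wordlist` here is a placeholder of made-up four-character strings rather than a real list of common words, and the function names are illustrative only.

```python
import math
import secrets

def make_passphrase(wordlist, n_words=6):
    """Pick n_words independently and uniformly at random (with replacement)."""
    return " ".join(secrets.choice(wordlist) for _ in range(n_words))

def passphrase_bits(list_size, n_words):
    """Entropy in bits when every word is an independent uniform pick."""
    return n_words * math.log2(list_size)

# Placeholder list: 1024 distinct four-character strings standing in for real words.
wordlist = [f"w{i:03x}" for i in range(1024)]

phrase = make_passphrase(wordlist)
bits = passphrase_bits(len(wordlist), 6)  # 6 words * 10 bits per word
```

The figure only holds if the generator's output is used as given; it says nothing about a phrase a person picked or edited by hand.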

Of course, some services demand that I add an upper-case letter, a digit, and punctuation. Those don’t really add value, but don’t hurt much.

The worst are the ones that say “that’s too long”.

Alain February 2, 2023 3:13 PM

Maybe I’m naïve, but are hashes not “smart”?

aka

using an organisation seed
+
an extra personal seed
+
a compute intensive hash calculation

This makes dictionary attacks organisational and personal and intensive to compute.

–> for every person the dictionary has to be recomputed.
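The three ingredients listed above correspond roughly to what is usually called a pepper, a per-user salt, and a slow key-derivation function. A rough sketch using PBKDF2 from Python's standard library follows; the `ORG_SECRET` value, the iteration count, and the function names are assumptions for illustration, not a recommendation for any particular system.

```python
import hashlib
import hmac
import os

# Hypothetical organisation-wide seed ("pepper"); in practice kept outside the password file.
ORG_SECRET = b"example-organisation-secret"

def hash_password(password, personal_salt, iterations=200_000):
    """Slow hash combining the organisation seed, a per-user salt and the password."""
    salt = ORG_SECRET + personal_salt
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

def verify_password(password, personal_salt, stored, iterations=200_000):
    """Recompute the hash and compare in constant time."""
    candidate = hash_password(password, personal_salt, iterations)
    return hmac.compare_digest(candidate, stored)

# Each user gets their own random salt, so a precomputed table for one user is useless for another.
alice_salt = os.urandom(16)
bob_salt = os.urandom(16)
alice_hash = hash_password("Password1234", alice_salt)
bob_hash = hash_password("Password1234", bob_salt)
```

Two users with the same weak password end up with different hashes, so each guess has to be recomputed per user, at the cost of the full iteration count each time. It slows a dictionary attack down but does not rescue a password that is near the top of the attacker's list.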

lurker February 2, 2023 4:01 PM

@Jordan Brown, “common four-letter words”

A certain well-known organisation once had a purge and for some specious reason insisted I must change my password. I used three nsfw four letter words, only three because they were also insisting on max. length of 14 chars. They haven’t complained since …

Clive Robinson February 2, 2023 7:27 PM

“is easy to type and not awful to remember, and has 60 bits of entropy.”

Actually it’s less than 60 bits.

Because the sentence is effectively plaintext language.

To see what I am saying, let’s use a numerical analogue,

Take all the numbers from 00 to 99 and apply the following “human trait” analogues,

1, Remove all duplicate-digit numbers (00, 11, etc).
2, Order the digits from low to high (so 91, 72, etc become 19, 27, etc).
3, Remove all duplicate numbers.

You are now down to less than half the numbers, so you’ve lost more than a bit… There are other human traits you could still apply, like removing,

1, primes
2, multiples
3, squared numbers
4, even numbers

You could quickly end up with just ten or so numbers from the original hundred. As an attacker you would try those numbers first as it would significantly shorten the brute force search time.
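The first three steps can be checked directly. This sketch covers only those three rules (repeated digits, digit ordering, de-duplication); the later filters are left out because the text does not pin down exactly how they would be applied.

```python
import math

# All two-digit strings from "00" to "99".
numbers = [f"{n:02d}" for n in range(100)]

# Step 1: drop numbers whose two digits are the same (00, 11, ...).
no_repeats = [s for s in numbers if s[0] != s[1]]

# Step 2: order the digits from low to high (91 -> 19, 72 -> 27).
ordered = ["".join(sorted(s)) for s in no_repeats]

# Step 3: remove the duplicates created by step 2.
unique = sorted(set(ordered))

bits_before = math.log2(len(numbers))
bits_after = math.log2(len(unique))
bits_lost = bits_before - bits_after
```

Step 1 leaves 90 numbers and steps 2 and 3 halve that to 45, so `bits_lost` comes out a little above one bit, which matches the claim.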

The odds of just four randomly selected words from a list of a thousand words making sense are extraordinarily low. But a human will either “click again” until they do, or rearrange them to make some kind of sense out of them.

When you consider that you can almost hear the bits falling like large rain drops…

An attacker will go down the “human trait” paths first, and so gains from the bits lost to reordering and deduplicating words, a loss that is going to be way larger than the bit loss on two-digit numbers.

In effect it’s no longer a “brute force” attack but an assumed “known plaintext” attack, with humans always selecting the weak keys…

JonKnowsNothing February 2, 2023 8:34 PM

@Clive, All

At the risk of sounding dumb…

When considering Alphanumeric Text in Any Language, regardless of the combinations of letters and numbers etc, isn’t it all converted into binary on entry? On paper it looks fine but it’s all 10110s when it hits the registers. So there aren’t any numbers or alphas at all.

Alphas and numbers are framing constructs. You can start and stop in any place in the binary stream. The resulting output may not be intelligible if the frame is offset but there aren’t any definitive delimiters.

ex: Binary, Octal, Hex

===

Search Terms

Positional notation
Positional systems in detail

Computer number format is the internal representation of numeric values in digital device hardware and software

Jordan Brown February 2, 2023 8:56 PM

Actually it’s less than 60 bits.

Um, no, it’s exactly 60 bits. I didn’t pick those words by hand. They were each randomly chosen from a list of 1024 words, so they each represent 10 bits of entropy.

Six four letter words chosen by a human probably don’t have 60 bits of entropy. Six words chosen randomly from a list of 1024 words do.

Here are two more examples…
wade area seat bits give slim
fame pity tips kick when slid

Garabaldi February 3, 2023 2:14 AM

With the possible exception of Alain you are all assuming that there is nothing wrong with leaking inadequately hashed passwords. If sysadmins cannot secure the single most important file on their system there is basically no hope.

This post really should be retitled “Once again ‘security professionals’ find their jobs are too hard, delegate their responsibilities to amateurs, and are shocked that does not work so well.”

Clive Robinson February 3, 2023 4:04 AM

@ Jordan Brown,

“Um, no, it’s exactly 60 bits”

Sorry, you are making wrong assumptions, and the reasoning behind why you are wrong has been known for oh at least one and a half millennia[1] if not a lot more.

The entropy that is important is not the sum of the components of the password as you imply; that is actually almost irrelevant and is at best some kind of “high-water mark” below which reality hangs.

The entropy that is important is that of the resulting password usage, and its entropy can be a lot, lot lower than you think, with the actual low-water mark difficult to quantify.

The only way the passwords can have the 60 bits of entropy you claim to an attacker is if they all have equal probability of being used and, importantly, there are no weak passwords that the attacker searches for first.

The minute you start taking passwords out of contention for use the entropy drops down. You can thus calculate an approximate “low-water mark” based on human traits.

To understand this ask the question,

“What entropy does each letter of the alphabet have based on usage in the English language?”

Because the usage is not 1/26 or 3.85% for each letter. E for instance is up above 13.5% and Z down around 0.05% depending on which collection of texts was used.
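To make the per-letter question concrete, here is a small sketch that measures the Shannon entropy of the letters in whatever text it is given. No frequency table is built in, so the result depends entirely on the sample supplied; `letter_entropy` is just an illustrative name.

```python
import math
from collections import Counter

def letter_entropy(text):
    """Shannon entropy, in bits per letter, of the a-z letters found in text."""
    letters = [c for c in text.lower() if "a" <= c <= "z"]
    counts = Counter(letters)
    total = len(letters)
    return -sum((k / total) * math.log2(k / total) for k in counts.values())

# If all 26 letters were equally likely, each would carry log2(26) bits.
uniform_bits = math.log2(26)

# A perfectly even sample hits that ceiling; any real text with uneven usage falls below it.
even_sample = "abcdefghijklmnopqrstuvwxyz"
skewed_sample = "the cat sat on the mat and the owl sat on the cat"
even_bits = letter_entropy(even_sample)
skewed_bits = letter_entropy(skewed_sample)
```

The uniform figure is about 4.7 bits per letter; a sample dominated by a few letters scores lower, which is the gap an attacker exploits when the letters come from language rather than from a random source.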

You can look up “A SIN TO ER” or better ordered “EAT ON IRISH LID” and find out that letter frequency is based on how they are used in words, how words are used in sentences and so on up to pages, works and libraries of written text.

Ask yourself the question,

How many four letter words in a common dictionary?

Is it 456976 or 26^4 (a little under 19 bits of entropy)?

No of course not, it’s just a few hundred giving you maybe 9 or 10 bits of entropy.

Do you understand the difference between the two?

Because as I said the “human trait” is to make the output of the generator more memorable, and we’ve known since the 1960s if not earlier that the majority of users cannot remember “random”.

So en masse the average human would reject XQHZ and happily accept DAWN, and if given JOMP probably quietly change it to JUMP, likewise change DWON to DOWN.

Thus reducing the entropy of the actual generator from 19 bits down to maybe 9 bits.

From an attacker’s point of view they care not a jot about the high-water mark; they are looking for low-water marks in maybe a hundred thousand passwords in a file. Because finding just one gets them “across the threshold” into the system to carry out the next stage of their attack.

And please do not say,

“If they have the file they are already in the system.”

That is probably the lamest falsehood floating around ICTsec.

For instance back in the early days of “personal web servers” people all too frequently made a setup mistake (which still happens), where a simple URL would send back any file on the system that matched the URL. The problem is that if you don’t know the name and path of the file, it’s not going to get you anything other than an error message. However, if you want a standard file, all systems of the same OS put the password file or its equivalent in the same place, and the OS identifier used to be given out by the web server. Even a script-kiddy could learn all that in an afternoon, and back last century that’s what they did. And yes, even back then password files were exchangeable commodities with value.

The password file would also give the “username” in plaintext, which again usually suffers from “human trait” failings, and thus can act as a weak identifier across multiple password files.

[1] There is documentary evidence of cryptographic attacks on written codes from the Arab mathematician Al-Kindi (801-873), but it is likely letter frequency counting was known long before that due to some religious practices that associated meaning with values of letters (it’s where the “666” sign of the devil in the Bible of two millennia ago comes from in one belief system, though it is the sign of good fortune in another going back maybe four millennia).

Winter February 3, 2023 4:59 AM

@Clive

The minute you start taking passwords out of contention for use the entropy drops down. You can thus calculate an approximate “low-water mark” based on human traits.

I think you missed the important part:
@Jordan Brown

I didn’t pick those words by hand. They were each randomly chosen from a list of 1024 words, so they each represent 10 bits of entropy.

By the definition of entropy=uncertainty (the relevant definition here) this is exactly 10 bits of uncertainty per word. So, 6 words is 60 bits of entropy.

Character probabilities have no role to play here. The words could be replaced by numbers 0001-1024, or hex, or 0/1 bits, or 1024 city names in Chinese characters; that would not matter at all. At least 60 bits of information is needed to reconstruct (guess) the 6 word string, even when knowing everything there is to know about the 1024 word list and the procedure used.

Canis familiaris February 3, 2023 8:45 AM

It’s all a very interesting discussion.

But one of the things that is often missed is how the system works when it is used by all sections of humanity; not just the technologically capable. And how do you resolve problems like losing an authentication device for the third time this month, or running it through the washing machine, or dropping it in the toilet. Just saying, “Don’t do that.” isn’t really helpful when dealing with the scatterbrained, or the senile.

Also, how do the proposed systems work when there is no Internet connectivity. And/or no mobile network coverage. And/or no GNSS coverage. And/or no electric power.

Or how about someone I know who has no hands. Or someone else who is severely affected by Cerebral Palsy, so cannot type. Or talk. Or people who are blind (possibly by macular degeneration) and can’t see any displays on authentication devices. Yes, they are all exceptions, and life would be simpler for you and all the clever people designing authentication methods if you just ignore the edge-cases – but they are also people, and surprisingly enough, as you age, cognitive decline and physical handicaps become more and more likely, so something that is trivial for you at 20 becomes a significant challenge at 80.

Technology tends to make life easier, but be careful about excluding edge cases, because there’s a good chance you’ll become one.

Clive Robinson February 3, 2023 11:11 AM

@ Winter, Jordan Brown,

“Character probabilities have no role to play here.”

Sorry, you are wrong, as I’ve already explained to @Jordan Brown,

“The entropy that is important is that of the resulting password usage, and its entropy can be a lot, lot lower than you think”

But again a four character alpha only password has 26^4 outputs or 456976 passwords for a little under 19 bits.

I’m assuming you understand that.

Now let’s assume for some reason humans don’t like passwords with Q and Z in them, so reject any that come up by pressing the button on the generator until they get a password without them.

Thus it’s now 24^4 or 331776 or a little over 18 bits, as 125200 passwords will never get used.

So each time a human trait or rule brings down the final password space size, the entropy to the attacker drops.
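The arithmetic in this example is easy to reproduce. The sketch below assumes, as the example does, that every remaining four-letter string stays equally likely once Q and Z are rejected.

```python
import math

full_space = 26 ** 4       # every four-letter string over a-z
reduced_space = 24 ** 4    # the same, with Q and Z never used

never_used = full_space - reduced_space

full_bits = math.log2(full_space)        # a little under 19
reduced_bits = math.log2(reduced_space)  # a little over 18
bits_lost = full_bits - reduced_bits
```

Rejecting two letters out of 26 costs under half a bit across the whole four-letter password; the loss becomes large only when the rules people apply cut the space far more aggressively than this.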

Now if you don’t understand why this is the way it is and why it is a problem I suggest you sit and think about it for a while.

Clive Robinson February 3, 2023 11:52 AM

@ Bruce, ALL,

As the subject of passwords appears to come up every year and certain arguments that are incorrect get made each time, especially over the XKCD method, perhaps you could write an article about it.

@ ALL,

As I’ve repeatedly said, trying to work out the real entropy of a password is difficult, and importantly it’s not the alleged generator entropy that is important but the attacker entropy.

To see why and why people should stop claiming the XKCD method gives X bits of entropy…

If your dictionary has 1024 words in it then “your dictionary” has 10 bits of entropy. However randomly selecting from it four times does not give the resulting password 40 bits of entropy.

As an attacker let’s say my dictionary is a subset of yours and only has 128 words or 7 bits in it. If I randomly draw from it, all the passwords I get will be a subset of all the passwords you generate.

But on the false argument used for the XKCD system the entropy of four words would be 28 bits of entropy.

But… Every password generated by the small dictionary will also be generated by the large dictionary.

So for the same identical password one person claims 40 bits and another claims 28 bits,

So who is correct?

Actually neither, because what if my dictionary only has four words in it? Does the password it generates only have 8 bits of entropy?

The important thing to note is that it’s not your dictionary size that matters, it’s the dictionary size of the attacker, because it is they who have to do a brute force or better attack.
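One way to put numbers on the subset-dictionary scenario is to count how many of the generator's outputs a smaller attacker dictionary can reach at all. This is a sketch under an assumption that is itself contested in this thread: that the defender's four words really are drawn uniformly at random from the 1024-word list and kept as drawn. The function name is illustrative.

```python
import math
from fractions import Fraction

def coverage(attacker_words, defender_words, n_words):
    """Fraction of the defender's possible passphrases built only from the attacker's subset."""
    return Fraction(attacker_words ** n_words, defender_words ** n_words)

# 128-word subset against four words drawn from 1024.
subset_128 = coverage(128, 1024, 4)
subset_128_guesses = 128 ** 4   # size of the attacker's own search space (28 bits)

# Four-word subset against the same generator.
subset_4 = coverage(4, 1024, 4)
subset_4_guesses = 4 ** 4       # 8 bits

bits_128 = math.log2(subset_128_guesses)
bits_4 = math.log2(subset_4_guesses)
```

Under that assumption the smaller dictionary is cheaper to exhaust but only succeeds when the target happens to fall inside it. Where humans steer the output toward a favoured subset, the coverage is no longer this small, which is the case the comment is concerned with.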

Also as they are looking to break a password to get system access, they only have to break the weakest or one of the first passwords in maybe a million entry password list.

But also remember “Rainbow tables” are the equivalent of a simple substitution cipher, and it’s fairly well accepted that you really do not need to have all the substitutions mapped out to break a message, no matter how large the alphabet size is.

So if your password is in the attacker’s 1024 entry rainbow table then it’s got less than 10 bits of entropy from the attacker’s perspective. If it’s somebody else’s password then it does not matter a jot how many bits of entropy you think your password has; it’s irrelevant because they are “already in like Flynn” and can potentially replace the password system with their own, one that stores all passwords typed in a hidden file or memory locations, and then sends them back to a server on the Internet…

Passwords are at best a “weakest link” system, and with human failings in the mix those failings are so bad the link could be made of a single strand of cotton thread, you can easily break by hand…

Winter February 3, 2023 11:57 AM

@Clive

Now let’s assume for some reason humans don’t like passwords with Q and Z in them…

Utterly irrelevant. If you randomly select 6 items from a collection of 1024 species, you need 6*10 bits to describe them. No compression possible.

Kolmogorov complexity == 60 bits, period.

Winter February 3, 2023 1:15 PM

@Clive

So who is correct?

Not you.

The solution is simple: The Kolmogorov complexity is the minimum number of bits you need to get the string.

In general, you cannot calculate the Kolmogorov complexity [1]. However, in this case it is simple: You need to describe 6 “symbols” which each have a probability of 1/1024 of being placed in each position. That cannot be done in less than 60 bits.

There is absolutely no way you can have an algorithm or procedure that can describe this string of 6 items in less than 60 bits. That is by construction.

[1] A single string does not have an “entropy”, but the Kolmogorov complexity is its replacement.

Clive Robinson February 3, 2023 7:16 PM

@ Winter,

“Not you.”

Do you actually understand the problem?

I have to bluntly ask it because all your arguments so far fairly clearly say you don’t.

You are blabbing on about how wonderful the generator is because it has a theoretical password space of 60 bits.

I’ve given proof enough that talking about the wonders of the password generator is irrelevant to attacking a database of passwords.

You’re retreating behind maths few know anything about,

“The Kolmogorov complexity is the minimum number of bits you need to get the string.”

And again reveal all you are talking about is the irrelevant generator password space,

“You need to describe 6 “symbols” which each have a probability of 1/1024 of being placed in each position.”

That is again the equivalent of saying,

“My dictionary has a thousand entries”

I’ve already shown the dictionary size is irrelevant as long as it is sufficient to hold the six words.

What we are talking about is the resources the attacker needs as a minimum,

So a Dictionary, six words/symbols
That can be in, 6! combinations.

For an attack search space of 720.

MarkH February 3, 2023 7:18 PM

@Winter, Clive:

You both grasp the essential idea that “it’s the attacker entropy that is important.”

For crypto applications, the kind of entropy that usually matters is what’s called “guessing entropy.” This must be defined according to what an attacker is presumed to know. A password has 0 entropy to an attacker who’s seen it in plaintext.

The usual assumption for security purposes — that made by Winter — corresponds to the Kerckhoffs principle: the attacker knows your system (selection at random from a dictionary of 1024 symbols, and even the dictionary itself) but not the specific selections which composed a given password.

By that standard, Winter is correct that each symbol has exactly 10 bits of entropy.

MarkH February 3, 2023 7:39 PM

@gentlemen:

Perhaps a miscommunication has occurred?

If I correctly understood Winter’s proposition, 6 symbols are randomly selected from a set of 1024 symbols, for a total entropy of 60 bits.

If I understood Clive’s most recent comment, it seems to presuppose that those 6 symbols are the only ones the attacker needs to try.

Did I miss something important?

If the attacker doesn’t know the 1024-symbol dictionary, then the guessing cost is even higher. The minimum guessing entropy is that in Winter’s formulation.

MarkH February 3, 2023 7:40 PM

continued:

If the attacker assumes (for example) English language phrases with standard spelling and capitalization, the search space for a 30-character passphrase will be more than 2^170, but the entropy will be nearer 75 bits (an effective space of about 2^75).

The absolute optimal attack is based on possession of the dictionary. With the predicate that the symbols are chosen at random, there is no “faster” sequence in which to try them. For a composition of 6 symbols, guessing entropy is 60 bits, and the mean number of guesses to solve is 2^59.
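The 2^59 figure follows from a short calculation: if each of N candidates is equally likely and the attacker tries them in any fixed order, the secret is equally likely to sit at position 1, 2, … N, so the average position is (N + 1) / 2. A sketch, with illustrative function names and exact arithmetic to avoid float rounding at this scale:

```python
from fractions import Fraction

def mean_guesses(n):
    """Average number of guesses to hit a secret chosen uniformly from n candidates."""
    return Fraction(n + 1, 2)

def mean_guesses_by_enumeration(n):
    """Same quantity computed the long way: average of the positions 1..n."""
    return Fraction(sum(range(1, n + 1)), n)

# Six symbols from a 1024-symbol dictionary: 2^60 equally likely passphrases.
candidates = 1024 ** 6
average = mean_guesses(candidates)

# Sanity check of the closed form on a space small enough to enumerate.
small_check = mean_guesses_by_enumeration(1024) == mean_guesses(1024)
```

For 2^60 candidates the average is 2^59 plus one half. The result depends on the uniform-choice assumption; any bias in how passphrases are actually chosen lets an attacker order the guesses to do better than this.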

fib February 3, 2023 7:56 PM

@ Clive Robinson

Re: finding the $5 Wrench xkcd

With DDG there’s this thing called a ‘bang’, where you prefix a site shortcut with an exclamation mark to send the search to that particular site:

!xkcd $5 wrench attack

Handy in cases like this.

Jordan Brown February 3, 2023 9:12 PM

Are we talking about psychology, or about mathematics?

If we’re talking about psychology, then yes. If I really want my pass phrase to be “only want very tiny pass word”, and I will throw away randomly generated pass phrases until I get that, then yes, I will get a crummy pass phrase. I’ve already told you what pass phrase I want; my choice will have zero bits of entropy.

If we’re talking about psychology then yes, it’s very difficult to estimate the number of bits of entropy.

If we’re talking about mathematics, it’s a different discussion.

So, are we talking about psychology, or about mathematics?

fib February 3, 2023 9:33 PM

@ Jordan Brown, Clive Robinson

A polyglot Bob could add bits of entropy by swapping the words for foreign ones, after first arriving at whatever crazy phrase, like yours

bone dogs only show acts base

osso perros only montrer actum radis

The phrase would still be memorable to him, but an attacker would need six dictionaries now. I wonder if it satisfies Clive’s argument about the attacker’s entropy.

Jordan Brown February 3, 2023 10:20 PM

(Remember a ground rule: the attacker gets to know the procedure that the user uses to pick the pass phrase.)

If the user is going to pick one of six languages for the six-word pass phrase, that adds about 2.5 bits of entropy, for a total of ~62.5 bits.

If the user is going to pick one of six languages for each word independently, that’s equivalent to using a 6144-word dictionary, and adds about 2.5 bits per word, for a total of ~75 bits.
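Both figures come straight from the ground rule that the attacker knows the procedure. A quick check, assuming six equally likely languages and a 1024-word list in each (variable names are illustrative):

```python
import math

words_per_list = 1024
languages = 6
n_words = 6

base_bits = n_words * math.log2(words_per_list)

# One language chosen for the whole phrase: a single extra choice among six.
whole_phrase_bits = base_bits + math.log2(languages)

# A language chosen per word: each word is a pick from 6 * 1024 = 6144 entries.
combined_list = languages * words_per_list
per_word_bits = n_words * math.log2(combined_list)
```

The first comes to roughly 62.6 bits and the second to roughly 75.5, in line with the rounded ~62.5 and ~75 above. The per-word figure assumes the six lists do not share entries; any overlap lowers it slightly.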

ThreeRs February 3, 2023 11:28 PM

@ALL

Lots of interesting comments. It reminded me of Bruce’s recent post (last December 26) on the latest LastPass breach. Bruce pointed to his advice from 2014, which contained a link to a good Ars Technica article about password cracking:

https://arstechnica.com/information-technology/2013/05/how-crackers-make-minced-meat-out-of-your-passwords/

That was 2013 so I can imagine things are only faster/better, now. The article did a good job of explaining (for this layman) why a password like Polar_bear64 doesn’t stand up, regardless of how you add up the bit entropy. There are several good promoted comments at the end of the article, especially the one toward the bottom by “gregvp”. Again, this is from 2013.

The blog post, the Ars Technica article and the comments of wiser folks pushed me to finally overcome my inertia and get a password manager (not LastPass) and generate nice random passwords for the accounts I care about.

Regarding the “bone dogs only show acts base” discussion:

Does the idea of a “closed system” come into play? It seems to me that Jordan and Winter are basically working with a closed system, while Clive has opened it up, so to speak. In other words, yes, the 6 word phrase randomly selected is secure from reasonable-time cracking, so long as it is considered strictly by itself. Once it is “outside” the system it originated from, the entropy equation changes.

Thanks for helping with my understanding!

Winter February 4, 2023 5:13 AM

@Clive

Do you actually understand the problem?

Yes, but do you?

Calculating the odds of drawing 6 symbols with replacement out of an urn with 1024 symbols is a high school homework exercise. So I assume you do not understand the problem.

Clive Robinson February 4, 2023 8:43 AM

@ MarkH,

Re : The work the attacker does.

“Did I miss something important?”

Yes, you correctly identified,

“This must be defined according to what an attacker is presumed to know.”

You then failed to follow through far enough and stopped with the “brute force search”. Which is the same as the “look how wonderful my generator is” argument.

As I’ve said all along, that is a high-water mark and is irrelevant; it is in that respect the same as the cryptographic “Brute Force Search”, and it’s not how systems have been attacked since before “The Great War”.

It’s important to realise that the Brute Force or generator bit size is the “password space size” and for the given generator,

1) It is a quantitative not qualitative measure.
2) It is allegedly random thus uniform and also complete for the symbol space.

Then it must by definition have in its output passwords that sit uniformly along the scale from very weak to very strong.

The “password strength” is the qualitative measure and it is this the attacker is attacking, which is why, as I’ve continuously said, the nonsense about the generator dictionary size is irrelevant, as it does not significantly affect the “password strength” of “actual passwords used by humans” even though it does significantly affect the theoretical Brute Force search space size.

Also you’ve not followed through that the attacker is,

3) Not attacking one password but many in parallel or efficiently.
4) Will search to find “weak passwords” first.

So not passwords that are truly random, but a whole load of passwords that human traits have tainted one way or another to make them nearly all from the “weak password” or “very weak password” groups.

Why? Because we’ve known since the 1960s or earlier that when humans select passwords by far the majority are not even close to strong passwords; some are the weakest of the weak. They are really selected on “ease of remembering”, sometimes, when arms are twisted via automated rules[1], with maybe one or two easy to remember things tacked on.

It’s why I pointed out above about Polar_bear64, being so weak.

1) It’s not really two words, but one.
2) Capitalised as normally expected (at the front of a sentence or proper noun).
3) The word separator was one of the two most likely (hyphen or underscore).
4) The 64 was one of ten easy to remember double digit numbers (as it’s –8×8– a squared number).
5) The number was in the most obvious of two places.

Each and every one of those “helps me to remember” rules, and many more, guts the size of the “Generator password space” (thus the password entropy) enormously. Worse, they only select weak or very weak passwords. So they effectively guarantee that given say 100 passwords by far the majority will be entirely predictable.

These are things “Attackers Know” about the “human system” and it is these they are attacking not the generator Brut Force space size.

It’s why the password cracking systems work as easily and quickly as they do because the two teams in this game are,

“Poor human memory -v- Automated smart thinking”

Worse the attackers can use,

1) Efficiency measures.
2) Iterative learning measures.

The most obvious efficiencies being “Rainbow tables” or the equivalent optimized for “Standard OS Systems” and multiple CPU Core threads each attacking individual passwords in parallel.

Security wise the “iterative learning” is the biggest security risk and will become more so when Password Cracking starts using “big AI”.

Every password cracked makes cracking other passwords easier as you can start to see trends. But also once in a system an attacker can grab the “plaintext” passwords so grab more information. In a way this is making systems into “known password” crackers. But each password that becomes known adds to the knowledge of “human traits” so makes the very weak and weak password groups easier to attack as people’s “help me to remember” rules get found. It is this latter issue of finding the human trait rules that “Big AI” will excel at and thus all passwords will slide down the scale into the very weak and weak groups from the attacker’s perspective.

There’s nothing new in what I’ve said, and I thought it was not just obvious but well known. Turns out I was only part right…

[1] System administrators are human as well, and do not like making rods for their own backs, thus they gamble with system security. One aspect of this is that the enforced “password rules” are designed,

1, To be easy for the computer to police.
2, Just enough to move the passwords from the bottom end of the very weak group.

This way they reduce the “annoyance factor” of “user bleating” in their work load. Yes even SysAdmins get to feel the pain of that bleating through management, who feel it from frontline support and its “sunk costs”.

Clive Robinson February 4, 2023 10:01 AM

@ fib,

Thanks for the DDG tip. At some point I’m going to have to find them all and work out the how and why of them (most modern search engines don’t fit in with traditional text search queries).

However the point I made about all the crypto-bros and their “$5 Wrench” stories, is something I’m casually looking further into.

It is –if true– the old game of “Street mugging” effectively moved from the all physical space of “money or your life” highway robbery, through the “yer money or I sticks yer” street crime of the 1990s, and into the later information-stealing space of taking Credit/Debit cards and demanding the PIN, so the muggers could “cash out” quickly at ATMs and then hit the shops the following morning knowing that the victim would not have had time to cancel the cards, nor the card companies sufficient time to warn off-line merchants.

Now the only physical part is the actual “stick em up” to get the user’s electronic wallet and passphrase.

Kind of “evolution in action”.

MarkH February 4, 2023 11:59 AM

@Clive:

I was not addressing how people usually devise passwords, but a hypothetical strong method with 60 bits of entropy, having the advantage that some might be able to remember a 6-word nonsense phrase.

Such a generator might make a tiny proportion of weaker passwords like “my old dog has no fleas,” which could be mitigated.

“Rainbow tables” are not magic, but one of many time/memory trade-off tools.

For the hypothetical 6 x 1024 generator, a table of hashes would occupy dozens of exabytes — around what Google is estimated to use for its entire suite of data-driven services … if hashing is done with a fixed salt.

With more sophisticated salting, the “rainbow tables” for a specific 1024-word dictionary could exceed the storage capacity of all mass storage drives now in existence.
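A back-of-envelope version of the storage estimate, treating the table as a plain lookup of one stored digest per candidate passphrase. Real rainbow tables store chain endpoints rather than every hash, so this is an upper-bound style sketch. The 16- and 32-byte digest sizes and the 16-bit salt are assumptions chosen for illustration.

```python
def table_exabytes(entries, bytes_per_entry):
    """Storage for a plain lookup table, in decimal exabytes (10**18 bytes)."""
    return entries * bytes_per_entry / 1e18

# Six words from a 1024-word dictionary: 2^60 candidate passphrases.
entries = 1024 ** 6

# Assumed digest sizes: 16 bytes (e.g. an MD5-sized hash) and 32 bytes (e.g. SHA-256-sized).
small_digest_eb = table_exabytes(entries, 16)
large_digest_eb = table_exabytes(entries, 32)

# With a per-user salt the table must be rebuilt for every distinct salt value.
# A hypothetical 16-bit salt already multiplies the storage by 65536.
salted_eb = large_digest_eb * 2 ** 16
```

With these assumptions the unsalted table lands at roughly 18 to 37 exabytes, in line with the "dozens of exabytes" figure, before counting the passphrases themselves or any index.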

Jordan Brown February 4, 2023 2:03 PM

Clive…

So it sounds like we’re talking about psychology. I agree. Humans are simply awful at creating passwords. If you want to get the value from a generator, you have to just take what it gives you and go with it. If anything, you’d want to filter out those pass phrases that happen to form reasonable sentences, because as you say those are the ones that attackers will try first.

Though I have partially written a generator that generates more-or-less-correct-grammar sentences. It takes more words, and longer words, to get a given level of entropy, because there just aren’t enough adjectives, verbs, et cetera. That generator might produce, for instance, “cursed excess safely rebuke loose virtue”, using a pattern of adjective, noun, adverb, verb, adjective, noun. For my current word set, those words contribute 10, 10, 7, 10, 10, and 10 bits, for a total of 57. (The number 10 shows up a lot because these are frequency-ordered lists and I use only the 1024 most common words in each category.) Here’s another, using the pattern {name, verb, adjective, noun, conjunction, adjective, noun}: “Matteo urging small leg till uneasy dream”. That’s 10, 10, 10, 10, 4, 10, 10, for a total of 64 bits. But, again, you have to use what the generator gives you – you can’t say “but that sentence doesn’t mean anything” and pick another.
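The per-pattern totals can be reproduced by summing log2 of each category's list size. The list sizes below are inferred from the bit counts quoted above (1024 for 10 bits, 128 for 7 bits, 16 for 4 bits) and are assumptions about the word set rather than figures taken from it; `pattern_bits` is an illustrative name.

```python
import math

def pattern_bits(list_sizes):
    """Total entropy in bits when each slot is an independent uniform pick from its own list."""
    return sum(math.log2(size) for size in list_sizes)

# adjective, noun, adverb, verb, adjective, noun
first_pattern = [1024, 1024, 128, 1024, 1024, 1024]

# name, verb, adjective, noun, conjunction, adjective, noun
second_pattern = [1024, 1024, 1024, 1024, 16, 1024, 1024]

first_bits = pattern_bits(first_pattern)
second_bits = pattern_bits(second_pattern)
```

Small categories such as conjunctions contribute little, which is why a grammatical pattern needs more words than a flat list to reach the same total.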

ResearcherZero February 5, 2023 3:49 AM

@Winter

In Cinderella City, everything, including fallout, takes a little longer. Though they are beginning to frown on everyone turning up for work drunk or high, beating people to death at work, and are even cracking down on false invoices. Finally things are beginning to change.

https://audit.wa.gov.au/wp-content/uploads/2022/03/Report-13_-Information-Systems-Audit-Report-2022-State-Government-Entities.pdf

If anyone fancies a challenge:

WA Communities CIO

“The [CIO] is responsible for ensuring all information systems, communications, technology, knowledge management and services align with the departmental outcomes and whole-of-government reform agenda.”
https://www.linkedin.com/jobs/view/3430570070/

https://audit.wa.gov.au/wp-content/uploads/2022/11/Report-8_Forensic-Audit-Results-2022.pdf

“By developing a proposal for an overarching emergency response, the state government is ensuring that cyber security is a collective responsibility and we are prepared to mitigate the impact of any large-scale attacks.”
https://www.mediastatements.wa.gov.au/Pages/McGowan/2023/01/Defending-Western-Australia-against-cyber-crime.aspx

Clive Robinson February 5, 2023 7:07 AM

@ Jordan Brown,

Re : Humans as the filter.

“If you want to get the value from a generator, you have to just take what it gives you and go with it.”

Yes, but next to nobody does, they either keep clicking or rearrange the words or just change one or two etc. Because in their view

“What harm? it’s still six words.”

The other problem is, as I said, the dictionary and its subsets. The Oxford English Dictionary has so many words –I remember being told 17,000– most of which you won’t know or be able to spell.

Your 1024-word dictionary can thus only generate a subset of those, but all the 2^60 passwords your generator can create will be in the output of the OED generator. Likewise, a 128-word dictionary generator will generate a subset of your 1024-word dictionary’s output if all its words are a subset of your dictionary.

Thus the point about the six word dictionary, will generate 720 passwords all of them a subset of your dictionary, of which one would match your password.
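One possible reading of the 720 figure, offered here as an assumption rather than something the comment states: six distinct words, each used exactly once, can be ordered in 6! = 720 ways. A quick check, with the six words borrowed from an example phrase later in the thread:

```python
import itertools
import math

# Assumed "six word dictionary": exactly the six words of one pass phrase.
six_words = ["wary", "tear", "vase", "make", "from", "fell"]

# Every ordering that uses each word exactly once.
orderings = {" ".join(p) for p in itertools.permutations(six_words)}

count = len(orderings)
contains_target = "wary tear vase make from fell" in orderings
```

Under that reading, the attacker would already have to know the six words in question for one of the 720 orderings to match.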

Hence the size of the dictionary is another quantity-v-quality issue. If my smaller dictionary has no words in common with your larger dictionary then it cannot generate any of your passwords…

Attackers are going after low-hanging fruit on a big tree of fruit. So pick your dictionary words with care and the attackers won’t gain any advantage over a full brute force search, character by character, of your generated password.

Which brings us onto,

“If anything, you’d want to filter out those pass phrases that happen to form reasonable sentences,”

Begs the question “How?” and the answer is far from obvious or known.

To a computer,

“The cat sat on the mat”

is just six words… The fact that just about everybody who speaks English has not just heard it but knows it by heart means it has a very high probability of being in most attackers’ consideration. Likewise,

“The owl and the pussy cat”

Though both are at a lower search priority than you might expect; ‘feel the force Luke’ and similar in Klingon would be higher, based on the vast number of known password databases.

And that’s the attacker’s mindset. Each time you get a new known password, you run it against the unbroken passwords you have and see how many matches you get. This in effect becomes its “fitness number”, which is one way it gets judged. But if it’s from a “new work”, say a new blockbuster sci-fi movie, then speculatively you might try all the catchphrases from the movie (the film The Martian was kind of a hit in this respect, with “science the s..t out of it” being quite popular for a while for obvious reasons, and so it made it into the passphrases of even non-geeks/nerds).

But the big problem is,

1, Generator outputs a pass phrase.
2, A user then “filters” to, more often than not, a weak passphrase.
3, The user makes it live on a service.
4, The Passphrase file/database of thousands or more users of a service becomes known to outside attackers.
5, The attackers find the first weak or very weak password and “They are in like Flynn”.

It’s game over at that point.

The problem is stage 2 and it should be fixed by the service at stage 3, but this does not happen. The reasons boil down to cost. Higher security means more upset users sounding off to first line support, which costs the service provider money they see as “sunk cost” so they want to eliminate it.

Thus the rules eliminate what would be very weak passwords. Unfortunately the law of unintended consequences applies and the same rules eliminate plenty of strong and very strong passwords as well…

So you end up with a bunch of weak passwords that get through the rules, many of which really are of no real worth, but look good on paper and so meet auditors’ checkbox lists…

@ All,

Anyway I’m currently in hospital, in what they call Resuscitation, having experienced a significant cardiac event –yet again– and I am expected to be in for two or more days whilst they adjust the little tomb stones I swallow twice a day. The problem… some hospitals get touchy about plugging in chargers, and this phone is so old its battery needs more life support than I do. So I might take a while to get back to people.

Jordan Brown February 5, 2023 12:39 PM

Having acknowledged that if humans don’t follow the rules, bad things happen, and so now we can mostly just talk about mathematics…

Yes, the possible outputs from my 1024-word dictionary are a subset of the outputs from an OED-based 600,000-word dictionary. And yes, the possible outputs from a 128-word dictionary are a subset of the possible outputs from my 1024-word dictionary. (Remember the base rule: the bad guys get to know the procedure, including the dictionary used.)

This means two things:

(1) An attacker who uses the OED can crack all of my passwords. Well, yeah. Of course, they’re searching a 115-bit space, so if they have a billion (1e9) computers each checking a billion passwords per second, it’ll take them about a billion and a half years.

(2) An attacker who uses a smaller dictionary can crack some of my passwords. Sure. They could get lucky. But then again, they could pick one password, and they might get lucky. That’s true no matter how large your password (or key) space is – the bad guy might choose the one-in-2^60 (or one-in-2^256 or one-in-2^4096) value that happens to be the right one. It’s true whether you’re talking about pass phrase generators, or password generators, or AES key generators. But an attacker using a 128-word dictionary will only search about 0.0004% of my possible passwords, so has a 99.9996% chance of not finding my password.
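The arithmetic behind both points can be checked with a short sketch. The function names are made up for illustration, and the second calculation assumes the best case for the attacker, namely that all 128 of their words are in the 1024-word list.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def phrase_bits(dictionary_size, words=6):
    """Bits of entropy in a phrase of uniformly chosen words."""
    return words * math.log2(dictionary_size)


def years_to_exhaust(bits, guesses_per_second):
    """Years needed to try every value in a space of 2**bits."""
    return 2 ** bits / guesses_per_second / SECONDS_PER_YEAR


def fraction_covered(small_dict, large_dict, words=6):
    """Share of the large dictionary's phrases a sub-dictionary can produce."""
    return (small_dict / large_dict) ** words


# (1) Six words from a 600,000-word dictionary, a billion machines
# each trying a billion guesses per second.
oed_bits = phrase_bits(600_000)
oed_years = years_to_exhaust(oed_bits, 1e9 * 1e9)

# (2) A 128-word dictionary against the 1024-word generator.
covered = fraction_covered(128, 1024)

print(f"{oed_bits:.1f} bits, {oed_years:.2e} years, {covered:.4%} covered")
```

Running it gives roughly 115 bits, about 1.5 billion years, and about 0.0004% coverage, in line with the figures above.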

Thus the point about the six word dictionary, will generate 720 passwords
all of them a subset of your dictionary, of which one would match your
password.

Huh? I don’t know where 720 came from, but while the output from a small dictionary could match my password, that’s a very different thing from saying that it would match my password.

If my smaller dictionary has no words in common with your larger
dictionary then it cannot generate any of your passwords…

One of the base rules is that you get to know my dictionary, just as for a password you get to know ASCII (or Unicode).

So pick your dictionary words with care and the attackers won’t gain any
advantage over a full brute force search, character by character, of your
generated password.

The attacker has an advantage over a full character-by-character brute force search. Absolutely! My passwords only have 2.5 bits per character of randomness, versus ~6.5 for ASCII and even ~4.7 for lowercase-only. That’s why I need 24 letters to get as much randomness as a 9-character random ASCII password. But I think that “wary tear vase make from fell” is easier to remember and type than “-5$AZr;4a”. For either one of them, the attacker will need to try about 1e18, a billion billion, samples to be sure of hitting the right one.
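A rough check of those per-character figures, as a sketch that assumes 95 printable ASCII characters, 26 lowercase letters, and six words drawn from a 1024-word list:

```python
import math

phrase = "wary tear vase make from fell"
letters = sum(len(word) for word in phrase.split())  # spaces not counted

phrase_total_bits = 6 * math.log2(1024)         # six words, 10 bits each
bits_per_letter = phrase_total_bits / letters   # randomness per typed letter

ascii_bits_per_char = math.log2(95)             # printable ASCII
lower_bits_per_char = math.log2(26)             # lowercase only
ascii_9_char_bits = 9 * ascii_bits_per_char     # a 9-character random password

print(letters, bits_per_letter)
print(round(ascii_bits_per_char, 2), round(lower_bits_per_char, 2))
print(round(ascii_9_char_bits, 1), f"{2 ** phrase_total_bits:.2e}")
```

The 24-letter phrase and the 9-character password both land near 60 bits, which is about 1e18 candidates.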

How would you filter out phrases that form sentences? We were talking about humans filtering, and if you’re going to let the human filter then the human can filter out things that look like sentences. I wouldn’t care to estimate what fraction of the 1e18 pass phrases that would exclude, but it’s a very small fraction. But let’s guess ridiculously high and say that it’s half, that half of all generated phrases look like sentences and so you reject them. That’s still 59 bits of entropy.
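The cost of that kind of rejection can be put in a one-line formula: discarding a fraction f of equally likely outputs removes -log2(1 - f) bits. A small sketch (the function name is made up, and it only models rejecting a fixed set of phrases, not a human hunting for a "nicer" one):

```python
import math


def bits_after_rejection(bits, rejected_fraction):
    """Entropy left after discarding a fraction of equally likely outputs."""
    if not 0 <= rejected_fraction < 1:
        raise ValueError("rejected_fraction must be in [0, 1)")
    return bits + math.log2(1.0 - rejected_fraction)


half_rejected = bits_after_rejection(60, 0.5)       # the deliberately high guess
one_percent_rejected = bits_after_rejection(60, 0.01)
```

Rejecting half costs exactly one bit; rejecting 1% costs about 0.015 bits.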

2, A user then “filters” to, more often than not, a weak passphrase.

Yes, agreed. If the human filters for something that looks “better”, that looks more like a word or more like a sentence, they make it significantly worse. So don’t do that. Don’t be low-hanging fruit.

4, The Passphrase file/database of thousands or more users of
a service becomes known to outside attackers.
5, The attackers find the first weak or very weak password and
“They are in like Flynn”.
It’s game over at that point.

It’s game over for them. Not for me. I don’t care whether the bad guys figure out your password. (As long as you’re not an administrator.)

So you end up with a bunch of weak passwords that get through the rules,
many of which really are of no real worth, but look good on paper and so meet
auditors’ checkbox lists…

Yep. Most password rules are useless, or worse.

Anyway I’m currently in hospital,

Best wishes.

Sumadelet February 6, 2023 6:09 AM

@Clive,

keep beating the odds, Clive.

With all this talk of dictionaries, and password generators, I’m surprised no-one has mentioned Diceware. Any comments?

Jordan Brown February 6, 2023 10:27 AM

Diceware is the same concept, though since they use a 7776-word dictionary they get ~13 bits per word, versus the 10 bits I get.

I didn’t like that their list includes constructs like “a&p” and “a’s”, and non-words like “aaa” and “bf”. Having close variations like “abbot” and “abbott” and “bib” and “bibb” seems like an opportunity for error.

And of course their scheme of using dice is quite clever and saves you from any possible issues with your random-number generator.
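For illustration, here is a sketch of how five dice rolls select one of 6^5 = 7776 list entries. The `roll_dice` helper is only a software stand-in for physical dice, and the function names are invented; the point of Diceware is that you can do this step by hand.

```python
import math
import secrets


def roll_dice(count=5):
    """Stand-in for physical dice: each roll is 1..6."""
    return [secrets.randbelow(6) + 1 for _ in range(count)]


def rolls_to_index(rolls):
    """Treat the rolls as base-6 digits and return a zero-based list index."""
    index = 0
    for roll in rolls:
        if not 1 <= roll <= 6:
            raise ValueError("each roll must be between 1 and 6")
        index = index * 6 + (roll - 1)
    return index


list_size = 6 ** 5                    # 7776 entries
bits_per_word = math.log2(list_size)  # just under 13 bits

first = rolls_to_index([1, 1, 1, 1, 1])
last = rolls_to_index([6, 6, 6, 6, 6])
random_index = rolls_to_index(roll_dice())
```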

Clive Robinson February 7, 2023 1:42 AM

@ ALL,

It’s a little before 05:30 in the UK and “the nurses” are a little busy elsewhere, so I get the chance to escape the confines of the bed, albeit only virtually. So a quick nip round the back of the school bike sheds, as it were, for some illicit behaviour 😉

Thanks to all for the well wishes.

@ Jordan Brown,

“Having acknowledged that if humans don’t follow the rules, bad things happen, and so now we can mostly just talk about mathematics…”

If only… Think more about a path that splits into two to get around an obstacle, to join up again.

As you point out,

“Of course, they’re searching a 115-bit space”

Yes, “they are both”; that is, the User and the Attacker are both looking for a diamond in a huge dung heap of bits.

The User wants to find “easy to remember” and the Attacker “easy to discover”.

Similar but not the same, because the space they are playing in is different.

The user is presented with a vast uniform probability space of randomness. That has the downside of,

“The bigger it gets, proportionately the fewer diamonds it has.”

That is, we know that there are only a small and very finite, effectively fixed number of rememberable pass phrases. Making the “random space” bigger where there is,

“No rememberable meaning”

in no way makes things more secure, it just makes the user more frustrated. Effectively, to the User you are just adding what appears to be an infinite amount of dung to the heap, without increasing the number of diamonds.

You are, to borrow a phrase from the crypto-coin-bros,

“Making the work factor harder”

To a power of two with every bit your generator increases the “random space” it generates.

In short the generator designers are “doing the wrong thing” and actually making things less secure in the process.

To be more secure, the generator needs to generate “signal not noise”: the signal being the diamonds of “rememberable phrases”, the noise the vast dung of “random space”.

So… The generator needs to do what the human does, which is,

“Turn muck to brass”

And,

“compress the dung to diamonds”.

That is the generator currently gives

“way broom door glue train knit”

Or a lot worse, and the user wants,

“the cat sat on the mat”

In reality no matter how often the user clicks the “make a phrase” button, the chance of them getting one that is memorable for them in their life time is as close to zero as makes no difference.

That is the real problem with by far the majority of password generators,

“They make 5h1t not diamonds”

And leave the hard work to the user who is effectively incapable of doing it without a “force multiplier”.

So the generator designers need to stop doing a really “half a55 job” and “earn their corn” by making what the User wants them to, which is to give a diamond at every press of the button.

The reason they have not is that the designers see it as a “reversing a one way function” task.

Imagine it this way. You are given a very, very long, effectively infinite –in human terms– list of what appear to be completely random bit strings. All you know is that some fixed number of them, when put through a crypto-secure hash, will give a comprehensible and memorable sentence. Your job is to select from that list ONLY the bit strings that will make those infinitely rare “comprehensible and memorable” sentences…

This is actually a way, way harder job than what the attacker has to do, because his “list” is all bit strings that will produce “comprehensible and memorable” sentences; he just has to reverse that hash, for which the attacker has way, way better than brute force algorithms.

Winter February 7, 2023 4:03 AM

@Clive

Thanks to all for the well wishes.

Get well soon, and strength in the mean time.

(Did not get your situation earlier)

JonKnowsNothing February 7, 2023 8:19 AM

@Clive, All

Hope things are going well or better. Glad you were able to get a re-charge!

On one trip of mine, the staff put my phone in one locker and the charger in a different one :-O

re: In reality no matter how often the user clicks the “make a phrase” button, the chance of them getting one that is memorable for them in their life time is as close to zero as makes no difference.

You can see this in action in many games. Most games are based in some sort of Fantasy World. Within that context there may be languages or speech types that are part of the canon. (Think: Tolkien with Elves, Dwarves, Men, Monsters and their related languages and heritages.)

When you roll a character, you select a name for it. There are often suggestion guides as to what type of spelling is suitable. There is also a random name generator that, in theory, follows the guidelines for suitable canon-correct naming.

You can click a long time, trying to get the generator to pop up a name that is memorable and easy to spell.

Rename Tokens are good revenue streams for game companies.

Clive Robinson February 7, 2023 9:12 AM

@ JonKnowsNothing, ALL,

I’ve managed to do a virtual “sneak out to the back of the bike shed”, as even hospital staff have to have a lunch time on a 12-hour shift 😉

“You can click a long time, trying to get the generator to pop up a name that is memorable and easy to spell.

Rename Tokens are good revenue streams for game companies.”

Now there is an idea for an enterprising individual.

Essentially, in theory it’s easy: you are just remapping totally random output from the generator into a “just as random” sensible “passphrase”, even though the passphrase data set will be appreciably smaller.

The problem is finding an algorithmic way of constructing the passphrases.

That’s almost a candidate for a “rule based” system that knows language and grammar.
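The remapping half of that idea can be sketched as a mixed-radix conversion: every integer in range maps to exactly one phrase, so uniformly random input gives uniformly random phrases. The template and word choices below are invented toy data, and choosing slots that yield memorable sentences at a useful size remains the unsolved part.

```python
import secrets

# Hypothetical template: one list of interchangeable words per position.
SLOTS = [
    ["the", "a"],
    ["cat", "owl", "dog"],
    ["sat", "slept"],
    ["on", "near"],
    ["the", "a"],
    ["mat", "rug", "bed"],
]


def phrase_space(slots):
    """Number of distinct phrases the template can produce."""
    total = 1
    for choices in slots:
        total *= len(choices)
    return total


def number_to_phrase(n, slots):
    """Map an integer in [0, phrase_space) to exactly one phrase."""
    if not 0 <= n < phrase_space(slots):
        raise ValueError("n out of range for this template")
    words = []
    for choices in reversed(slots):
        n, i = divmod(n, len(choices))  # peel off one mixed-radix digit
        words.append(choices[i])
    return " ".join(reversed(words))


size = phrase_space(SLOTS)
random_phrase = number_to_phrase(secrets.randbelow(size), SLOTS)
```

With this toy template, `number_to_phrase(0, SLOTS)` gives "the cat sat on the mat", but the whole space is only 144 phrases, a little over 7 bits.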

JonKnowsNothing February 9, 2023 8:49 PM

@Clive, All

re: Dice rolls by Jumping Beans

A fun MSM article about Mexican Jumping Beans, which have a small moth larva inside. When the bean falls onto the ground, the larva makes the bean jump as it tries to move the bean into a shady spot where the larva can mature.

According to the report, the larvae use a Random Walk method to bump their way towards potential shade. Each hop is independent of the previous hop.

“These results suggest that diffusive motion [random walks] in Mexican jumping beans does not optimize for finding shade quickly. Rather, Mexican jumping beans use a strategy that minimizes the chances of never finding shade when shade is sparse.”

I think this has great potential for a RNG application! Although it is subject to bias by temperature increasing or decreasing the number of hops. It may be more fun to have a pocket of jumping beans than a sack of multi-sided gaming dice.

===

ht tps://arstechnica.c om/science/2023/02/taking-a-walk-on-the-random-side-helps-mexican-jumping-beans-find-shade/

h ttps://arstechnica. com/science/2023/02/taking-a-walk-on-the-random-side-helps-mexican-jumping-beans-find-shade/2/

ht tps://cdn.arstechnica.n e t/wp-content/uploads/2023/02/jump2.jpg

  • graphic of the bean’s trajectory over time.

ht tps://en.wikipedia.or g/wiki/Random_walk

  • In mathematics, a random walk is a random process that describes a path that consists of a succession of random steps on some mathematical space.

ht tps://en.wikipedia.o rg/wiki/Isohedral_figure

  • multi-sided dice

(urls fractured)

Clive Robinson February 10, 2023 2:23 AM

@ JonKnowsNothing, ALL,

Re : Jumping Beans

It’s actually not a “random walk”, it just looks like it (much as particles in a working fluid do under Brownian Motion).

The larva just knows it’s too hot, so it flicks its tail. The bean is far from a traditional bean shape, more like a segment of orange, thus the way it moves in response to the larva is a bit curious.

If you think about it, not moving the bean’s position but simply rotating it can increase or decrease the amount of “solar irradiance” it gets, and thus the temperature of the bean that the larva is responding to.

It’s a bit like “flipping coins is random” till someone comes up with a machine that shows it’s actually not random, just the flipper of the coin lacking control (something magicians knew before physicists 😉).

The sad thing is that Jumping Beans were once a popular toy and I had some when young. They work by slowly torturing the larva to death… Not such a good toy when viewed in that light.


Sidebar photo of Bruce Schneier by Joe MacInnis.