Does A Breached Password Lookup Reveal My Password?

DAK · July 10, 2025

I had a discussion yesterday with an acquaintance about some new infostealer leaks. I was talking about verifying whether the credentials were new or not (which was a silly thing to do; I should have known they weren’t in HIBP, for different reasons though), and I went to check whether some of the passwords were contained in the HIBP corpus. The acquaintance asked something to the effect of, “why would you put the password into a web form, isn’t that leaking it further?” This reveals a common misconception about how breached password lookups typically work, both in HIBP itself and in competing commercial breached corpuses.

What’s a breached password corpus?

A “breached password corpus”, whether Troy Hunt’s Have I Been Pwned or a commercial alternative such as Specops Software’s Breached Password Protection, is a large database of passwords (and other data such as emails, in the case of HIBP) that have been exposed in various data breaches and credential leaks. These databases are used to check whether a given password has been leaked in the past and might therefore be at risk of compromise, either through re-use or because a threat actor feeds the leak into a brute-force tool like Medusa as a list of candidate passwords. For an organization, this means that if credentials used in their environment get leaked, whether through re-use or because the corporate account itself was exposed by an infostealer or ransomware attack, any account using that password can be immediately flagged for a password reset, invalidating the credentials and preventing the account from being accessed.

In the case of a consumer using a breached corpus such as Have I Been Pwned (HIBP), this serves a similar function, but because the average user doesn’t have an Active Directory or LDAP configured, the reset becomes manual rather than an automatic process provided by a password filter on the domain controller (DC). A user would monitor their accounts or passwords and manually rotate the credentials if they see they’ve been involved in a data breach.

Doesn’t the lookup reveal my password?

This is where the confusion occurred in my discussion with a buddy about new infostealer data and verifying whether HIBP contained the credentials. Typically a breached corpus will perform the lookup using something called k-anonymity. You can review the Wikipedia article for the computer science definition, but I’ll explain it in plain English here, if you don’t have a BSc in CS XD.

So the way it works in the case of something like HIBP is that the database of passwords isn’t in plaintext; it’s stored as hashes (the same kind of one-way hash used when storing passwords for authentication, although a fast hash like SHA1 on its own is not a good choice for that purpose). We’ll use SHA1 as an example.

When you look up a specific record, the candidate password is hashed on your side, on the client. For example, SHA1(password) = 5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8. The first handful of characters of this hash is used to perform the lookup, rather than the password itself. We’ll use 4 characters as the example here, but the number varies by implementation (HIBP’s Pwned Passwords API uses the first 5, for instance), since it controls how many matches the corpus returns for a given lookup. The client then makes an API request for all candidates whose hash starts with 5baa, and the service returns that list of possible matches.
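As a rough illustration of the client-side step described above, here is a minimal Python sketch that hashes a candidate locally and splits the digest into the part that would be sent and the part that stays put. The function name split_hash and the default of 4 characters are assumptions made for this example, matching the article’s walkthrough rather than any particular service.

```python
import hashlib


def split_hash(password: str, prefix_len: int = 4) -> tuple[str, str]:
    """Hash the password locally and split the hex digest into (prefix, remainder)."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest()
    return digest[:prefix_len], digest[prefix_len:]


prefix, remainder = split_hash("password")
print(prefix)     # 5baa -> the only part that would be sent to the service
print(remainder)  # the rest of the hash, which never leaves the client
```

Changing prefix_len is all it takes to match a service that expects a different prefix length.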

The client-side implementation then searches this list for the matching hash. If the full SHA1 hash (5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8) is present in the list, we know the password has been leaked in some data breach and should be rotated. If the full hash is not present, the password isn’t in the corpus (which, as covered further down, is not proof that it has never leaked), and we simply throw the data away and it’s all good.
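The whole round trip can be sketched in a few lines of Python. This is only an illustration under stated assumptions: is_breached, fake_fetch, and FAKE_CORPUS are hypothetical names, the “service” is an in-memory list rather than a real API, and the candidates are returned as full hashes to mirror the description above. A real service will have its own response format (HIBP, for example, returns hash suffixes with counts), so treat the parsing here as a placeholder.

```python
import hashlib
from typing import Callable, Iterable


def is_breached(password: str,
                fetch_candidates: Callable[[str], Iterable[str]],
                prefix_len: int = 4) -> bool:
    """Return True if the password's full SHA1 hash is among the candidates."""
    # Hash locally; the plaintext password is never handed to fetch_candidates.
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest()
    # Only the short prefix is sent to the (hypothetical) service.
    candidates = fetch_candidates(digest[:prefix_len])
    # The comparison against the full hash happens on the client.
    return any(c.strip().lower() == digest for c in candidates)


# Stand-in corpus for demonstration: one real hash plus made-up filler entries.
FAKE_CORPUS = [
    "5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8",  # SHA1("password")
    "5baa" + "0" * 36,                            # filler sharing the prefix
    "ffff" + "0" * 36,                            # filler in another bucket
]


def fake_fetch(prefix: str) -> list[str]:
    """Pretend API call: return every stored hash starting with the prefix."""
    return [h for h in FAKE_CORPUS if h.startswith(prefix)]


print(is_breached("password", fake_fetch))                                 # True
print(is_breached("a long passphrase nobody has leaked yet", fake_fetch))  # False
```

Passing the fetch function in as a parameter keeps the sketch testable offline and makes it obvious that the only value crossing the boundary is the prefix.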

The only information involved in the request is the first handful of characters of the password hash; the actual password never leaves the client or user environment.

So long as you use a service you trust, there is little to no risk in using it; it only provides positives in protecting your credentials from being used by a bad actor. There’s a reason NIST 800-63b requires checking new passwords against values known to be compromised, including passwords from previous breach corpuses; they are the experts after all.

What about the list? Doesn’t that reveal my password?

By changing the number of characters used for the lookup, the implementation can change how many records are contained in the list of candidate hashes. In the unlikely event that an attacker managed to Man In The Middle (MITM) the request and capture that list, they would need to crack every record in the API response to recover the potential candidates. Recall that, given good security hygiene, this list is unlikely to contain the actual password at all, so the attacker would waste a great deal of time, effort, and computational power in the hope that one of the records is correct. There are easier ways to get access to the password for an account nowadays, especially with the prevalence of infostealers.
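To put rough numbers on how the prefix length changes the size of that list: each extra hex character splits the corpus into 16 times as many buckets, so the average list shrinks by a factor of 16. The sketch below assumes hashes are spread evenly across buckets and uses a purely hypothetical corpus size of one billion hashes; the function name expected_candidates is likewise made up for the example.

```python
def expected_candidates(corpus_size: int, prefix_len: int) -> float:
    """Average number of hashes sharing one prefix, assuming an even spread."""
    buckets = 16 ** prefix_len  # each hex character has 16 possible values
    return corpus_size / buckets


CORPUS_SIZE = 1_000_000_000  # hypothetical figure, not any real service's count

for n in (4, 5, 6):
    print(f"{n} hex chars -> ~{expected_candidates(CORPUS_SIZE, n):,.0f} candidates per lookup")
```

A shorter prefix hides the real hash among more candidates but makes every response bigger; a longer prefix does the opposite.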

Should I use a breached corpus?

Yeah, 100%, whether it’s a paid service or simply HIBP. It is worth stressing that the presence of a password in a breached corpus is a guarantee that it has been breached in some leak, and the credentials need to be rotated. However, the opposite is not true; it is not guaranteed that because a password isn’t in HIBP (or another database) the password has not been somehow leaked. There is lag-time in gaining access to the data, processing it, and including it in a dataset for use. A breached corpus does not replace the need for good security hygiene, implementing MFA, and following NIST 800-63b when generating passwords; there is no silver bullet.

For example, from yesterday’s RedLine infostealer upload:

Password
354727Ludmila
francerito2022
Williamstev1
Adilst@230
Hydreigon06.

A sample of 5 randomly pulled passwords that show as not leaked in HIBP, as is often the case, for example with the alleged 16b record infostealer data that I previously wrote about. Not finding a credential in a breached corpus doesn’t mean it’s secure; it just means that if it has been leaked, it hasn’t been included yet.

The post in question:

RedCloud ULP

Conclusion

The use of a breached corpus such as HIBP is simply one slice of the whole pie of best practice security controls that should be implemented to protect your data and yourself or your users. One should strive to implement NIST 800-63b as completely as is possible for the given situation. Like anything else, it helps provide security in layers: you’re forcing a bad actor to work through all the layers of the ogre, from MFA, to breached credential alerts, to password complexity that makes cracking difficult to impossible (nothing’s impossible to crack, but you get the idea). Feel free to check breached password services with wanton abandon if you’re worried you may have been in a leak; it will not weaken your security posture and only serves as a warning for when you need to reset a password.
