Recently, a well-known software and security services company published a blog post titled “Password encryption: What is it and how does it work?”, which immediately raises a number of discussion points and questions. So I figured I’d do a short writeup on the difference between encryption and hashing, why we do not encrypt passwords, and why it’s dangerous to consider encrypting secrets rather than hashing them.
Encryption
Encryption is, by definition, a reversible process, regardless of whether it’s symmetric or asymmetric. In order to be useful, encryption must be reversible: if a given person, say Bob, encrypts a message, the intended recipient has to be able to decrypt it, in the case of symmetric encryption using a shared cryptographic key (see: Diffie-Hellman key exchange, etc.).
Hashing
Hashing is a one-way process, and is explicitly deterministic. That is, a given input must always produce the same hash, regardless of which hashing algorithm is being used. This may take the form of a checksum, which you may have seen published next to a download so you can verify the file has not been modified or tampered with (tamper detection needs a cryptographic hash such as SHA-256; a CRC only catches accidental corruption). This is also, as is obvious given this topic, used to hash passwords for secure storage. This allows a system to verify a password without storing or comparing against the original plaintext. This often takes the form of, at a basic level:
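// md5 keeps the example short; real systems should use a slow password hash such as bcrypt, scrypt, or Argon2
if (md5(password + salt) == user(id).password_hash) {
    allow();
} else {
    disallow();
}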
That is, hashing allows a system to compare a given input against a salted, securely stored secret without storing the secret in some reversible format. In order to determine the plaintext that went into the hashing algorithm, you must find the correct input exhaustively. This takes the form of password cracking using things like word lists, rules, and, in the worst case, brute force. Not rainbow tables though, so help me god: a unique salt per hash makes precomputed tables useless.
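The salted comparison described above can be sketched in runnable form. This is a minimal illustration and not the author's implementation: it swaps in PBKDF2-HMAC-SHA256 from Python's standard library in place of md5, and the function names and the iteration count are assumptions chosen for the example.

```python
import hashlib
import hmac
import secrets

# Assumed work factor for illustration; pick one based on current guidance and your hardware.
ITERATIONS = 200_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for storage. The plaintext itself is never stored."""
    salt = secrets.token_bytes(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Re-hash the login attempt with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored)


# Hypothetical usage: enroll once, then check login attempts against the stored pair.
salt, stored = hash_password("correct horse battery staple")
login_ok = verify_password("correct horse battery staple", salt, stored)
login_bad = verify_password("wrong guess", salt, stored)
```

Because hashing is deterministic, re-hashing the correct password with the same salt reproduces the stored digest, so the system never needs a way to turn the digest back into the password.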
Retrieving the plaintext out of a hash can be considered equivalent to retrieving the individual ingredients from a smoothie once it’s gone through the blender. You need to exhaustively find the components via experimentation (password cracking). You cannot simply retrieve the original ingredients; they’re gone now.
Encryption is a bag of trail mix; hashing is a smoothie.
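To make the "experimentation" concrete, here is a rough sketch of a word-list attack with a few simple rules against one salted hash. Everything in it is made up for illustration: the salt, the target password, the word list, the rules, and the deliberately low iteration count that keeps the demo fast. Real cracking tools are far more elaborate.

```python
import hashlib
import hmac
from typing import Iterable, Iterator, Optional

ITERATIONS = 1_000  # deliberately low so the demo runs quickly; not a recommendation


def hash_candidate(word: str, salt: bytes) -> bytes:
    """Hash one guess the same way the (hypothetical) target system would."""
    return hashlib.pbkdf2_hmac("sha256", word.encode("utf-8"), salt, ITERATIONS)


def apply_rules(words: Iterable[str]) -> Iterator[str]:
    """Expand each base word with a few simple mangling rules."""
    for word in words:
        yield word
        yield word.capitalize()
        yield word + "1"


def crack(target: bytes, salt: bytes, candidates: Iterable[str]) -> Optional[str]:
    """Hash every candidate in turn; return the one that matches, or None."""
    for word in candidates:
        if hmac.compare_digest(hash_candidate(word, salt), target):
            return word
    return None


salt = bytes.fromhex("00112233445566778899aabbccddeeff")  # made-up salt from a leaked record
target = hash_candidate("sunshine1", salt)                # stands in for the leaked hash
base_words = ["password", "letmein", "sunshine"]          # made-up word list

found = crack(target, salt, apply_rules(base_words))
missed = crack(target, salt, apply_rules(["dragon", "monkey"]))
```

The attacker never reverses anything. They keep blending new smoothies until one matches, and a password that is not reachable from their word list and rules is simply not found.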
Reversible encryption risks for secrets
The primary issue with encrypting secrets rather than hashing them is that it becomes possible for a bad actor to obtain, or brute-force, the key that was used to encrypt the data. This can take the form of harvesting the key off some machine, cloud bucket, or other storage medium, or simply brute-forcing it in the case of a weak implementation (usually the former, though).
This is a problem because it allows a bad actor to immediately determine the plaintext of every secret, rather than needing to exhaustively determine the plaintext for each hash via cracking. If the key used to encrypt the data is ever discovered or leaked, you have no security.
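A small sketch shows how one leaked key opens every record at once. The cipher below is a toy built from SHA-256 purely so the example runs on the standard library; it is not a secure cipher and not something to deploy. The user names, passwords, and `server_key` are all hypothetical.

```python
import hashlib
import secrets


def xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with SHA-256(key || nonce || counter) blocks.

    The same call both encrypts and decrypts. Illustration only, NOT secure.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


server_key = secrets.token_bytes(32)  # the single key protecting every record

passwords = {
    "alice": "correct horse battery",
    "bob": "Tr0ub4dor&3",
    "carol": "hunter2",
}

# What the service stores: a nonce and ciphertext per user, all under one key.
table = {}
for user, password in passwords.items():
    nonce = secrets.token_bytes(16)
    table[user] = (nonce, xor_stream(server_key, nonce, password.encode("utf-8")))

# An attacker who obtains the table and the key needs no guessing at all.
leaked_key = server_key
recovered = {
    user: xor_stream(leaked_key, nonce, ciphertext).decode("utf-8")
    for user, (nonce, ciphertext) in table.items()
}
```

Recovery here is a single pass over the table, however strong the individual passwords are. With hashes, the same attacker would have to run a cracking loop per record.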
In the case of hashing, there is no single secret to leak. There are the individual secrets, which as we’ve seen can be leaked through various methods like stealer malware; this is why it’s so crucial to never reuse passwords across services. However, each individual secret stands on its own: an attacker must crack each specific hash, costing time, compute resources, and so on. It is not possible to simply reveal all secrets given some known value.
Specificity is important
When writing security guidance for various organizations and users to consume, specificity is important. An end user may not know the difference between encryption and hashing, and might make misguided decisions based on partial information.