If you’re running a web service where your users have to authenticate, one of your options will probably be classic password authentication. The user provides a term that uniquely identifies them, probably a user name or an email address, and a secret known only to the two parties involved, your service and the user. By presenting the correct secret that both parties agreed upon when the user account was created, the user can prove their identity to the web service.
Plain Text Storage, or: I’m feeling lucky
The naive implementation would be to save the secret password as it is in the data store of your web service. In an ideal world this would be enough from a security standpoint. However, if your service gets compromised, e.g. through a simple SQL injection, an attacker can obtain the users' plain text passwords. This is certainly not desirable, because most users still reuse not only the same password on multiple web sites but also the same user names and email addresses. An attacker could easily automate signing in to other services with the obtained credentials to check for matches.
Unfortunately, some sites still store plain text passwords. If you have forgotten a password and the web service emails you your old password, you should stop using that service, as its operators have no clue about basic security! If you are one of those who still reuse passwords, get yourself a password manager right now! I personally prefer something simple like pass, but any open source password manager like KeePassX will do.
Encrypting Passwords, or: Isn’t crypto always good?
As we don’t want to store plain text passwords, one option would be to simply encrypt them with a standard block cipher algorithm like AES or Blowfish. In that case the application has to keep an encryption key that is either static or dynamic and saved alongside the encrypted password.
But encryption is reversible by design: anyone who holds the correct key can decrypt the password. Because the application needs the key in the clear to do its job, an attacker who has compromised the application can obtain the key as well and decrypt every stored password. So this isn’t a good solution, either.
Cryptographic Hashing, or: We need moar of this crypto stuff!
When we dig deeper into the cryptographic toolbox, we find cryptographic hash functions that transform a string of arbitrary length into one of fixed length, called a hash sum or simply hash. Two important properties of cryptographic hash functions are irreversibility and determinism, which is exactly what we need.
Because hash functions are deterministic, yielding the same output for equal inputs, the application can compare hashes instead of plain text passwords. This way only the hash needs to be saved. Theoretically, two different passwords could result in the same hash, but for modern hash functions the probability of such a collision is negligible.
Commonly used cryptographic hash functions include SHA-256, SHA-512, SHA-3, Skein and Whirlpool. Please note that I didn’t mention deprecated algorithms like MD5 for a reason. They have been horribly broken for years. You should never use them!
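To make the comparison of hashes a bit more concrete, here is a minimal Python sketch using only the standard library. The function names hash_password and check_password are made up for this illustration, and a bare SHA-256 is used only to show determinism and fixed output length, not as a recommendation.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return the SHA-256 hash of the password as a hex string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def check_password(candidate: str, stored_hash: str) -> bool:
    """Hash the candidate and compare the result with the stored hash."""
    # compare_digest compares in constant time to avoid timing side channels.
    return hmac.compare_digest(hash_password(candidate), stored_hash)


# Only the hash is kept; the plain text password is never stored.
stored = hash_password("correct horse battery staple")

print(len(stored))  # always 64 hex characters, whatever the input length
print(check_password("correct horse battery staple", stored))  # True
print(check_password("some wrong guess", stored))  # False
```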
Due to irreversibility, even an attacker who has obtained a hashed password cannot compute the plain text version directly. The only option is to hash candidate passwords until one of them matches. To speed this up, rainbow tables were developed, with which the attacker basically spends disk space to save computing time. Rainbow tables are optimized for quick lookups and cover pre-computed hashes for common passwords or even all combinations of characters up to a given length. Large rainbow tables can take up terabytes of storage.
So if your users have chosen easy passwords, which in this case means fewer than 10 characters or only alphanumeric characters, they are still at risk.
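The idea of pre-computation can be sketched in a few lines of Python. Note that this is a plain lookup table, not a real rainbow table (those use hash chains to save space), and the word list COMMON_PASSWORDS and the helper crack are purely hypothetical.

```python
import hashlib
from typing import Optional


def sha256_hex(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical word list; real lists contain millions of entries.
COMMON_PASSWORDS = ["123456", "password", "qwerty", "letmein", "iloveyou"]

# Computed once, then reusable against every site that stores unsalted SHA-256.
LOOKUP_TABLE = {sha256_hex(p): p for p in COMMON_PASSWORDS}


def crack(stolen_hash: str) -> Optional[str]:
    """Return the plain text password if the hash is in the table, else None."""
    return LOOKUP_TABLE.get(stolen_hash)


print(crack(sha256_hex("letmein")))  # found: the password was in the list
print(crack(sha256_hex("a long and unusual passphrase")))  # None
```

The lookup itself costs next to nothing. All the work went into building the table, and that work only has to be done once.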
Adding Salt and Pepper
But that’s only the beginning. Computing very big rainbow tables pays off for an attacker because, for a given hash function, one rainbow table can be reused against every site the attacker has compromised. The solution is to append a random string, called a salt, to the password before hashing it. This salt should be different for every user and can safely be stored alongside the password hash in the data store. With per-user salts, a pre-computed table is useless, because the attacker would need a separate table for every salt.
Another precaution is to append an application-specific secret, commonly called pepper, so that access to the data store is not enough to generate useful rainbow tables.
Your spices (salt and pepper) should not only contain alphanumeric characters but cover the whole range of possible byte values.
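As a rough sketch of salting and peppering in Python: the PEPPER value, the 16-byte salt length and the function names are assumptions for this example, and a plain SHA-256 is used only to show where the spices go.

```python
import hashlib
import hmac
import secrets

# Hypothetical application-wide pepper. In practice it would be loaded from
# configuration kept outside the data store, not hard-coded like this.
PEPPER = bytes.fromhex("9f3c1e07b2a84d55c6e0f1a2b3c4d5e6")


def new_salt() -> bytes:
    """Return 16 random bytes covering the whole range of byte values."""
    return secrets.token_bytes(16)


def salted_hash(password: str, salt: bytes) -> bytes:
    """Append salt and pepper to the password, then hash the result."""
    return hashlib.sha256(password.encode("utf-8") + salt + PEPPER).digest()


def verify(password: str, salt: bytes, stored: bytes) -> bool:
    return hmac.compare_digest(salted_hash(password, salt), stored)


# Two users with the same password end up with different stored hashes.
salt_a, salt_b = new_salt(), new_salt()
hash_a = salted_hash("letmein", salt_a)
hash_b = salted_hash("letmein", salt_b)

print(hash_a != hash_b)  # True
print(verify("letmein", salt_a, hash_a))  # True
```

The salt is stored next to the hash for each user, while the pepper never touches the data store.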
The solution: Password-based Key Derivation Functions (PBKDF)
Classic cryptographic hash functions are generally used to check messages for integrity, for example to detect whether a message was decrypted with the correct key or whether the result is garbage. As such, they are optimized for speed, which is not desirable for our use case: the faster a hash can be computed, the more password guesses an attacker can try per second. Furthermore, cryptographic hash functions tend to stay in use for a very long time while affordable computing power rises exponentially.
PBKDFs are hash functions for passwords that are designed to work in iterations. The iteration count acts as a kind of work factor that lets you specify how many computing resources you are willing to sacrifice in order to make it harder for an attacker to pre-compute hashes. Generally, the number of logins, signups and password changes is negligible compared to other requests in your web service, so you should be generous when choosing the number of rounds of your PBKDF.
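To get a feeling for the work factor, the following Python sketch times PBKDF2 from the standard library with different iteration counts. PBKDF2-HMAC-SHA256 is chosen here only because it ships with Python. The fixed all-zero salt is there solely to keep the timings comparable and must not be used for real passwords, and the helper names are made up.

```python
import hashlib
import time


def derive(password: str, salt: bytes, iterations: int) -> bytes:
    """Derive a 32-byte hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def time_derivation(iterations: int) -> float:
    """Return the seconds needed for one derivation with the given work factor."""
    fixed_salt = b"\x00" * 16  # benchmark only, never reuse a salt in production
    start = time.perf_counter()
    derive("correct horse battery staple", fixed_salt, iterations)
    return time.perf_counter() - start


# The cost per guess grows roughly linearly with the iteration count.
for rounds in (1_000, 10_000, 100_000):
    print(f"{rounds:>7} iterations: {time_derivation(rounds) * 1000:.1f} ms")
```

The absolute numbers depend entirely on your hardware, which is exactly why the iteration count is a parameter and not a constant.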
Additionally, PBKDFs must also be parametrized with a salt. Please note that as mentioned above, the salt should be sufficiently random and not the same for all users!
You should use a modern PBKDF, like bcrypt or scrypt, with salt and pepper. Implementations for all major programming languages exist. You should experiment with the number of iterations to suit your hardware inventory and paranoia level.
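Putting the pieces together, here is a minimal sketch with scrypt, per-user salt and pepper in Python. It assumes a Python build with OpenSSL 1.1 or newer, which hashlib.scrypt requires. The cost parameters N, R and P, the PEPPER value and the function names are illustrative assumptions, not tuned recommendations.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

# Hypothetical pepper; load it from configuration outside the data store.
PEPPER = bytes.fromhex("9f3c1e07b2a84d55c6e0f1a2b3c4d5e6")

# Assumed cost parameters; experiment with them on your own hardware.
N, R, P = 2**14, 8, 1
MAXMEM = 64 * 1024 * 1024  # allow scrypt up to 64 MiB of memory


def _scrypt(password: str, salt: bytes) -> bytes:
    # The pepper is appended to the password before key derivation.
    return hashlib.scrypt(
        password.encode("utf-8") + PEPPER,
        salt=salt, n=N, r=R, p=P, maxmem=MAXMEM, dklen=32,
    )


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, hash); both are stored in the user's record."""
    salt = secrets.token_bytes(16)
    return salt, _scrypt(password, salt)


def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    return hmac.compare_digest(_scrypt(password, salt), stored)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("some wrong guess", salt, stored))  # False
```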
Since we’re a PHP shop, we should not forget to mention that beginning with version 5.5, PHP has the self-explanatory functions password_hash() and password_verify(), which use bcrypt by default. Please use them. Note that password_hash() already generates a secure salt for you. Don’t generate one yourself.