Authentication Best Practices: Password Hashing and Storage
Key Insights
- Modern password hashing algorithms like bcrypt and Argon2id are specifically designed to be slow, making brute-force attacks computationally expensive—this is a feature, not a bug.
- Salts must be unique per password and cryptographically random; modern libraries handle this automatically, so use them instead of rolling your own implementation.
- Password hash migration should happen transparently during login, allowing you to upgrade security without forcing mass password resets.
Introduction: Why Password Security Matters
In 2012, LinkedIn suffered a breach that exposed 6.5 million password hashes. Because the hashes were unsalted SHA-1, attackers reportedly cracked 90% of them within days. The 2013 Adobe breach was worse: 153 million passwords were stored with reversible encryption, and identical passwords produced identical ciphertexts, which made pattern analysis trivial.
These aren’t ancient history lessons. Credential stuffing attacks—where attackers use leaked username/password pairs against other services—account for billions of login attempts daily. When users reuse passwords (and they do), your weak password storage becomes everyone’s problem.
Getting password security right isn’t optional. It’s table stakes for any application handling user authentication.
The Cardinal Rule: Never Store Plain Text
This should be obvious, but breaches continue to prove otherwise. Plain text password storage means a single database compromise exposes every user credential instantly. Reversible encryption isn’t much better—if your application can decrypt passwords, so can an attacker who compromises your system.
Here’s what you should never do:
# NEVER DO THIS - Plain text storage
def create_user(username: str, password: str):
    cursor.execute(
        "INSERT INTO users (username, password) VALUES (?, ?)",
        (username, password)
    )

# NEVER DO THIS - Reversible encryption
from cryptography.fernet import Fernet

def create_user(username: str, password: str):
    key = load_encryption_key()  # If attackers get the DB, they'll get this too
    encrypted = Fernet(key).encrypt(password.encode())
    cursor.execute(
        "INSERT INTO users (username, password) VALUES (?, ?)",
        (username, encrypted)
    )
Both approaches fail the fundamental test: if an attacker gains database access, they shouldn’t be able to recover passwords. Period.
Understanding Hashing vs. Encryption
Encryption is a two-way operation. You encrypt data with a key and decrypt it with the same (or related) key. This is appropriate for data you need to read later—credit card numbers for recurring billing, for example.
Hashing is a one-way operation. You transform input into a fixed-size output, and there’s no practical way to reverse it: the only way to check a value is to hash the same input and compare the outputs.
import hashlib

# Hashing is one-way
password = "mysecretpassword"
hashed = hashlib.sha256(password.encode()).hexdigest()
print(f"Hash: {hashed}")
# Output: "Hash: " followed by 64 hex characters

# There's no unhash() function - you can only verify
def verify(input_password: str, stored_hash: str) -> bool:
    return hashlib.sha256(input_password.encode()).hexdigest() == stored_hash
For passwords, you never need the original value. You only need to verify that a user-provided password matches what they registered with. This makes hashing the correct choice.
However, not all hash functions are suitable for passwords. General-purpose hashes like SHA-256 are designed to be fast—great for file integrity checks, terrible for passwords. An attacker with a modern GPU can compute billions of SHA-256 hashes per second.
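To get a feel for the gap on your own machine, here is a rough benchmark sketch. It uses only the standard library, with hashlib.scrypt standing in for a purpose-built password hash. The helper name hashes_per_second and the scrypt parameters are illustrative assumptions, not tuned recommendations, and a CPU benchmark in Python understates what a GPU can do with SHA-256.

```python
import hashlib
import time

def hashes_per_second(hash_once, duration: float = 0.2) -> float:
    """Call hash_once repeatedly for roughly `duration` seconds and return the rate."""
    count = 0
    start = time.perf_counter()
    while time.perf_counter() - start < duration:
        hash_once()
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed

password = b"mysecretpassword"
salt = b"0123456789abcdef"  # fixed salt, acceptable only because this is a benchmark

# General-purpose hash: built to be fast
fast_rate = hashes_per_second(lambda: hashlib.sha256(password).digest())

# Password hash: built to be slow and memory-hungry (illustrative parameters)
slow_rate = hashes_per_second(
    lambda: hashlib.scrypt(
        password, salt=salt, n=2**14, r=8, p=1, maxmem=64 * 1024 * 1024
    )
)

print(f"SHA-256: {fast_rate:,.0f} hashes/second")
print(f"scrypt:  {slow_rate:,.1f} hashes/second")
```

The absolute numbers depend on your hardware, but the SHA-256 rate should come out several orders of magnitude higher.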
Choosing the Right Algorithm: bcrypt, Argon2, and scrypt
Password hashing algorithms are intentionally slow and resource-intensive. This asymmetry is the key: legitimate authentication happens once per login (milliseconds are fine), but attackers need to compute millions of hashes (slowness kills them).
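A quick back-of-the-envelope calculation makes the asymmetry concrete. The figures below are assumptions chosen for illustration: an 8-character password drawn from lowercase letters and digits, an attacker managing 10 billion guesses per second against a fast hash, and a slow hash costing 250 ms per guess on a single guessing stream.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def seconds_to_exhaust(keyspace: int, guesses_per_second: float) -> float:
    """Worst-case time to try every candidate password once."""
    return keyspace / guesses_per_second

# Assumed password policy: 8 characters from a-z and 0-9
keyspace = 36 ** 8

# Assumed attacker speed against a fast hash such as SHA-256
fast_seconds = seconds_to_exhaust(keyspace, 10_000_000_000)

# Assumed cost of a slow hash: 250 ms per guess, i.e. 4 guesses per second
slow_seconds = seconds_to_exhaust(keyspace, 4)

print(f"Fast hash: about {fast_seconds / 60:.1f} minutes")            # roughly 4.7 minutes
print(f"Slow hash: about {slow_seconds / SECONDS_PER_YEAR:,.0f} years")  # roughly 22,000 years
```

A real attacker would run many guessing streams in parallel, which shrinks the second number, but the ratio between the two stays the same. The legitimate user pays the 250 ms once per login.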
bcrypt has been the gold standard since 1999. It includes a configurable work factor, and each increment doubles the computational cost. It’s battle-tested, with implementations available for virtually every mainstream language.
Argon2 won the Password Hashing Competition in 2015. Argon2id (the recommended variant) is both memory-hard and resistant to side-channel attacks. It’s the modern choice for new applications.
scrypt is memory-hard like Argon2 but predates it. It’s solid but Argon2 is generally preferred for new implementations.
MD5, SHA-1, SHA-256: Never use these for passwords. They’re too fast and weren’t designed for this purpose.
Here’s bcrypt in both Python and Node.js:
# Python with bcrypt
import bcrypt

def hash_password(password: str, rounds: int = 12) -> bytes:
    """
    Hash a password with bcrypt.
    rounds=12 means 2^12 iterations (~250ms on modern hardware)
    Increase rounds as hardware gets faster.
    """
    salt = bcrypt.gensalt(rounds=rounds)
    return bcrypt.hashpw(password.encode(), salt)

def verify_password(password: str, hashed: bytes) -> bool:
    return bcrypt.checkpw(password.encode(), hashed)

# Usage
hashed = hash_password("user_password")
# Store 'hashed' in database (it's bytes, decode to str if needed)
is_valid = verify_password("user_password", hashed)  # True
// Node.js with bcrypt
const bcrypt = require('bcrypt');

async function hashPassword(password, rounds = 12) {
  // rounds=12 provides good security/performance balance
  // Each increment doubles the computation time
  return bcrypt.hash(password, rounds);
}

async function verifyPassword(password, hash) {
  return bcrypt.compare(password, hash);
}

// Usage (wrapped in an async function because CommonJS modules
// don't support top-level await)
async function main() {
  const hash = await hashPassword('user_password');
  // Store hash in database
  const isValid = await verifyPassword('user_password', hash); // true
}

main().catch(console.error);
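The advice to increase rounds as hardware gets faster can be turned into a small calibration step: time one hash at each cost and pick the lowest cost that meets your latency target. The sketch below is one way to do that. To stay standard-library only, it uses PBKDF2 with 2**cost iterations as a stand-in that mimics bcrypt's doubling behaviour. The names calibrate_cost and pbkdf2_at_cost and the 50 ms target are assumptions for illustration.

```python
import hashlib
import time

def pbkdf2_at_cost(cost: int) -> None:
    """Stand-in slow hash: 2**cost PBKDF2 iterations, doubling per step like bcrypt."""
    hashlib.pbkdf2_hmac("sha256", b"benchmark-password", b"benchmark-salt-16", 2 ** cost)

def calibrate_cost(hash_at_cost, target_seconds: float, start: int, maximum: int) -> int:
    """
    Return the lowest cost in [start, maximum] whose single hash takes at
    least target_seconds, or maximum if no cost in the range is slow enough.
    """
    for cost in range(start, maximum + 1):
        began = time.perf_counter()
        hash_at_cost(cost)
        if time.perf_counter() - began >= target_seconds:
            return cost
    return maximum

cost = calibrate_cost(pbkdf2_at_cost, target_seconds=0.05, start=10, maximum=22)
print(f"Lowest cost meeting the 50 ms target on this machine: {cost}")
```

With the bcrypt library you would pass a callable along the lines of `lambda rounds: bcrypt.hashpw(b"benchmark", bcrypt.gensalt(rounds=rounds))` instead. A single timing sample is noisy, so treat the result as a starting point and re-run it on your production hardware.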
For Argon2id, the recommended parameters from OWASP are: memory cost of 19 MiB, iteration count of 2, and parallelism of 1:
# Python with argon2-cffi
from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError

# Configure with OWASP-recommended parameters
ph = PasswordHasher(
    time_cost=2,        # Number of iterations
    memory_cost=19456,  # 19 MiB in KiB
    parallelism=1,      # Threads
    hash_len=32,        # Output length
    salt_len=16         # Salt length
)

def hash_password(password: str) -> str:
    return ph.hash(password)

def verify_password(password: str, hash: str) -> bool:
    try:
        ph.verify(hash, password)
        return True
    except VerifyMismatchError:
        return False

# Usage
hashed = hash_password("user_password")
# Stores as: $argon2id$v=19$m=19456,t=2,p=1$salt$hash
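For completeness, scrypt is also reachable without third-party packages, since Python's hashlib exposes it directly. The following is a minimal sketch, not a vetted implementation. The function names, the parameter values, and the scrypt$n$r$p$salt$hash storage string are assumptions made for this example. Unlike the bcrypt and Argon2 outputs, that storage format is ad hoc rather than a standard.

```python
import base64
import hashlib
import hmac
import os

# Illustrative parameters; check current guidance and benchmark on your hardware.
N, R, P = 2**14, 8, 1
MAXMEM = 64 * 1024 * 1024  # upper bound on memory that hashlib.scrypt may use

def hash_password_scrypt(password: str) -> str:
    salt = os.urandom(16)  # unique, cryptographically random salt per password
    digest = hashlib.scrypt(
        password.encode(), salt=salt, n=N, r=R, p=P, maxmem=MAXMEM, dklen=32
    )
    # Ad hoc storage format (an assumption, not a standard): scrypt$n$r$p$salt$hash
    return "$".join([
        "scrypt", str(N), str(R), str(P),
        base64.b64encode(salt).decode(),
        base64.b64encode(digest).decode(),
    ])

def verify_password_scrypt(password: str, stored: str) -> bool:
    # Read the parameters back from the stored string so old hashes keep verifying
    _, n, r, p, salt_b64, digest_b64 = stored.split("$")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    candidate = hashlib.scrypt(
        password.encode(), salt=salt, n=int(n), r=int(r), p=int(p),
        maxmem=MAXMEM, dklen=len(expected),
    )
    return hmac.compare_digest(candidate, expected)  # constant-time comparison

stored = hash_password_scrypt("user_password")
print(verify_password_scrypt("user_password", stored))
print(verify_password_scrypt("wrong_password", stored))
```

Storing the parameters next to the salt and hash is what lets you raise them later without invalidating existing records.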
Salting: Defeating Rainbow Tables
A rainbow table is a precomputed lookup of hash values to passwords. Without salts, identical passwords produce identical hashes, making rainbow tables devastating.
A salt is random data added to each password before hashing. Even if two users have the same password, their hashes differ because their salts differ.
Critical salt requirements:
- Unique per password: Never reuse salts
- Cryptographically random: Use secure random generators
- Stored with the hash: You need it for verification
The good news: bcrypt and Argon2 handle salting automatically. The salt is embedded in the output string.
import bcrypt
import hashlib
import os

# MANUAL APPROACH (educational - don't do this in production)
def manual_salt_example():
    password = "mysecretpassword"
    # Generate cryptographically secure salt
    salt = os.urandom(16)
    # Combine and hash (simplified - real implementation is more complex)
    salted = salt + password.encode()
    hashed = hashlib.sha256(salted).digest()
    # Must store both salt and hash
    return salt + hashed  # Concatenate for storage

# CORRECT APPROACH - let the library handle it
def proper_approach():
    password = "mysecretpassword"
    # bcrypt generates salt internally and embeds it in output
    hashed = bcrypt.hashpw(password.encode(), bcrypt.gensalt())
    # The output contains: algorithm + cost + salt + hash
    # Example: $2b$12$LQv3c1yqBWVHxkd0LHAkCOYz6TtxMQJqhN8/X4.V4rR9zdE
    #   $2b$                  -> algorithm version
    #   12$                   -> cost
    #   next 22 characters    -> salt
    #   remaining characters  -> hash (shortened in this example;
    #                            a full bcrypt string is 60 characters)
    return hashed  # Salt is included, no separate storage needed
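To see why unique salts matter, the sketch below plays out the attack in miniature. A plain dictionary stands in for a rainbow table (a simplification, since real rainbow tables use hash chains to save space), and SHA-256 is used only to keep the demo fast, mirroring the educational manual approach rather than anything you would deploy. The function names and the three-word password list are made up for illustration.

```python
import hashlib
import os

def unsalted(password: str) -> bytes:
    return hashlib.sha256(password.encode()).digest()

def salted(password: str, salt: bytes) -> bytes:
    return hashlib.sha256(salt + password.encode()).digest()

# The attacker precomputes hashes of common passwords once and reuses them
# against every leaked database.
precomputed = {unsalted(p): p for p in ["123456", "password", "letmein"]}

# Without salts: two users with the same password share a hash,
# and a single lookup recovers it.
alice_unsalted = unsalted("password")
bob_unsalted = unsalted("password")
print(alice_unsalted == bob_unsalted)   # True
print(precomputed.get(alice_unsalted))  # password

# With per-user random salts: same password, different stored hashes,
# and the precomputed table no longer matches anything.
alice_salt, bob_salt = os.urandom(16), os.urandom(16)
alice_salted = salted("password", alice_salt)
bob_salted = salted("password", bob_salt)
print(alice_salted == bob_salted)       # False
print(precomputed.get(alice_salted))    # None
```

Verification still works because the salt is stored next to the hash: recomputing salted("password", alice_salt) reproduces alice_salted exactly.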
Implementation Patterns and Common Pitfalls
Timing-Safe Comparison
String comparison typically short-circuits on the first mismatched character. This creates a timing side-channel that can leak information about the hash. Always use constant-time comparison:
import hmac
import bcrypt

def verify_password_secure(password: str, stored_hash: bytes) -> bool:
    """
    bcrypt.checkpw is already timing-safe, but here's the principle
    for when you need to compare hashes manually.
    """
    # For bcrypt, just use the library
    return bcrypt.checkpw(password.encode(), stored_hash)

def compare_hashes_manual(hash1: bytes, hash2: bytes) -> bool:
    """
    Constant-time comparison for when you're comparing raw hashes.
    Takes the same time regardless of where hashes differ.
    """
    return hmac.compare_digest(hash1, hash2)
const crypto = require('crypto');

function compareHashesSafe(hash1, hash2) {
  // Constant-time comparison
  const buf1 = Buffer.from(hash1);
  const buf2 = Buffer.from(hash2);
  // timingSafeEqual throws if the lengths differ, so check first
  if (buf1.length !== buf2.length) {
    return false;
  }
  return crypto.timingSafeEqual(buf1, buf2);
}
Upgrading Hashes Transparently
When migrating from weak algorithms (or increasing work factors), upgrade hashes during login when you have the plaintext password:
import bcrypt
from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError

ph = PasswordHasher()

def authenticate_and_upgrade(username: str, password: str) -> bool:
    user = get_user(username)
    if not user:
        # Prevent timing attacks - do a dummy hash so unknown
        # usernames take about as long as known ones
        bcrypt.hashpw(b"dummy", bcrypt.gensalt())
        return False

    stored_hash = user.password_hash

    # Check if it's a legacy bcrypt hash
    if stored_hash.startswith(('$2b$', '$2a$')):
        # Verify with bcrypt
        if not bcrypt.checkpw(password.encode(), stored_hash.encode()):
            return False
        # Upgrade to Argon2id
        new_hash = ph.hash(password)
        update_user_hash(username, new_hash)
        return True

    # Modern Argon2 hash
    try:
        ph.verify(stored_hash, password)
        # Check if rehash needed (parameters changed)
        if ph.check_needs_rehash(stored_hash):
            new_hash = ph.hash(password)
            update_user_hash(username, new_hash)
        return True
    except VerifyMismatchError:
        return False
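The same pattern applies when the legacy scheme is a genuinely weak one. As a self-contained sketch, the example below migrates unsalted SHA-1 records to scrypt on successful login, using only the standard library and an in-memory dict in place of a database. The record layout (scheme, salt, hash), the helper names, and the scrypt parameters are all assumptions for illustration. The dummy-hash step for unknown users is omitted to keep it short.

```python
import hashlib
import hmac
import os

# Illustrative scrypt parameters, not a tuned recommendation
SCRYPT_PARAMS = dict(n=2**14, r=8, p=1, maxmem=64 * 1024 * 1024, dklen=32)

def scrypt_record(password: str) -> dict:
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, **SCRYPT_PARAMS)
    return {"scheme": "scrypt", "salt": salt, "hash": digest}

def authenticate_and_upgrade(users: dict, username: str, password: str) -> bool:
    record = users.get(username)
    if record is None:
        return False

    if record["scheme"] == "sha1":
        # Legacy path: verify against the old unsalted SHA-1 hash
        candidate = hashlib.sha1(password.encode()).digest()
        if not hmac.compare_digest(candidate, record["hash"]):
            return False
        # The plaintext is available right now, so replace the weak record
        users[username] = scrypt_record(password)
        return True

    # Modern path: verify against the salted scrypt hash
    candidate = hashlib.scrypt(password.encode(), salt=record["salt"], **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, record["hash"])

# Hypothetical legacy user whose password was stored as unsalted SHA-1
users = {
    "alice": {"scheme": "sha1", "salt": b"", "hash": hashlib.sha1(b"hunter2").digest()}
}

print(authenticate_and_upgrade(users, "alice", "hunter2"))  # True
print(users["alice"]["scheme"])                             # scrypt
```

Accounts that never log in again keep their weak hash, so this kind of migration is usually paired with a plan for dormant accounts, such as forcing a reset after a cutoff date.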
Error Handling Without Information Leakage
Never reveal whether a username exists through different error messages or timing:
import bcrypt

# get_user, verify_password, and create_session are application
# helpers defined elsewhere
class AuthenticationError(Exception):
    pass

def login(username: str, password: str) -> dict:
    user = get_user(username)
    if user is None:
        # Do a dummy hash to prevent timing attacks
        bcrypt.hashpw(b"dummy", bcrypt.gensalt())
        # Same error message as wrong password
        raise AuthenticationError("Invalid username or password")
    if not verify_password(password, user.password_hash):
        raise AuthenticationError("Invalid username or password")
    return create_session(user)
Beyond Storage: Complementary Security Measures
Proper password hashing is necessary but not sufficient. Layer these additional protections:
Rate limiting: Limit login attempts per IP and per account. Five failed attempts in 15 minutes should trigger a delay or CAPTCHA.
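As a rough illustration of the five-failures-in-15-minutes rule, here is a sliding-window counter sketch. The class name FailedLoginLimiter and its methods are hypothetical, and the state lives in process memory. A real deployment with several application servers would more likely keep these counters in a shared store.

```python
import time
from collections import defaultdict, deque

class FailedLoginLimiter:
    """Tracks recent failed logins per key (an account name or an IP address)."""

    def __init__(self, max_failures: int = 5, window_seconds: float = 15 * 60,
                 clock=time.monotonic):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the window can be tested without waiting
        self._failures = defaultdict(deque)

    def _prune(self, key: str) -> deque:
        # Drop failures that have aged out of the window
        cutoff = self.clock() - self.window_seconds
        failures = self._failures[key]
        while failures and failures[0] <= cutoff:
            failures.popleft()
        return failures

    def record_failure(self, key: str) -> None:
        self._prune(key).append(self.clock())

    def is_blocked(self, key: str) -> bool:
        return len(self._prune(key)) >= self.max_failures

limiter = FailedLoginLimiter()
for _ in range(5):
    limiter.record_failure("account:alice")

print(limiter.is_blocked("account:alice"))  # True
print(limiter.is_blocked("account:bob"))    # False
```

Call is_blocked before verifying the password and record_failure after a failed attempt, once keyed by account and once keyed by IP. When a key is blocked, responding with a delay or CAPTCHA rather than a hard rejection limits the damage an attacker can do by deliberately locking someone out.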
Account lockout: Temporary lockouts after repeated failures. Be careful not to enable denial-of-service against legitimate users.
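Breach detection: Check passwords against known breaches using the HaveIBeenPwned Pwned Passwords API. Its range endpoint uses k-anonymity: you send only the first five characters of the password’s SHA-1 hash and match the returned suffixes locally, so the full hash never leaves your server.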
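The local half of a k-anonymity lookup can be sketched as follows. The function names are hypothetical and the HTTP request is deliberately left out. The sketch assumes you fetch https://api.pwnedpasswords.com/range/ followed by the five-character prefix and pass the response body in as text, one SUFFIX:COUNT pair per line. The sample response below is made up for the demo.

```python
import hashlib

def split_sha1_for_range_query(password: str) -> tuple[str, str]:
    """Return (prefix, suffix): only the 5-character prefix is sent to the API."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]

def breach_count(suffix: str, range_response: str) -> int:
    """Look for our suffix among the SUFFIX:COUNT lines returned for the prefix."""
    for line in range_response.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0  # suffix absent: the password is not in the breach corpus

prefix, suffix = split_sha1_for_range_query("password")
print(prefix)  # 5BAA6

# Made-up response body for illustration; real counts come from the API
sample_response = "\n".join([
    "0" * 35 + ":3",
    suffix + ":42",
    "F" * 35 + ":1",
])
print(breach_count(suffix, sample_response))    # 42
print(breach_count("A" * 35, sample_response))  # 0
```

Because the service only ever sees the prefix, it cannot tell which of the many hashes sharing that prefix you were checking.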
Passwordless options: For high-security applications, consider WebAuthn/passkeys, magic links, or OAuth delegation. The most secure password is one that doesn’t exist.
Password security isn’t glamorous, but it’s foundational. Use bcrypt or Argon2id, let your library handle salting, upgrade hashes transparently, and layer additional protections. Your users are trusting you with their credentials—don’t let them down.