bcrypt: More Than Just Hashing

bcrypt is a password-hashing function that’s designed to be computationally expensive, making it a strong defense against brute-force attacks.

Let’s see it in action. Imagine a user creating an account with the password "s3cr3tP@ssw0rd". The password is sent to the server, which never stores the plaintext. Instead, it hashes the password with bcrypt and stores the result.

import bcrypt

password = b"s3cr3tP@ssw0rd"
# Generate a salt and hash the password
hashed_password = bcrypt.hashpw(password, bcrypt.gensalt())

print(f"Plaintext password: {password.decode()}")
print(f"Hashed password: {hashed_password.decode()}")

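Running this produces output of the following form. The hash differs on every run because the salt is random, and the value shown here is an illustrative placeholder rather than a real 60-character bcrypt hash:

Plaintext password: s3cr3tP@ssw0rd
Hashed password: $2b$12$z0K6yR7v.T8q7w3.L4s5lO9c1d2e3f4g5h6i7j8k9l0m1n2o3p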

When the user logs in later, they enter "s3cr3tP@ssw0rd" again. The server retrieves the stored hashed_password (which includes the salt) and passes it, together with the entered password, to bcrypt.checkpw():

import bcrypt

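entered_password = b"s3cr3tP@ssw0rd"
# In a real application this value is read from the database. It is generated
# here so the snippet runs on its own, because checkpw only accepts
# well-formed bcrypt hashes.
stored_hashed_password = bcrypt.hashpw(b"s3cr3tP@ssw0rd", bcrypt.gensalt())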

if bcrypt.checkpw(entered_password, stored_hashed_password):
    print("Password matches!")
else:
    print("Password does not match.")

If the entered password is correct, checkpw returns True. If it’s incorrect, it returns False. The trick is that checkpw extracts the salt and cost factor from the stored hash, uses them to hash the entered password, and compares the result with the stored hash.

The core problem bcrypt solves is the insecurity of storing passwords in plaintext, under reversible encryption, or as fast hashes. Before purpose-built password hashes were common, developers often stored passwords in plaintext or hashed them with fast general-purpose functions like MD5 or SHA-1. With plaintext, an attacker who gained access to the database had all the passwords immediately. MD5 and SHA-1 are not reversible, but they are so fast that unsalted hashes can be cracked in bulk with precomputed tables. Even with salted hashes, if the hashing algorithm is too fast, attackers can try billions of passwords per second against a compromised database using specialized hardware (like GPUs). bcrypt was designed specifically to combat this by being deliberately slow.

Internally, bcrypt is based on the Blowfish cipher. It feeds the password and a unique salt into a deliberately expensive version of Blowfish’s key setup, then uses the resulting cipher state to repeatedly encrypt a fixed string. The "cost factor" (the 12 in $2b$12$...) is a base-2 exponent: a cost of 12 means 2^12 = 4,096 iterations of the key setup, and each increment doubles the work. A higher cost factor makes hashing slower and more resistant to brute-force attacks, but it also makes legitimate logins take longer on the server. The salt, embedded within the resulting hash string, ensures that even identical passwords produce different hashes, which defeats precomputed rainbow tables.

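The most surprising thing about bcrypt’s design is that its primary strength comes from its inefficiency. Unlike most cryptographic primitives, which aim for speed, bcrypt is meant to consume significant CPU time per guess, making offline attacks prohibitively expensive. This deliberate slowness is its fundamental security feature. The version prefix has also changed over time. $2a$ is the older, widely deployed prefix. $2x$ and $2y$ were introduced by the crypt_blowfish implementation to distinguish hashes produced before and after a fix to its handling of non-ASCII characters ($2x$ marks the buggy behavior, $2y$ the corrected one). $2b$ was introduced by OpenBSD after a fix to its handling of very long passwords. For new hashes, prefer $2b$: it is the current prefix and the default in the Python bcrypt library. Hashes with the $2y$ prefix, common in PHP applications, are also sound.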
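One way to act on the prefix recommendation is to flag stored hashes that should be regenerated. The sketch below is one possible policy, not a library feature: needs_rehash is a hypothetical helper, the min_cost default of 12 simply mirrors the example in this article, and the hash strings are truncated placeholders. It treats anything other than $2b$ as outdated, which follows the advice above but is stricter than necessary for $2y$.

```python
def needs_rehash(stored_hash: str, min_cost: int = 12) -> bool:
    """Return True if the hash uses a non-$2b$ prefix or a cost below `min_cost`."""
    # Layout: $<version>$<cost>$<salt and digest>
    _, version, cost, _ = stored_hash.split("$")
    return version != "2b" or int(cost) < min_cost


# Truncated placeholder strings; only the prefix and cost are inspected.
print(needs_rehash("$2a$12$..."))  # older prefix
print(needs_rehash("$2b$10$..."))  # current prefix, cost below the minimum
print(needs_rehash("$2b$12$..."))  # current prefix, acceptable cost
```

A hash can only be regenerated when the plaintext is available, so a check like this typically runs right after a successful login.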

The next hurdle you’ll face is understanding how to securely manage and store these generated bcrypt hashes in your database.