Rainbow Tables vs. Salting: The Crypto Showdown

Rainbow table attacks work by precomputing a massive database of hash values and their corresponding plaintexts, allowing attackers to quickly look up passwords.

Let’s see this in action. Imagine we have a simple password hashing function, MD5. We want to find the password for a given hash, say b10a8db164e0754105b7a99be72e3fe5.

First, an attacker would generate a large table. This isn’t just a simple list of every possible password and its hash; that would be astronomically large. Instead, rainbow tables use a clever technique involving "chains."

Here’s a simplified breakdown of how a chain works:

Start with a plaintext password: Let’s say "password123".
Hash it: MD5("password123") gives us some hash value.
Apply a reduction function: This function takes a hash and turns it back into a different potential plaintext (it’s not the original, but it looks like one). For example, if the hash is a1b2c3d4..., the reduction function might take the first few characters and convert them to a new password like "abcde".
Hash the new plaintext: MD5("abcde") gives another hash.
Repeat: Apply the reduction function again, hash again, and so on for a fixed number of steps (e.g., 1000 steps).

This creates a "chain" of alternating hashes and reduced plaintexts. A rainbow table stores only the start and end of many such chains.

Now, when an attacker gets a target hash (like b10a8db164e0754105b7a99be72e3fe5), they do this:

Apply the reduction function to the target hash.
Hash the result.
Apply the reduction function again.
Hash the result… (repeat the reduction/hash steps as many times as the chain length).
Check if the final hash is the end of any chain in their table. If it is, they’ve found a match!
If not, they "roll back" the chain: They take the target hash, apply the reduction function, hash, apply reduction, hash… until they reach the beginning of a chain in their table. They then regenerate that specific chain step-by-step until they find the hash that matches the target hash. The plaintext generated just before that matching hash is the original password.

The magic is that many different starting plaintexts can end up in the same chain, and the reduction functions are designed to make it computationally feasible to find these chains. This dramatically reduces the storage and computation required compared to a brute-force dictionary attack.

The problem this solves is the speed at which attackers can crack passwords, especially if they’re using common passwords or simple variations. Traditional brute-force attacks try every single possibility, which is incredibly slow for even moderately complex passwords. Rainbow tables pre-compute a huge amount of work, so the lookup for a given hash is very fast.

The core components are:

Hash Function: The algorithm used to transform the password into a fixed-size string of characters (e.g., MD5, SHA-1, SHA-256).
Reduction Function: A crucial part that maps a hash value back to a potential plaintext password. This function is designed to be different at each step of the chain to avoid collisions and ensure variety.
Chain: A sequence of alternating hash and reduction operations.
Rainbow Table: A database storing the starting point (initial plaintext) and ending point (final hash) of many precomputed chains.

The actual process of generating a rainbow table involves running millions or billions of these chains. For instance, a common rainbow table for MD5 passwords might have chains of length 1000, and the table itself might store the start and end points of 100 billion chains.

The most surprising thing is how the reduction function can be tailored. Instead of just taking a hash and spitting out a random-looking password, a good reduction function will "bias" its output towards common password characters or patterns. For example, if a hash value is 0x1A2B3C4D5E6F, the reduction function might take the lower 32 bits (0x5E6F) and convert them into a short alphanumeric string like aBcD. This ensures that the intermediate plaintexts generated by the reduction function are still somewhat plausible, making the chains more effective at covering the password space.

This is where salting comes in. A salt is a random, unique piece of data added to a password before it’s hashed. So, instead of hashing password, you hash password + salt. Each user gets a different salt.

Here’s why that breaks rainbow tables:

Unique Hashes: Because each password has a unique salt, password123 with saltA will have a different hash than password123 with saltB.
Precomputation is Useless: An attacker would have to precompute a rainbow table for every possible salt. This is impossible because the number of possible salts is astronomically huge. They can’t build a single table that works for all users.
Per-Password Attack: The attacker still has to perform a brute-force or dictionary attack on each individual hashed password, but now they can’t use their precomputed tables. They have to compute hashes on the fly, which is significantly slower.

If Alice’s password was "password123", hashed_password_alice would be hash("password123" + "a3b8c1d9"). If Bob’s password was also "password123", hashed_password_bob would be hash("password123" + "e5f7g3h1").

These two hashes are completely different. A rainbow table built for plain MD5 hashes would be useless against either. The attacker would have to try and guess "password123", append Alice’s salt "a3b8c1d9", hash it, and see if it matches hashed_password_alice. Then repeat for Bob.

The next step for an attacker would be to move from rainbow tables to more advanced precomputation attacks like Bloom filters or specialized hardware.