Cryptanalysis is the practice of deciphering encrypted messages without knowing the key.

Let’s see cryptanalysis in action. Imagine a simple substitution cipher where each letter is replaced by another.

Original Message: HELLO WORLD
Ciphertext:     KHOOR ZRUOG

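Here, 'H' becomes 'K', 'E' becomes 'H', 'L' becomes 'O', and so on. The key is the mapping: H->K, E->H, L->O, O->R, W->Z, R->U, D->G. Every letter here moves three places forward in the alphabet, so this particular mapping is a Caesar cipher, the simplest kind of substitution cipher.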

The goal of a cryptanalyst is to recover "HELLO WORLD" from "KHOOR ZRUOG" without knowing the mapping. They might start by looking at letter frequencies. In English, 'E' is the most common letter, followed by 'T', 'A', 'O', 'I', 'N'. In the ciphertext "KHOOR ZRUOG", 'O' appears most frequently (3 times). A reasonable first guess is that 'O' in the ciphertext represents 'E' in the original message.

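If we assume O=E:

Ciphertext:    K H O O R   Z R U O G
Assumed Plain: ? ? E E ?   ? ? ? E ?

Now, 'R' appears twice. The second most common letter in English is 'T'. If R=T:

Ciphertext:    K H O O R   Z R U O G
Assumed Plain: ? ? E E T   ? T ? E ?

Neither guess is correct here (ciphertext 'O' is really 'L', and 'R' is really 'O'), because ten letters are far too few for the counts to mirror English. On longer messages the same procedure converges, and a wrong guess is exposed as soon as the partial plaintext stops looking like English.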
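The counting step above can be sketched in a few lines of Python. This is a minimal illustration, not a full frequency-analysis tool; the helper name `letter_frequencies` is made up for this example.

```python
from collections import Counter

def letter_frequencies(text):
    """Count how often each letter occurs, ignoring case, spaces, and punctuation."""
    letters = [ch.upper() for ch in text if ch.isalpha()]
    return Counter(letters)

ciphertext = "KHOOR ZRUOG"
counts = letter_frequencies(ciphertext)

# List letters from most to least frequent, as a cryptanalyst would tabulate them.
for letter, count in counts.most_common():
    print(f"{letter}: {count}")
```

The first line printed is `O: 3`, followed by `R: 2`, which matches the manual count used for the two guesses.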

This is just the beginning. Cryptanalysts use various techniques, from statistical analysis to exploiting mathematical weaknesses in the encryption algorithm. For polyalphabetic ciphers, like the Vigenère cipher, frequency analysis is more complex. You’d first try to determine the key length, often using methods like the Kasiski examination or the index of coincidence. Once the key length is found, the ciphertext can be split into that many interleaved streams, one per key letter. Each stream was shifted by a single key letter, so it can be attacked as its own monoalphabetic (Caesar) cipher.

Consider the Vigenère cipher with the key "KEY" and the plaintext "ATTACKATDAWN".

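Plaintext:  A T T A C K A T D A W N
Key:        K E Y K E Y K E Y K E Y
Ciphertext: K X R K G I K X B K A L

With letters numbered A=0 through Z=25, each ciphertext letter is the plaintext letter plus the key letter, modulo 26. Here, A (0) + K (10) = K (10), T (19) + E (4) = X (23), and T (19) + Y (24) = 43, which is 17 modulo 26, or R.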
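Doing this arithmetic by hand is error-prone, so it helps to let a few lines of Python do it. The sketch below assumes the input contains only uppercase letters A-Z and numbers them A=0 through Z=25; the function names are chosen for this example.

```python
def vigenere_encrypt(plaintext, key):
    """Encrypt uppercase A-Z text with a repeating key (A=0 ... Z=25)."""
    result = []
    for i, ch in enumerate(plaintext):
        p = ord(ch) - ord("A")                   # plaintext letter as a number
        k = ord(key[i % len(key)]) - ord("A")    # key letter, repeating as needed
        result.append(chr((p + k) % 26 + ord("A")))
    return "".join(result)

def vigenere_decrypt(ciphertext, key):
    """Reverse the encryption by subtracting the key letter instead of adding it."""
    result = []
    for i, ch in enumerate(ciphertext):
        c = ord(ch) - ord("A")
        k = ord(key[i % len(key)]) - ord("A")
        result.append(chr((c - k) % 26 + ord("A")))
    return "".join(result)

ciphertext = vigenere_encrypt("ATTACKATDAWN", "KEY")
print(ciphertext)                             # → KXRKGIKXBKAL
print(vigenere_decrypt(ciphertext, "KEY"))    # → ATTACKATDAWN
```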

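A cryptanalyst would try to find the key length. They might notice repetitions in the ciphertext. For example, when ATTACKATDAWN is encrypted with "KEY" the result is KXRKGIKXBKAL, and the pair "KX" appears at positions 0 and 6. The distance of 6 suggests a key length that divides 6 (1, 2, 3, or 6), because a repeated plaintext fragment only produces a repeated ciphertext fragment when it lines up with the same part of the key. This is the idea behind the Kasiski examination.

The index of coincidence measures the probability that two randomly selected letters from a text are the same. For English, it’s around 0.067. For a uniformly random string, it’s around 0.038 (1/26). A monoalphabetic cipher only relabels letters, so its index of coincidence matches that of the plaintext. For a Vigenère cipher, the value for the whole ciphertext falls somewhere in between and drifts toward 0.038 as the key gets longer. To estimate the key length, split the ciphertext into columns for each candidate length and compute the index of coincidence of each column; at the correct length the columns average close to 0.067.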
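Both key-length techniques are short to express in code. The following is a rough sketch, assuming an uppercase A-Z ciphertext (here KXRKGIKXBKAL, which is ATTACKATDAWN under the key KEY); the function names are illustrative.

```python
from collections import Counter, defaultdict

def repeated_ngram_distances(ciphertext, n=2):
    """Map each repeating n-gram to the distances between its successive occurrences."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - n + 1):
        positions[ciphertext[i:i + n]].append(i)
    return {
        gram: [b - a for a, b in zip(pos, pos[1:])]
        for gram, pos in positions.items()
        if len(pos) > 1
    }

def index_of_coincidence(text):
    """Probability that two letters drawn without replacement from text are equal."""
    n = len(text)
    if n < 2:
        return 0.0
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

ciphertext = "KXRKGIKXBKAL"
print(repeated_ngram_distances(ciphertext))   # → {'KX': [6]}
print(round(index_of_coincidence(ciphertext), 3))
```

On a 12-letter sample the index of coincidence is dominated by noise, so the numeric value only becomes meaningful with much longer ciphertexts.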

Once the key length is hypothesized (say, 3 for "KEY"), the ciphertext KXRKGIKXBKAL is broken into columns by position modulo 3:

Column 1: K K K K
Column 2: X G X A
Column 3: R I B L

Each column was encrypted with a single key letter, so it is a Caesar cipher, and we can apply frequency analysis to each column to deduce that letter. In Column 1 the most frequent letter is 'K'. If 'K' corresponds to 'E' (the most common English letter), then K(10) - E(4) = 6, which is 'G', so the first key letter would be 'G'. In Column 2 the most frequent letter is 'X'. If 'X' corresponds to 'E', then X(23) - E(4) = 19, which is 'T'. In Column 3 every letter appears once, so there is no most frequent letter to start from; guessing that 'R' corresponds to 'E' gives R(17) - E(4) = 13, which is 'N'. Hypothesized key: "GTN". This is not "KEY". The plaintext letters behind Column 1 are all 'A', not 'E', and four letters per column is far too small a sample for frequencies to be reliable. This highlights that frequency analysis is a probabilistic method: it needs much more ciphertext, and it benefits from refinement, such as scoring all 26 shifts of each column against the full English frequency distribution rather than matching only the top letter.
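A more robust way to pick each key letter is to try all 26 shifts per column and keep the one whose decrypted letter counts look most like English, for example by a chi-squared score. The sketch below is one way to do that, not the only one. It assumes uppercase A-Z input, a key length no longer than the ciphertext, and an approximate English frequency table; all names are illustrative.

```python
# Approximate English letter frequencies in percent, A through Z.
ENGLISH_FREQ = [8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97,
                0.15, 0.77, 4.03, 2.41, 6.75, 7.51, 1.93, 0.10, 5.99,
                6.33, 9.06, 2.76, 0.98, 2.36, 0.15, 1.97, 0.07]

def split_columns(ciphertext, key_length):
    """Column i holds every letter that was encrypted with key letter i."""
    return [ciphertext[i::key_length] for i in range(key_length)]

def chi_squared(column, shift):
    """Distance between the column decrypted with this shift and English (lower is better)."""
    counts = [0] * 26
    for ch in column:
        counts[(ord(ch) - ord("A") - shift) % 26] += 1
    n = len(column)
    total = 0.0
    for i in range(26):
        expected = n * ENGLISH_FREQ[i] / 100
        total += (counts[i] - expected) ** 2 / expected
    return total

def best_shift(column):
    """Return the shift (0-25) whose decryption scores closest to English."""
    return min(range(26), key=lambda s: chi_squared(column, s))

def recover_key(ciphertext, key_length):
    """Guess one key letter per column."""
    columns = split_columns(ciphertext, key_length)
    return "".join(chr(best_shift(col) + ord("A")) for col in columns)

print(split_columns("KXRKGIKXBKAL", 3))   # → ['KKKK', 'XGXA', 'RIBL']
```

With only four letters per column, whatever `recover_key` returns for this ciphertext should not be trusted; the method becomes dependable once each column holds a few dozen letters or more.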

Modern cryptanalysis often involves computers and advanced mathematics. For instance, linear cryptanalysis involves finding linear approximations between the bits of a plaintext, ciphertext, and key. Differential cryptanalysis examines how differences in plaintext inputs affect differences in ciphertext outputs. For block ciphers like AES, attacks might involve exploiting the structure of the substitution-permutation network.
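To make the idea of differential cryptanalysis a little more concrete, the sketch below builds a difference distribution table for a toy 4-bit S-box. The S-box values are an assumption chosen for illustration and are not the AES S-box; the table simply counts, for every input difference, how often each output difference occurs.

```python
# Illustrative 4-bit S-box (an assumption for this sketch, not taken from AES).
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]

def difference_distribution_table(sbox):
    """ddt[dx][dy] counts the inputs x for which sbox[x] ^ sbox[x ^ dx] == dy."""
    size = len(sbox)
    ddt = [[0] * size for _ in range(size)]
    for dx in range(size):
        for x in range(size):
            dy = sbox[x] ^ sbox[x ^ dx]
            ddt[dx][dy] += 1
    return ddt

ddt = difference_distribution_table(SBOX)

# Large entries mark input differences that lead to one particular
# output difference unusually often.
print(ddt[0xB][0x2])   # → 8
```

Here an input difference of 0xB produces an output difference of 0x2 for 8 of the 16 possible inputs. Biases of this kind, chained across rounds, are what a differential attack tries to exploit.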

One of the more surprising facts about modern cryptanalysis is that even a mathematically strong algorithm can be rendered completely insecure by minor implementation errors or side-channel leaks (such as power consumption or timing variations).
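A classic example of a timing leak is comparing a secret value byte by byte and returning at the first mismatch. The sketch below contrasts such a comparison with Python's `hmac.compare_digest`, which is designed to avoid content-dependent timing. The function names `naive_equal` and `constant_time_equal` are made up for this illustration.

```python
import hmac

def naive_equal(a: bytes, b: bytes) -> bool:
    """Stops at the first mismatch, so run time depends on how many leading bytes match."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True

def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Delegates to hmac.compare_digest to avoid leaking where the inputs differ."""
    return hmac.compare_digest(a, b)

secret = b"correct-token"
print(naive_equal(secret, b"correct-tokem"))          # → False
print(constant_time_equal(secret, b"correct-token"))  # → True
```

Both functions return the same answers. The difference is only in how long they take, which is exactly the kind of information an attacker can measure.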

Here’s a snippet of Python code that demonstrates a simple Caesar cipher (a type of substitution cipher) and how one might attempt to break it using brute force:

def caesar_encrypt(text, shift):
    result = ""
    for char in text:
        if char.isalpha():
            start = ord('a') if char.islower() else ord('A')
            shifted_char = chr((ord(char) - start + shift) % 26 + start)
            result += shifted_char
        else:
            result += char
    return result

def caesar_decrypt(ciphertext, shift):
    return caesar_encrypt(ciphertext, -shift)

def caesar_bruteforce(ciphertext):
    possible_plaintexts = {}
    for shift in range(26):
        possible_plaintexts[shift] = caesar_decrypt(ciphertext, shift)
    return possible_plaintexts

# Example usage:
plaintext = "This is a secret message."
key_shift = 3
encrypted_text = caesar_encrypt(plaintext, key_shift)
print(f"Encrypted: {encrypted_text}")

# Attempt to break it
decrypted_options = caesar_bruteforce(encrypted_text)
for shift, text in decrypted_options.items():
    print(f"Shift {shift}: {text}")

Running this code prints the encrypted text followed by all 26 candidate decryptions, one of which (shift 3) is the original plaintext. The cryptanalyst’s job is to identify the correct decryption from this list, often by looking for meaningful English words or sentence structures.
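Picking the right candidate can itself be automated. The sketch below scores each of the 26 decryptions by how many common English words it contains and keeps the best one. The tiny word list and the helper names are assumptions for this example; a real tool would use a larger dictionary or letter-frequency scoring.

```python
# A deliberately tiny word list, for illustration only.
COMMON_WORDS = {"the", "and", "is", "a", "this", "that", "of", "to", "in", "it"}

def shift_letters(text, shift):
    """Shift every letter by `shift` positions, leaving other characters unchanged."""
    result = []
    for ch in text:
        if ch.isalpha():
            start = ord("a") if ch.islower() else ord("A")
            result.append(chr((ord(ch) - start + shift) % 26 + start))
        else:
            result.append(ch)
    return "".join(result)

def english_score(text):
    """Count how many words in the text appear in the common-word list."""
    words = text.lower().split()
    return sum(1 for w in words if w.strip(".,!?;:") in COMMON_WORDS)

def guess_shift(ciphertext):
    """Return the shift whose decryption contains the most common words."""
    return max(range(26), key=lambda s: english_score(shift_letters(ciphertext, -s)))

ciphertext = "Wklv lv d vhfuhw phvvdjh."
best = guess_shift(ciphertext)
print(best, shift_letters(ciphertext, -best))   # → 3 This is a secret message.
```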

A key challenge in modern cryptanalysis is the sheer computational power required to break well-designed ciphers. For instance, a 128-bit AES key has 2^128 (about 3.4 × 10^38) possible values, and trying them all is far beyond any computing capacity that exists today. Therefore, cryptanalysts focus on finding algorithmic weaknesses that allow for attacks far more efficient than brute force.
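A quick back-of-the-envelope calculation shows the scale involved. The guessing rate below is a purely hypothetical figure picked for illustration, not a measurement of any real machine.

```python
KEY_BITS = 128
KEYS_PER_SECOND = 10**18            # hypothetical rate, chosen only for illustration
SECONDS_PER_YEAR = 365.25 * 24 * 3600

keyspace = 2**KEY_BITS
years_full_search = keyspace / KEYS_PER_SECOND / SECONDS_PER_YEAR
years_on_average = years_full_search / 2   # on average the key turns up halfway through

print(f"Keys to try:       {keyspace:.3e}")
print(f"Full search:       {years_full_search:.2e} years")
print(f"Average-case find: {years_on_average:.2e} years")
```

Even at that assumed rate, an exhaustive search takes on the order of 10^13 years, which is why practical attacks look for structural shortcuts instead.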

The next concept you’ll encounter is public-key cryptography, which uses a pair of keys—one public for encryption and one private for decryption—and the mathematical problems that make it secure, like factoring large numbers.
