2. Classical Cryptosystems
2.1. Overview of Classical Cryptosystems
We start with some classical cryptosystems. It’s useful to group these into a few different types. To do this, let’s introduce some more terminology before getting started. An \(n\)-gram is a sequence of \(n\) letters: a 1-gram is just a single letter, a 2-gram (also called a bigram) is a pair of letters, and so forth. The encryption strategies we’ll encounter below are the following:
Transposition involves rearranging units of plaintext according to some pattern. We’ll see just one example of this type of cipher below: rectangular transposition.
Substitition involves replacing units of plaintext with units of ciphertext. Substitution ciphers can be grouped further into some subtypes:
Simple substitution: In these ciphers, single letters of plaintext are replaced by ciphertext. The substitution scheme stays the same over the course of the entire message. We’ve already seen a simple substitution above: Trithemius’s Theban alphabet. The simple substitition ciphers we’ll meet below are:
- Masonic cipher.
- Caesar cipher.
- Affine cipher.
- Polybius square.
Polygraphic substitution: In these ciphers, groups of letters in the plaintext are replaced by ciphertext (a group of \(n\) letters is called an “\(n\)-gram”). The substitution scheme stays fixed over the entire message. The polygraphic substitition ciphers we’ll meet below are:
- Hill cipher.
- Playfair cipher.
Polyalphabetic substitution: In these ciphers, single letters in the plaintext are replaced by ciphertext, and the substitution scheme changes over the course of the message. The polyalphabetic substitition ciphers we’ll meet below are:
- Vignère cipher.
- One-time pad.
Most cryptosystems used in practice employ a combination of these strategies. For example, we’ll meet the ADFGVX cipher below; this was a cipher used by the Imperial German Army during World War I, and it combined a simple substitution (a Polybius square) with rectangular transposition.
2.2. Rectangular Transposition
Rectangular transposition (also called regular columnar transposition) is a transposition cipher: the ciphertext is obtained by permuting the letters of the plaintext in a particular pattern. The pattern is determined by a secret keyword. The strategy is best explained through example.
Suppose that Alice and Bob share the keyword GUARD
, and
that Alice wants to send the following message to Bob:
Hide! The baboons are coming for you.
As a first preliminary step, she removes spaces and punctuation and capitalizes all letters:
HIDETHEBABOONSARECOMINGFORYOU
This is the “encoded” message; notice that this is not yet remotely
secure! It must still be encrypted. Since the keyword GUARD
has 5 letters, she breaks up the message into 5-grams and stacks them
into rows:
HIDET
HEBAB
OONSA
RECOM
INGFO
RYOU
She then inserts nonsense letters at the end of the message to fill out any missing spaces:
HIDET
HEBAB
OONSA
RECOM
INGFO
RYOUQ
Now she rearranges the letters in each row. Observe that the
alphabetical rankings of the letters of the keyword GUARD
are 3, 5, 1, 4, 2. So she rearranges the letters in each row by putting
the first letter of each row in the 3rd position, the second letter of
each row in the 5th position, the third letter of each row in the 1st
position, the fourth letter of each row in the 4th position, and the
fifth letter of each row in the 2nd position.
DTHEI
BBHAE
NAOSO
CMROE
GOIFN
OQRUY
She then undoes the stacking to get the ciphertext:
DTHEIBBHAENAOSOCMROEGOIFNOQRUY
This ciphertext is then sent to Bob, who can undo this same process
using the keyword to recover the plaintext. Let’s work this out. Bob
first takes the letters of the ciphertext and stacks them into rows of
5, since GUARD
has 5 letters:
DTHEI
BBHAE
NAOSO
CMROE
GOIFN
OQRUY
Bob also knows that the alphabetical ranking of the letters of
GUARD
is 3, 5, 1, 4, 2. This means that, to decrypt the
message, he has to move the 3rd letter of each row to the first
position, the 5th letter to the second position, the 1st letter to the
third position, the 4th letter to the fourth position, and the 2nd
letter to the fifth position.
HIDET
HEBAB
OONSA
RECOM
INGFO
RYOUQ
He then undoes the stacking:
HIDETHEBABOONSARECOMINGFORYOUQ
He then has to make an educated guess that the letter Q
is nonsense and that the correctly punctuated message must be
Hide! The baboons are coming for you.
Now you try it!
Exercise Suppose Alice and Bob share the
keyword CRASH
. Mimic the process above to do the
following:
- Encrypt the message
There is always hope
. - Decrypt the message
ETIHGFREAFRSLAESOXOE
.
Once you have a hang of doing this by hand with short examples, here’s a SageCell you can play with to encrypt and decrypt longer messages using rectangular transposition.
Sage Exercise What does the SageCell above
do when you try setting your key word to be something that has a
repeated letter? Here’s something different that that would be
reasonable for the code to do when it encounters this situation: it
could decide to drop any repeats from the key word before proceeding
with the encryption. For example, if the user enters
BANANA
, it would instead use BAN
to encrypt
the message. Edit the code above so that this happens!
2.3. Masonic Cipher
The masonic cipher (also called the pigpen cipher or the tic-tac-toe cipher) is a simple substitution cipher that replaces individual letters with certain geometric shapes. It is called the masonic cipher because the Freemasons used it very frequently, but it likely originated with Hebrew rabbis. The encryption and decryption scheme of this cipher is described by the following diagram.
Each letter of the alphabet is replaced by the shape that encloses
that letter (possibly including the dot). For example, A
would be replaced by
and Q
would be replaced by
and U
would be replaced by
.
Exercise Encrypt the following messages using the masonic cipher:
MEET AT GEISEL
BONFIRE AT SUNSET
A very similar substitution cipher that replaces letters of the alphabet by geometric shapes is the Knights Templar Cipher.
Under the attack model we described in the introduction, this cryptosystem is extremely insecure. Recall that we typically assume that the adversary knows the decryption function, but does not know the key. For the masonic cipher, there is only a decryption function and no key! This means the adversary, who knows that a message was encrypted using a masonic cipher, can recover the plaintext just as easily as the intended recipient!
From this point of view, a slightly better simple substitution cipher is the one we discuss next, which at least does have a key.
2.4. Caesar Cipher
The Caesar cipher (also called a shift cipher) is
also a simple substitution cipher. The key for a Caesar cipher is a
shift, which can be any integer. For example, if we choose a shift of 3,
every letter gets shifted forward by 3 during encryption: A
becomes D
, B
becomes E
, and so
forth. What happens with X
? There is no letter that is 3
after X
, but we simply cycle back to beginning so that
X
becomes A
, Y
becomes
B
, and Z
becomes C
. This is
reversed during decryption.
Play with the slider to try out a few different shifts. What’s going on with a shift of 0? What about with a shift of 26? How does a shift of 27 compare with a shift of 1? What do negative shifts mean? What positive shift is a shift of \(-1\) the same as?
As with rectangular transposition, it is typical to first encode the message by removing all non-alphabetic characters and capitalizing everything. Thus a message like
University of California San Diego
would get encoded to:
UNIVERSITYOFCALIFORNIASANDIEGO
and we then apply a shift of 3 to get the ciphertext:
XQLYHUVLWBRIFDOLIRUQLDVDQGLHJR
Exercise
- Using a shift of 3, encrypt the message
Meet at La Jolla Shores
. - Using a shift of 3, decrypt the message
PHHWDWVXQJRGODZQ
. - Using a shift of 10, encrypt the message
On the Ides of March
.
Exercise You are Eve. You have just intercepted the following message that Alice was trying to send to Bob:
Q TQDM IB QPWCAM
You know that Alice used a Caesar cipher, but she didn’t remove
spaces before encrypting: she left the spaces in her original message
as-is. What is the original message? Hint. Apparently just the
single letter Q
is the encryption of a word; what word(s)
could it be?
It might feel a little tedious to do these encryptions and decryptions by hand, and it is. But computers can do this very quickly! To understand how, we pause our discussion of ciphers to discuss a fundamental idea in number theory: modular arithmetic.
2.5. Interlude: Modular Arithmetic
Exercise Suppose it is 11:10am.
- Your friend Alice asks you if you want to get lunch together in 2 hours. What time is she proposing having lunch?
- Your friend Bob tells you that he just woke up 15 minutes ago. What time did he wake up?
If you can do this exercise, you already know modular arithmetic at an intuitive level! What we’re going to do here is introduce some notation to be able to talk about this “clock arithmetic” that you all already know in a rigorous way.
Quotients and Remainders
When an integer is divided by a positive integer, we get a quotient and a remainder. For example, when 17 is divided by 5, we get a quotient of 3 and a remainder of 2. The formal statement is the following:
Euclid’s Division Lemma For any integer \(a\) and positive integer \(n\), there exists a unique pair of integers \(q\) and \(r\) such that \(0 \leq r < n\) and \(a = qn + r\). The integers \(q\) and \(r\) are called the quotient and remainder, respectively. We also write \(a \bmod{n}\) to refer to the remainder.
The basic idea of the proof for the existence statement is that you keep subtracting (or adding) \(n\) from \(a\) until you wind up in the range \([0, n)\). The number of times you had to subtract (or add) \(n\) is the quotient, and the number in the range \([0, n)\) that you wind up with at the end is your remainder.
This is likely to be very familiar to you when \(a \geq 0\). Let’s consider again the example of dividing \(a = 17\) by \(n = 5\). Subtracting \(5\) once yields \(12\), twice yields \(7\), and thrice yields \(2\). Since we’ve reached a number in the range \([0, 5)\), we stop subtracting. The number \(2\) that we ended up with is the remainder, and since we had to subtract \(5\) thrice, the quotient is \(3\).
This may be less familiar to you when \(a < 0\), but it works the same way; we just have to add instead! Suppose, for example, that we want to divide \(a = -7\) by \(n = 5\). Adding \(5\) once gives us \(-2\) and twice gives us \(3\). Thus the quotient is \(-2\) (negative because we had add 5, instead of subtracting \(5\) like we did above), and the remainder is \(3\).
Exercise For each of the following, calculate the quotient and remainder when \(a\) is divided by \(n\). Do these calculations by hand.
- \(a = 13, n = 3\)
- \(a = 134, n = 10\)
- \(a = -37, n = 10\)
- \(a = -15, n = 60\)
- \(a = 13, n = 12\)
//
operator:
%
operator:
Suppose \(a\) and \(n\) are integers and \(n > 0\). All of the following sentences mean exactly the same thing:
- \(a \bmod{n} = 0\).
- There is no remainder when \(a\) is divided by \(n\).
- \(a\) is a multiple of \(n\).
- \(a\) is divisible by \(n\).
- \(n\) is a divisor of \(a\).
- \(n\) is a factor of \(a\).
- \(n\) divides \(a\) (in notation: \(n \mid a\), where the vertical bar is read as “divides”).
- \(a/n\) is an integer.
You’re probably already familiar with some of these, but it will be beneficial for you to become comfortable with all of them. It’ll feel strange at first that we have so many ways of saying exactly the same thing, but it’s surprisingly useful.
Congruences
Definition Fix a positive integer \(n\). If \(a\) and \(b\) are integers, we say that “\(a\) is congruent to \(b\) mod \(n\),” or that “\(a\) and \(b\) are congruent mod \(n\),” if \(a\) and \(b\) have the same remainder when each is divided by \(n\). This can be denoted in symbols as follows:
\[ a \equiv b \pmod{n} \]
For example, we have \(19 \equiv 7 \pmod{4}\) since \(19\) and \(7\) both have remainder \(3\) when divded by \(4\). Observe also that \(19 - 7 = 12\) is a multiple of 4. This observation is generalized by the following:
Lemma Fix a positive integer \(n\). Two integers \(a\) and \(b\) are congruent mod \(n\) if and only if \(a-b\) is a multiple of \(n\).
Proof of lemma. Divide \(a\) and \(b\) by \(n\) to write \(a = q_1n + r_1\) and \(b = q_2n + r_2\). If \(a \equiv b \pmod{n}\), this by definition means that \(r_1 = r_2\), so \[ a-b = (q_1n + r_1) - (q_2n + r_2) = q_1n - q_2n = n(q_1 - q_2), \] so \(a-b\) is a multiple of \(n\). Conversely, suppose \(a-b\) is a multiple of \(n\). Then \[ (a-b) - (q_1 - q_2)n = ((q_1n + r_1) - (q_2 n + r_2)) - (q_1 - q_2)n = r_1 - r_2 \] is a multiple of \(n\) also. But \(0 \leq r_1, r_2 < n\), so we must have \(|r_1 - r_2| < n\) and the only way that \(r_1 - r_2\) can be a multiple of \(n\) is if \(r_1 - r_2 = 0\), ie, if \(r_1 = r_2\). This means that \(a \equiv b \pmod{n}\). \(\Box\)
Here’s a crucial property of congruences that we’ll find ourselves using frequently.
Modular Arithmetic Theorem Fix a positive integer \(n\). Suppose \(a, a', b, b'\) are integers such that \(a \equiv a' \bmod{n}\) and \(b \equiv b' \bmod{n}\), and \(k\) is any positive integer. Then all of the following are also true: \[ \begin{aligned} a + b &\equiv a' + b' &&\pmod{n} \\ a - b &\equiv a' - b' &&\pmod{n} \\ ab &\equiv a'b' &&\pmod{n} \\ a^k &\equiv (a')^k &&\pmod{n} \end{aligned} \]
Proof. Let \(r\) and \(s\) be integers such that \(a - a' = nr\) and \(b - b' = ns\). Then \[(a + b) - (a' + b') = (a - a') + (b - b') = nr + ns = n(r+s)\] is a multiple of \(n\), so \(a+b \equiv a' + b' \pmod{n}\). The proof for subtraction is almost identical and is omitted. For multiplication, observe that \[ ab - a'b' = ab - ab' + ab' - a'b' = a(b-b') + (a-a')b' = ans + nrb' = n(as + rb') \] is also a multiple of \(n\), so \(ab \equiv a'b' \pmod{n}\). The final statement regarding exponentiation follows by inducting on the statement about multiplication. \(\Box\)
You may notice that division is missing in this theorem. This is because division in modular arithmetic is a little subtle; we’ll return to this soon.
The upshot of this theorem is that, if we’re just interested in remainders after dividing by \(n\), we can “take remainders mod \(n\)” at every step of our addition/subtraction/multiplication to make our calculations easier. For example, suppose we want to calculate \(123 \cdot 24 \bmod{10}\). We could do this by calculating \(123 \cdot 24 = 2952\) and then taking a remainder to get 2, but that multiplication is not so easy to do in your head. It is easier to observe that \(123 \equiv 3 \pmod{10}\), and \(24 \equiv 4 \pmod{10}\), so the theorem tells us that \[ 123 \cdot 24 \equiv 3 \cdot 4 = 12 \equiv 2 \pmod{10}. \] In other words, we have reached the same answer without having to multiply large numbers together!
Exercise
Use the Modular Arithmetic Theorem to quickly calculate the following.
- \(417 \cdot 22 \bmod{10}\).
- \(333333 + 666 \bmod{3}\).
- \(7^{202320232023} \bmod{6}\).
Exercise Fix positive integers \(k\) and \(n\). Suppose \(a\) and \(a'\) are integers such that \(a \equiv a' \pmod{n}\). It is not true in general that \(k^a \equiv k^{a'} \pmod{n}\). Show this by example: in other words, find \(k\), \(n\), \(a\), and \(a'\) such that \(a \equiv a' \pmod{n}\), but \(k^a \not\equiv k^{a'} \pmod{n}\).
Harder Exercise Suppose that the number \(273x49y5\), where \(x\) and \(y\) are unknown digits, is divisible by 495. Find \(x\) and \(y\).
Caesar Cipher Revisited
Imagine that we want to apply the Caesar cipher with a shift of 5 to
encrypt the letter Y
. Before we see how the Caesar cipher
relates to modular arithmetic, convince yourself that the encryption of
Y
is the letter D
.
Suppose we identify the letters A
through Z
with the numbers 0 through 25, in order. So, for example, A
corresponds to 0, B
corresponds to 1, and so forth. The
letter Y
that we are hoping to encrypt corresponds to 24.
Now consider the function \[ E(x) = (x+5)
\bmod{26}. \] Then \(E(24) = (24+5)
\bmod{26} = 29 \bmod{26} = 3\), which corresponds precisely to
the letter D
. In other words, if we identify letters with
numbers, the function \(E\) is
precisely the encryption function of the Caesar cipher. As you might
expect, the decryption function is \[ D(y) =
(y-5) \bmod{26}. \] For example, to decrypt the letter
D
, we look at the corresponding number 3, and \(D(3) = (3-5) \bmod{26} = -2 \bmod{26} =
24\), which corresponds to Y
.
The fact that adding 5 mod 26 and subtracting 5 mod 26 are are inverse operations might seem obvious, but it’s worth noting that it is actually a consequence of the Modular Arithmetic Theorem. Set \(y = E(x)\) and observe that \(y \equiv x+5 \pmod{26}\) and that \(D(y) \equiv y-5 \bmod{26}\). Thus \[ \begin{aligned} D(E(x)) &= D(y) \\ &\equiv (y - 5) \pmod{26} \\ &\equiv ((x+5)-5) \pmod{26} \\ &= x. \end{aligned} \] We have used the Modular Arithmetic Theorem for the second-to-last step. There is, of course, nothing special about the shift 5 in this argument; any other shift works exactly the same way.
The upshot is that the Caesar cipher “is” just modular arithmetic, and computers can do modular arithmetic very quickly.
TLLAHANYHMMPAPOHSSHATPKUPNOA
. Here is some code that
implements the Caeser cipher; hit “Run.” What is the plaintext message?
Once you’ve decrypted the message, play with the SageCell! You can
use it to encrypt as well as decrypt messages using the Caesar cipher
with any shift. What is the plaintext corresponding to
VSSWIZIPX
with a shift of 4? What is the ciphertext for the
message Muir College
with a shift of 2?
At this point, we’re also ready to start thinking about codebreaking for a Caesar cipher. We’ll develop more machinery later on that’ll help us think more deeply and systemtically about this.
Exercise You have just intercepted the following ciphertext that was sent out to a politically active student group on campus:
KLGJELZWJWYWFLKEWWLAFYSFVVWESFVVWUSJTGFARSLAGF
You know that the message was encrypted using a Caesar cipher, but you don’t know the shift. What is the plaintext? Hint. There are “only” 26 possible shifts to try…!
2.6. Interlude: GCDs
Definition The greatest common divisor (or GCD) of two integers \(a\) and \(b\) that are not both zero is denoted \(\gcd(a, b)\) and is defined to be the largest integer which is both a divisor of \(a\) and a divisor of \(b\).
For example, suppose we’re interested in \(\gcd(14, 21)\). The factors of 14 are 1, 2, 7, and 14. The factors of 21 are 1, 3, 7, and 21. The largest one that both have in common is 7, so \(\gcd(14, 21) = 7\).
What we just did, while intuitive, is actually not the best way of finding GCDs. Finding the factors of a number is, in general, very difficult; in fact, we’ll see later that the difficulty of factoring can be used to create cryptosystems that are secure even by modern standards! But calculating GCDs is not difficult, thanks to the euclidean algorithm, which we will get to shortly.
Exercise Suppose \(a\) is a nonzero integer. What is \(\gcd(a, 0)\)?
Note. Make sure that your answer works for negative values of \(a\) also!
Euclidean Algorithm
The euclidean algorithm for computing GCDs relies on the following observation.
Lemma Let \(n\) be a positive integer and \(a \equiv b \pmod{n}\). Then \(\gcd(a, n) = \gcd(b, n)\).
Proof. Define \(c = \gcd(a, n)\) and \(d = \gcd(b, n)\). Let \(k\) be an integer such that \(a - b = nk\). Since \(c\) is a factor of both \(a\) and \(n\), it is also a factor of \(a - nk = b\). Thus \(c\) is a common factor of both \(b\) and \(n\) as well, so \(c \leq d\) by definition of \(d\). On the other hand, the same logic shows that \(d\) is a common factor of both \(a\) and \(n\), so \(d \leq c\). Thus \(d = c\). \(\Box\)
Corollary Let \(n\) be a positive integer and let \(r\) be the remainder when an integer \(a\) is divided by \(n\). Then \(\gcd(a, n) = \gcd(r, n)\).
Euclidean Algorithm Suppose \(a\) and \(b\) are two positive integers, and assume without loss of generality that \(b \geq a\). To find \(\gcd(a, b)\), do the following:
Divide \(b\) by \(a\) and let \(r\) be the remainder. Then:
- If \(r = 0\), output \(a\).
- Otherwise, replace \(b\) with \(a\) and \(a\) with \(r\). Then repeat.
Let’s work out an example. Say we want to compute \(\gcd(115, 35)\). We divide the bigger number by the smaller one and we get
\[ 115 = 3 \cdot 35 + 10. \]
The remainder is nonzero, so we’ll divide again, but this time, we’ll divide the dividend (35) by the remainder (10). We get
\[ 35 = 3 \cdot 10 + 5. \]
The remainder is nonzero again, so we divide again:
\[ 10 = 2 \cdot 5 + 0. \]
This time, the remainder is 0, so we output the dividend: ie, 5. We have just calculated that \(\gcd(115, 35) = 5\). We did not have to factor 115 or 35 to do this; all it took was three divisions!
Exercise Calculate the following GCDs using the euclidean algorithm:
- \(\gcd(180, 120)\).
- \(\gcd(180, 81)\).
- \(\gcd(121, 77)\).
This algorithm dates back to Euclid’s Elements (c. 300 BCE), but it’s incredibly efficient: it’s still used in modern programming libraries to implement GCD calculations! In particular:
Bézout’s Theorem
Here’s a theorem that looks a little strange at first glance, but turns out to be surprisingly useful.
Bézout’s Theorem Suppose \(a\) and \(b\) are integers not both 0. Then \(\gcd(a, b)\) can be written as an integer linear combination of \(a\) and \(b\), ie, it can be written as \(ax + by\) for some integers \(x\) and \(y\). Integers \(x\) and \(y\) such that \(\gcd(a, b) = ax + by\) are called Bézout coefficients.
In fact, the euclidean algorithm tells us how to find Bézout coefficients! Let’s work this out in an example. Suppose we’re interested in \(\gcd(115, 35)\) again. We saw above that the following sequence of divisions showed us that \(\gcd(115, 35) = 5\).
\[ \begin{aligned} 115 &= 3 \cdot 35 + 10 \\ 35 &= 3 \cdot 10 + 5 \\ 10 &= 2 \cdot 5 + 0 \end{aligned} \]
If we rearrange the second equation, it tells us that \[ 5 = 35 - 3 \cdot 10. \] The first equation tells us that \(10 = 115 - 3 \cdot 35\), and we can plug this into the equation above to get \[ 5 = 35 - 3 \cdot (115 - 3 \cdot 35) = (-3) \cdot 115 + 10 \cdot 35. \] We’ve just written the GCD of 115 and 35 as an integer linear combination of these two numbers!
Let’s do another longer example. Suppose we’re interested in \(\gcd(240, 46)\). Here is the sequence of divisions we would perform when running the euclidean algorithm:
\[ \begin{aligned} 240 &= 5 \cdot 46 + 10 \\ 46 &= 4 \cdot 10 + 6 \\ 10 &= 1 \cdot 6 + 4 \\ 6 &= 1 \cdot 4 + 2 \\ 4 &= 2 \cdot 2 + 0 \end{aligned} \]
This tells us that \(\gcd(240, 46) = 2\). To compute Bézout coefficients, we work backwards:
\[ \begin{aligned} 2 &= 6 - 1 \cdot 4 \\ &= 6 - 1 \cdot (10 - 1 \cdot 6) \\ &= 2 \cdot 6 - 10 \\ &= 2 \cdot (46 - 4 \cdot 10) - 10 \\ &= 2 \cdot 46 - 9 \cdot 10 \\ &= 2 \cdot 46 - 9 \cdot (240 - 5 \cdot 46) \\ &= 47 \cdot 46 - 9 \cdot 240. \end{aligned} \]
We start by rewriting the division for the last nonzero remainder. We then alternate between substitution for the remainder directly above, and simplifying.
This process, of “working backwards” through the sequence of divisions we did during the euclidean algorithm, always works: the result will always be the GCD written as a linear combination of the two numbers we started with. This process is called the extended euclidean algorithm.
Exercise Calculate Bézout coefficients for the following GCDs using the extended euclidean algorithm:
- \(\gcd(180, 120)\).
- \(\gcd(180, 81)\).
- \(\gcd(121, 77)\).
It’s worth remarking that the extended euclidean algorithm gives us one pair of Bézout coefficients, but will be others; Bézout coefficients are not unique!
Exercise Observe that \(\gcd(42, 12) = 6\). Show that the pairs \((-1, 4)\) and \((1, -3)\) are both Bézout coefficients for 42 and 12.
xgcd
. The output is a
triple where the first element is the GCD, and the second and third are
the Bézout coefficients. Again, it’s very fast, even with huge numbers:
Here is a statement that might feel intuitive, but requires proof. The proof is a nice application of Bézout’s theorem.
Euclid’s Lemma Suppose \(a\) and \(b\) are integers and that \(d\) is a divisor of \(ab\) such that \(\gcd(a, d) = 1\). Then \(d\) is a divisor of \(b\).
Proof. By Bézout’s theorem, there exist \(x\) and \(y\) such that \(1 = \gcd(a, d) = ax + dy\). Multiplying this equation by \(b\), we see that \[ b = abx + bdy. \] Observe that \(d\) divides \(bdy\), and it also divides \(abx\) since it divides \(ab\). Thus \(d\) divides the sum \(abx + bdy = b\). \(\Box\)
Exercise Explain why Euclid’s lemma as stated above implies the following: if \(a\) and \(b\) are integers, \(p\) is prime, and \(p\) divides \(ab\), then either \(p\) divides \(a\) or \(p\) divides \(b\).
Modular Inversion
Before discussing modular inversion, let’s take a step back. How do we solve an equation like the following for \(z\)?
\[ 5z = 7 \]
We divide both sides by \(5\), of course! Stated differently, we multiply both sides by \(1/5\). And why do we do that? It’s because \((1/5) \cdot 5 = 1\), so by multiplying both sides by \(1/5\), we are able to cancel out the 5 that appears on the left-hand side of our equation and isolate the variable \(z\).
Modular inversion gives us a way of recreating this process with congruences. Before jumping into the definitions, let’s look at an example. Suppose we’re interested in solving the congruence
\[ 5z \equiv 7 \pmod{11}. \]
It doesn’t make sense to directly “divide both sides by \(5\),” because congruences only make sense when both sides of the congruence are integers and \(7/5\) is not an integer. But if we can find an integer \(x\) with the property that \(5x \equiv 1 \pmod{11}\), then we could multiply both sides of our congruence by \(x\) and eliminate the \(5\) on the left-hand side just like we did earlier (using the Modular Arithmetic Theorem!). And in fact there is such an integer \(x\) – consider, for example, \(x = 9\). Then \(5x = 9 \cdot 5 = 45 \equiv 1 \pmod{11}\), so if we multiply both sides of our congruence by \(9\), we get
\[ z = 1 \cdot z \equiv (5 \cdot 9) z = 9 \cdot (5 z) \equiv 9 \cdot 7 \pmod{11}. \]
Thus
\[ z \equiv 9 \cdot 7 = 63 \equiv 8 \pmod{11} \]
and we’ve solved our congruence! The solution is \(z \equiv 8 \pmod{11}\).
In this example, we magicked the number \(x = 9\) out of thin air, but there is in fact a systematic way of finding this number which we will get to shortly. Let us first make some relevant definitions and observations.
Definition Fix a positive integer \(n\). An integer \(a\) is invertible mod \(n\) (or a unit mod \(n\)) if there exists another integer \(x\) such that \(ax \equiv 1 \pmod{n}\). The number \(x\) is then called an inverse of \(a\) mod \(n\), and in symbols, one writes \(a^{-1} \equiv x \pmod{n}\).
For example, we saw above that \(9 \equiv 5^{-1} \pmod{11}\), because \(5 \cdot 9 \equiv 1 \pmod{11}\).
It is important to note that inverses don’t always exist.
Exercise Explain why 2 is not invertible mod 4.
The following theorem characterizes exactly when inverses exist, and also tells us how to find inverses!
Modular Inversion Theorem Fix a positive integer \(n\) and another integer \(a\). Then \(a\) is invertible mod \(n\) if and only if \(\gcd(a, n) = 1\). Moreover, if \(\gcd(a, n) = 1\) and \(x\) and \(y\) are Bézout coefficients for \(a\) and \(n\), then \(x\) is an inverse of \(a\) mod \(n\).
Proof. Suppose first that \(a\) is invertible mod \(n\). Then there exists an integer \(x\) such that \(ax \equiv 1 \pmod{n}\). This means that \(ax - 1\) is a multiple of \(n\), so there exists an integer \(y\) such that \(ax - 1 = ny\), or, in other words, \(ax - ny = 1\). Since \(a\) and \(n\) are both divisible by \(\gcd(a, n)\), the left-hand side of the equation \(ax - ny = 1\) is divisible by \(\gcd(a, n)\), so the right-hand side must be divisible by \(\gcd(a, n)\) also. But the only positive factor of 1 is 1 itself, so \(\gcd(a, n) = 1\).
Conversely, suppose \(\gcd(a, n) = 1\). By Bézout’s theorem, there exist \(x\) and \(y\) such that \(ax + ny = 1\). This means that \(ax - 1\) is a multiple of \(n\), so \(ax \equiv 1 \pmod{n}\). Thus \(x\) is an inverse of \(a\). \(\Box\)
In other words, this theorem tells us that the euclidean algorithm is again what we use to figure out if a number is invertible mod \(n\), and to figure out what the inverse is.
For example, suppose we’re interested in the inverse of 7 mod 23. We use the euclidean algorithm to compute \(\gcd(23, 7)\).
\[ \begin{aligned} 23 &= 3 \cdot 7 + 2 \\ 7 &= 3 \cdot 2 + 1 \\ 2 &= 2 \cdot 1 + 0 \end{aligned} \]
This tells us that \(\gcd(23, 7) = 1\), so 7 is in fact invertible mod 23. Moreover, working backwards, we see that
\[ \begin{aligned} 1 &= 7 - 3 \cdot 2 \\ &= 7 - 3 \cdot (23 - 3 \cdot 7) \\ &= 10 \cdot 7 - 3 \cdot 23 \end{aligned} \]
so the Modular Inversion Theorem tells us that 10 is the inverse of 7 mod 23.
Exercise For each of the following, determine whether \(a\) is invertible mod \(n\). If it is, find an inverse of \(a\) mod \(n\).
- \(a = 14, n = 21\)
- \(a = 3, n = 7\)
- \(a = 41, n = 50\)
Exercise Solve the following congruences for \(z\).
- \(2z \equiv 3 \pmod{11}\)
- \(3z \equiv 2 \pmod{7}\)
- \(5z \equiv 3 \pmod{15}\)
- \(5z \equiv 17 \pmod{101}\)
We remarked earlier that Bézout coefficients aren’t unique, and inverses aren’t strictly unique either. Notice, for example, that \(3 \cdot 2 \equiv 1 \bmod{5}\) and \(8 \cdot 2 \equiv 1 \bmod{5}\), so 8 and 3 are both inverses of 2 mod 5. However, notice that \(8 \equiv 3 \bmod{5}\). In other words, inverses are “kind of” unique when they exist – they are unique mod \(n\). This is what the following result says:
Lemma Fix a positive integer \(n\) and suppose \(a\) is invertible mod \(n\). If \(x\) and \(x'\) are both inverses of \(a\) mod \(n\), then \(x \equiv x' \pmod{n}\).
Proof. By definition of inverse, we know that \(ax \equiv 1 \bmod{n}\). Multiplying both sides by \(x'\) and applying the Modular Arithmetic Theorem, we see that \[ x = 1 \cdot x \equiv (x' a)x = x'(ax) \equiv x' \cdot 1 = x' \pmod{n}, \] which is what we wanted to show. \(\Box\)
inverse_mod
. You have to pass it two
arguments: the number you want the inverse of, and then the modulus. For
example, the following computes \(7^{-1}
\bmod{23}\). Again, it’s fast: try it with big numbers!
That’s plenty of modular arithmetic for now. Let’s get back to describing some more ciphers!
2.7. Affine Cipher
We’ve seen that, if we identify the letters A
through
Z
in order with the numbers 0 through 25 in order, the
encryption function for the Caesar cipher is \(E(x) = (x + b) \bmod{26}\), where \(b = 0, 1, \dotsc, 25\) is the shift. We now
generalize this. An affine cipher is one whose encryption function is of
the form
\[ E(x) = (ax + b) \bmod{26} \]
for some integers \(a\) and \(b\), which form the key. As we will see
shortly, not any pair of integers \(a\)
and \(b\) will result in an affine
encryption function above, but let us first consider an example. Suppose
\(a = 3\) and \(b = 5\), so that our encryption function is
\(E(x) = (3x + 5) \bmod{26}\). To
encrypt the letter Y
, we compute
\[ E(24) = (3 \cdot 24 + 5) \bmod{26} = (72 + 5) \bmod{26} = 77 \bmod{26} = 25. \]
Thus the encryption of Y
is the letter Z
,
which corresponds to 25.
Exercise Keeping the same encryption
function as above with \(a = 3\) and
\(b = 5\), what is the encryption of
A
? What is the encryption of D
?
How does decryption work? Let’s start by stating the general result.
Affine Cipher Lemma Suppose \(E : \{0, \cdots, 25\} \to \{0, \cdots,
25\}\) is a function of the form \[
E(x) = ax + b \bmod{26} \] for some integers \(a\) and \(b\). Then there exists a function \(D : \{0, \cdots, 25\} \to \{0, \cdots,
25\}\) such that \(D(E(x)) = x\)
if and only if \(a\) is invertible mod
26. Moreover, if \(c \equiv a^{-1}
\pmod{26}\), then
\[ D(y) = c(y-b) \bmod{26}. \]
Let’s go back to our example with \(a =
3\) and \(b = 5\). Using
inverse_mod(3, 26)
in Sage, I find that \(9 \equiv 3^{-1} \pmod{26}\), so the Affine
Cipher Lemma tells us that the decryption function must be given by
\[ D(y) = 9(y - 5) \bmod{26}. \]
For example, suppose I want to decrypt the letter Z
,
which corresponds to \(y = 25\). We
have \(D(25) = 9(25 - 5) \bmod{26} = 9 \cdot
20 \bmod{26} = 180 \bmod{26} = 24\), which corresponds to
Y
, exactly as we would have expected!
Exercise Alice and Bob are using the same
affine encryption function as above, with \(a
= 3\) and \(b = 5\). Bob has
just received the message LNKRLFKH
. Decrypt the
message.
Exercise Suppose the encryption function for an affine cipher is \(E(x) = 5x + 17 \bmod{26}\). What is the corresponding decryption function \(D\)?
Proof Exercise Prove the Affine Cipher Lemma.
Notice that, of the numbers between 0 and 25, there are exactly 12 that are invertible mod 26 (the odds except 13, ie, 1, 3, 5, 7, 9, 11, 15, 17, 19, 21). This means that the number of pairs \((a, b)\) such that \(E(x) = ax + b \bmod{26}\) is a legitimate encryption function for an affine cipher is \(12 \cdot 26 = 312\). This is quite a bit larger than 26, and is probably already large enough that it would be a little tedious to try and check all possible keys to see if any result in something meaningful. That would probably only be worth doing if you knew the message had some very valuable information.
In fact, we can increase the number of keys with a simple substitution cipher even more dramatically, as we’ll see next. Meanwhile, here’s a SageCell to play with!
VYOHVUTWVHPZUWTZUUBEYDUB
, encrypted using the affine cipher
with \(a = 5\) and \(b = 7\). Here is some code that implements
the affine cipher. What is the message?
Exercise The Atbash cipher is a
simple substitution cipher in which encryption and decryption both
simply reverse the order of the alphabet. In other words, A
and Z
are interchanged, B
and Y
are interchanged, and so forth. For example, the plaintext
APPLE
corresponds to the ciphertext ZKKOV
.
Show that the Atbash cipher is a special case of the affine cipher. What
are the corresponding values of \(a\)
and \(b\)?
Exercise
- Make sense of and justify the following statement: “Two affine ciphers in succession result in just another affine cipher.”
- Is it possible for “two affine ciphers in succession” to result in a Caesar cipher? Explain.
2.8. Simple Substitution
We can describe a general simple substitution cipher (also called simple monoalphabetic substitition cipher or a monoalphabetic substitution cipher) by using a full conversion table as a key. For example, we might use a table like the following:
A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
P | V | J | W | D | C | H | T | S | K | Z | F | N | Q | E | Y | O | R | I | G | A | U | M | L | X | B |
This tells us to convert every instance of A
into
P
, every instance of B
into V
,
and so forth, and to go backwards during decryption. For example,
suppose Alice wants to encrypt the message:
You must destroy all of the horcruxes!
She again starts by encoding the message by removing spaces and punctuation and capitalizing everything:
YOUMUSTDESTROYALLOFTHEHORCRUXES
She then converts each letter using the table:
XEANAIGWDIGREXPFFECGTDTERJRALDI
This is the ciphertext she sends to Bob. To decrypt the message, Bob uses the same table backwards.
Notice that, if the entire table is our key, the number of possible keys is
\[ 26! = 403291461126605635584000000, \]
ie, about 403 septillion. There are now far too many to try all of them!
In spite of the massive increase in the number of keys, we’ll see later that simple substitution can still be easily broken using some ideas from probability theory. Meanwhile, let’s get some practice with this:
Exercise Using the same table given above, do the following by hand:
- Encrypt the message
The moon is pitted with holes!
- Decrypt the message
TEMPRDXEAWESQHGEWPX
.
2.9. Polybius Square
The Polybius square is yet another simple substitution cipher, invented in Ancient Greece and made famous by the scholar Polybius (200–118 BCE). The same strategy was used in Japan by Uesugi samurai clan. During World War I, the German Imperial Army combined a Polybius square with rectangular transposition, resulting in the so-called ADFGVX Cipher.
The Polybius square is still a simple substitution cipher in that it replaces single letters of plaintext with ciphertext and the substitution remains fixed over the entire message, but there is a a minor twist: each letter of plaintext is replaced by two letters of ciphertext. Nonetheless, we’ll see later that a Polybius square can be broken in exactly the same way as the simple substitution ciphers described in the previous section.
The key for a Polybius square is a table with labeled rows and
columns, and the alphabet for the messages we’re encrypting lives inside
the table. For example, suppose the alphabet we’re encrypting includes
the capital letters A
through Z
and the digits
0
through 9
. That’s 36 letters, which will fit
perfectly inside a \(6 \times 6\) grid.
Following the Imperial German Army during World War I, let’s label the
rows and columns with the letters ADFGVX
to get a table as
follows.
A | D | F | G | V | X | |
---|---|---|---|---|---|---|
A | N | A | 1 | C | 3 | H |
D | 8 | T | B | 2 | O | M |
F | E | 5 | W | R | P | D |
G | 4 | F | 6 | G | 7 | I |
V | 9 | J | 0 | K | L | Q |
X | S | U | V | X | Y | Z |
This table is the key. To encrypt a message, we convert each letter
in our plaintext to a pair of letters indicating the row and column of
that letter in the table above. For example, the letter K
would be replaced by VG
because K
appears in
row V
and column G
. Similarly, the letter
S
would be replaced by XA
. Suppose that Alice
wants to encrypt the message
Storm the gates at 14:37.
She first encodes by dropping punctuation and capitalizing:
STORMTHEGATESAT1437
She then goes through and replaces each letter by the corresponding pair as we described above:
XADDDVFGDXDDAXFAGGADDDFAXAADDDAFGAAVGV
This is the ciphertext. Bob, who knows the table, can then undo this process to decrypt the message.
Exercise Using the square given above:
- Encrypt the message
High tide at 7:01am
. - Decrypt the message
XAAAADVGFAFVADDDAXADDDDGFVDX
.
2.10. Interlude: Modular Linear Algebra
The Polybius square is the last of our simple substitution ciphers for now. Before turning our attention to our first polygraphic cipher, let’s look at how linear algebra interacts with modular arithmetic. Since linear algebra is not a prerequisite for this class, we’ll focus on \(2 \times 2\) matrices for simplicity.1
\(2 \times 2\) matrices
Definition
A \(2 \times 2\) integer matrix (or just matrix for short) is a \(2 \times 2\) box of numbers \(A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\) where \(a, b, c, d\) are all integers.
The determinant of \(A\) is the integer \(\det(A) = ad - bc\).
The identity matrix is the matrix \(I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\).
Suppose \(A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\) and \(B = \begin{bmatrix} a' & b' \\ c' & d' \end{bmatrix}\) are two matrices. Their product \(AB\) is defined to be: \[ AB = \begin{bmatrix} aa' + bc' & ab' + bd' \\ ca' + dc' & cb' + dd' \end{bmatrix} \]
For example, \(A = \begin{bmatrix} 3 & 2 \\ 1 & 7 \end{bmatrix}\) is a matrix and \(\det(A) = 3 \cdot 7 - 2 = 19\). Another matrix is \(B = \begin{bmatrix} 1 & 4 \\ 2 & 3 \end{bmatrix}\), and you should check by comparing against the definition above that \[ AB = \begin{bmatrix} 7 & 18 \\ 15 & 25 \end{bmatrix} \] You should also check that \[ BA = \begin{bmatrix} 7 & 30 \\ 9 & 25 \end{bmatrix}. \] Note that \(AB \neq BA\). In other words, matrix multiplication is not commutative!
Exercise Let \(A\) be a matrix (ie, a \(2 \times 2\) integer matrix). Show that \(AI = IA = A\).
Theorem (Multiplicativity of the Determinant) If \(A\) and \(B\) are matrices, then \(\det(I) = 1\) and \[ \det(AB) = \det(A)\det(B). \]
Definition
A vector is \(v\) a vertical column \(v = \begin{bmatrix} x \\ y \end{bmatrix}\) where \(x\) and \(y\) are both integers.
If \(A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\) is a matrix, the product \(Av\) is defined to be: \[ Av = \begin{bmatrix} ax + by \\ cx + dy \end{bmatrix} \]
matrix
function, and compute determinants using the det
function:
*
operator:
vector
function, and again
multiply using the *
operator:
Congruences and Inversion for Matrices
Definition Fix a positive integer \(n\) and suppose \(A\) and \(B\) are both matrices: \[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} a' & b' \\ c' & d' \end{bmatrix} \] We say that \(A \equiv B \pmod{n}\) if all four of the entries of the two matrices are congruent mod \(n\), ie, if all of the following are true: \[ \begin{aligned} a &\equiv a' &&\pmod{n} \\ b &\equiv b' &&\pmod{n} \\ c &\equiv c' &&\pmod{n} \\ d &\equiv d' &&\pmod{n} \end{aligned} \]
Definition A matrix \(A\) is invertible mod \(n\) if there exists a matrix \(X\) such that \(AX \equiv I \bmod{n}\). In this case, \(X\) is called an inverse of \(A\) mod \(n\). In symbols, we write \(X \equiv A^{-1} \pmod{n}\).
Matrix Inversion Theorem Suppose \[A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\] is a matrix. Then \(A\) is invertible if and only if \(\det(A)\) is invertible mod \(n\). Moreover, if \(e \equiv \det(A)^{-1} \pmod{n}\), then \[ X = \begin{bmatrix} ed & -eb \\ -ec & ea \end{bmatrix} \] is an inverse of \(A\) mod \(n\).
Suppose, for example, that we’re interested in the matrix
\[ A = \begin{bmatrix} 3 & 2 \\ 1 & 7 \end{bmatrix}. \]
We know that \(\det(A) = 19\) is invertible mod 26, so \(A\) is also invertible mod 26. We have \(19^{-1} \equiv 11 \pmod{26}\), so the formula for the inverse from the Matrix Inversion Theorem tells us that
\[ A^{-1} \equiv \begin{bmatrix} 11 \cdot 7 & -11 \cdot 2 \\ -11 \cdot 1 & 11 \cdot 3 \end{bmatrix} = \begin{bmatrix} 77 & -22 \\ -11 & 33 \end{bmatrix} \equiv \begin{bmatrix} 25 & 4 \\ 15 & 7 \end{bmatrix} \pmod{26}. \]
In other words,
\[ X = \begin{bmatrix} 25 & 4 \\ 15 & 7 \end{bmatrix} \]
is an inverse of \(A\) mod 26. Let’s check this:
\[ AX = \begin{bmatrix} 3 & 2 \\ 1 & 7 \end{bmatrix} \begin{bmatrix} 25 & 4 \\ 15 & 7 \end{bmatrix} = \begin{bmatrix} 105 & 26 \\ 130 & 53 \end{bmatrix} \equiv \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I \pmod{26}. \]
2.11. Hill Cipher
We now turn our attention to our first polygraphic cipher: the Hill cipher. We’ll focus on the digraphic case, which replaces 2 letters of plaintext at a time.2
To encrypt messages using a Hill cipher, we start by our usual
encoding. We also make the usual identification of the letters
A
through Z
with the numbers 0 through 25.
Suppose for example that we want to encrypt the following:
Y O U H A V E S A V E D U S A L L
24 14 20 7 0 21 4 18 0 21 4 3 20 18 0 11 11
The key for a Hill cipher is a matrix that is invertible mod 26. For example, we saw above that the matrix \[ A = \begin{bmatrix} 3 & 2 \\ 1 & 7 \end{bmatrix} \] has determinant 19, which is invertible mod 26, so the matrix inversion theorem tells us that \(A\) is an invertible matrix mod 26; we can use this as our key.
We are now going to go through our list of numbers, and replace each pair of numbers with the result of multiplying that pair by the matrix \(A\) mod 26. For example, for the first pair, 24 and 14, we make a vector by stacking these two numbers on top of each other:
\[ v = \begin{bmatrix} 24 \\ 14 \end{bmatrix} \]
We then compute
\[ Av \bmod{26} = \begin{bmatrix} 3 & 2 \\ 1 & 7 \end{bmatrix} \begin{bmatrix} 24 \\ 14 \end{bmatrix} \bmod{26} = \begin{bmatrix} 100 \\ 122 \end{bmatrix} \bmod{26} = \begin{bmatrix} 22 \\ 18 \end{bmatrix} \]
In other words, the first two numbers will be replaced by 22 and 18.
In other words, the first two letters of the message will be replaced by
W
and S
, respectively.
We then do the same thing with the next pair of numbers, 20 and 7. We
get 22 and 17, so the third and fourth letters are replaced by
W
and R
. We keep going like this until we
reach the very end. In this particular case, we have an odd number of
letters, so we should tack on a random letter at the end. For example,
maybe we tack on the letter Z
, which corresponds to 25. The
last pair of numbers is then 11 and 25. Multiplying by \(A\) mod 26 gives us 5 and 4, so we replace
the last two letters with F
and E
.
The net result is the ciphertext WSWRQRWAQRSZSQWZFE
.
As you might expect, to decrypt, we multiply pairs of numbers by the inverse of \(A\) mod 26. For example, suppose we start with the ciphertext we computed above. We look at the associated list of numbers:
W S W R Q R W A Q R S Z S Q W Z F E
22 18 22 17 16 17 22 0 16 17 18 25 18 16 22 25 5 4
We saw above that
\[ X = \begin{bmatrix} 25 & 4 \\ 15 & 7 \end{bmatrix} \]
is an inverse of \(A\), so now we’re going to repeat the same process we went through during encryption, but now using the matrix \(X\) in place of the matrix \(A\). We start with the vector
\[ v = \begin{bmatrix} 22 \\ 18 \end{bmatrix} \]
consisting of the first two numbers. We then compute
\[ Xv \bmod{26} = \begin{bmatrix} 25 & 4 \\ 15 & 7 \end{bmatrix}\begin{bmatrix} 22 \\ 18 \end{bmatrix} \bmod{26} = \begin{bmatrix} 622 \\ 456 \end{bmatrix} \bmod{26} = \begin{bmatrix} 24 \\ 14 \end{bmatrix}.\]
Since 24 and 14 correspond to Y
and O
respectively, the first two letters of the decrypted message will be
YO
(as we’d expect). We keep going like this through the
entire ciphertext to recover the plaintext. Of course, we’ll find the
message YOUHAVESAVEDUSALLZ
at the end, and Bob has to
remember that the Z
was probably a random letter tacked
onto the end.
Exercise Use the matrix \[ A = \begin{bmatrix} 3 & -1 \\ 2 & 5 \end{bmatrix} \] as the key for a Hill cipher.
- Encrypt the message
Go to Lake Lerna
. - Decrypt the message
RNCQYVFRRLZI
.
2.12. Playfair Cipher
The Playfair cipher is another digraphic cipher, like the Hill cipher we just described above. It was invented by Charles Wheatstone in 1854, but is named after Lord Playfair, who promoted its use. The encryption and decryption is a little complicated to describe, but there is no sophisticated mathematics involved.
The key for a Playfair cipher is a \(5
\times 5\) grid of letters, each letter appearing exactly once.
Of course, such a grid only fits 25 letters, but there are 26 letters in
the English alphabet! The typical solution is to treat I
and J
as the same letter, which is what we will do. (You
could instead imagine a variant where we use a \(6 \times 6\) grid that includes all 26
letters and all 10 digits, like the one we used for a Polybius square
but without the row and column labels.)
A convenient and “easy-to-remember” way of constructing such a grid
is to start with a secret key word. Suppose, for example, that our key
word is ALPHABET
. We then start filling out our grid by
writing out the letters of our key word across the rows, skipping over
any letter that we’ve already written down.
A | L | P | H | B |
E | T | |||
We then fill out the square by listing all of the remaining letters
of the alphabet, skipping over anything we’ve already written down and
remembering that I
and J
are the same.
A | L | P | H | B |
E | T | C | D | F |
G | I | K | M | N |
O | Q | R | S | U |
V | W | X | Y | Z |
We next encode our message by doing the following:
- As usual, we remove all non-alphabet characters and capitalize everything.
- Replace all instances of
J
withI
. - Group the letters into pairs.
- If there are any pairs where both letters are the same, we insert
the letter
X
in between the two letters of that pair and then regroup into pairs. - If there is an unpaired letter at the end, insert the letter
X
after it.
Suppose, for example, that we want to encrypt the message
hidden jewels in trees
. Here is what the encoding looks
like after each step:
HIDDENJEWELSINTHETREES
HIDDENIEWELSINTHETREES
HI DD EN IE WE LS IN TH ET RE ES
HI DX DE NI EW EL SI NT HE TR EE S
HI DX DE NI EW EL SI NT HE TR EX ES
HI DX DE NI EW EL SI NT HE TR EX ES
Note that we might have to apply rule 4 multiple times. We had to
apply rule 4 twice above (first to eliminate the DD
, and
then to replace the EE
).
To encrypt, we replace each pair with another pair using the grid by following these rules:
- (Row Rule) If both letters in the pair occur in the same row, replace each letter of the pair with the letter that appears immediately to its right (wrapping around to the left side of the row if needed).
- (Column Rule) If both letters in the pair occur in the same column, replace each letter of the pair with the letter that appears immediately below it (wrapping around to the top of the column if needed).
- (Rectangle Rule) Otherwise, the two letters define a rectangle inside the grid, and we replace each letter with the letter on the same row but the opposite side of that rectangle.
Returning to our example: we start with the pair HI
.
These are marked in bold below; they do not appear in the same row or
column, so the Rectangle Rule applies. The rectangle defined by these
two letters is indicated in black.
A | L | P | H | B |
E | T | C | D | F |
G | I | K | M | N |
O | Q | R | S | U |
V | W | X | Y | Z |
The letter in the same row as H
but the opposite side is
L
, and the letter in the same row as I
but the
opposite side is M
. Thus HI
gets replaced by
LM
.
Next we encrypt DX
. These also define a rectangle, as
depicted below, and get replaced by CY
.
A | L | P | H | B |
E | T | C | D | F |
G | I | K | M | N |
O | Q | R | S | U |
V | W | X | Y | Z |
Next we have DE
. These two letters are on the same row,
indicated in black below. We therefore apply the Row Rule and replace
each one with the letter immediately to its right, obtaining
FT
.
A | L | P | H | B |
E | T | C | D | F |
G | I | K | M | N |
O | Q | R | S | U |
V | W | X | Y | Z |
And then we encrypt NI
. These also lie on the same row,
indicated in black below, so we apply the row rule and replace each
letter with the letter immediately to its right. For the letter
N
, we have to wrap around to the beginning to
G
. Thus NI
gets replaced by
GK
.
A | L | P | H | B |
E | T | C | D | F |
G | I | K | M | N |
O | Q | R | S | U |
V | W | X | Y | Z |
A | L | P | H | B |
E | T | C | D | F |
G | I | K | M | N |
O | Q | R | S | U |
V | W | X | Y | Z |
finish encrypting the encoded text:
HI DX DE NI EW EL SI NT HE TR EX ES
LM CY FT GK __ __ __ __ __ __ __ __
Exercise Using the same grid, encode and encrypt the following messages:
Meet at aviary
Launch balloon at noon
To decrypt messages, we have to undo this process. We follow almost
the same three rules, except the Column Rule for decryption involves
replacing each letter that appears immediately above it, and
the Row Rule involves replacing each letter with the letter that appears
immediately to the left of it. The recipient also has to decode
the message by inferring which X
s should be removed and how
to correctly punctuate the message.
Exercise Using the same grid, decrypt and
decode the message OV CQ QI UG RY
.
Exercise There’s a problem with the 5-step encoding scheme described above: it can sometimes fail to produce encoded plaintext without any repeat pairs.
- Can you find the problem? Hint: Consider encoding
BOX
. - Describe how you might fix this problem.
- (Sage Exercise) Can you edit the SageCell above to implement your suggested fix?
2.13. Vignère Cipher
The Vignère cipher is our first example of polyalphabetic substitution: ie, a substitution cipher in which the substitution scheme changes over the course of the message. It is closely related to the Caesar cipher and was first described by Giovan Battista Bellaso in 1553; it carries the name of Blaise de Vignère due to a misattribution in the 1800s. The cipher resisted all attempts to break it for three centuries, until work of Friedrich Kasiski in 1863.
The Vignère cipher makes use of modular arithmetic and of the
correspondence between the letters A
through Z
and the numbers 0 through 25 that we’ve seen above. The key for a
Vignère cipher is a finite sequence of shifts. A convenient and
“easy-to-remember” way of constructing such a sequence is to have a
secret key word, and then associate each letter of that word with the
corresponding number to get the sequence of shifts. For example, if our
secret word is ASGARD
, the corresponding sequence of
numbers is \((0, 18, 6, 0, 17, 3)\),
because A
corresponds to 0, S
corresponds to
18, etc.
To encrypt a message, we start by encoding it in the usual way:
remove all non-alphabet characters and capitalize everything. Suppose,
for example, that we want to encrypt the message
Keep Loki Away
. We encode to KEEPLOKIAWAY
and
then we associate to each letter the corresponding number 0 through
25.
K E E P L O K I A W A Y
10 4 4 15 11 14 10 8 0 22 0 24
We will then perform addition mod 26 to each of these numbers. We’ll
use the first element of our key sequence for the first number, the
second for the second, and so forth; when we finish the key, we repeat
it from the beginning. We then convert those sums back to numbers using
the usual correspondence. Using the key \((0,
18, 6, 0, 17, 3)\) corresponding to the key word
ASGARD
from above, we have:
Encoded : K E E P L O K I A W A Y
Numbers (1) : 10 4 4 15 11 14 10 8 0 22 0 24
Key Word : A S G A R D A S G A R D
Key Number (2) : 0 18 6 0 17 3 0 18 6 0 17 3
(1) + (2) mod 26 : 10 22 10 15 2 17 10 0 6 22 17 1
Encrypted : K W K P C R K A G W R B
Thus KWKPCRKAGWRB
is our ciphertext. Notice that the
first E
of the plaintext was encrypted to W
,
while the second E
was encrypted to K
.
Similarly, the first A
was encrypted to G
,
while the second A
was encrypted to R
.
(Coincidentally, the first K
and the second K
are both encrypted to the letter K
.) This is what is meant
when we say that the Vignère cipher is polyalphabetic, or that the
substitution scheme changes over the course of the message.
Decryption, of course, works almost exactly the same way — except that we subtract mod 26 instead of adding!
Exercise Continue to use the key word
ASGARD
.
- Encrypt the message
Protect Odin from Fenrir
. - Decrypt the message
RSMNRUOCOSTRMATG
.
2.14. One-Time Pad
The one-time pad is a special case of the Vignère cipher where the key sequence is:
- never re-used,
- at least as long as the plaintext,
- “unrelated to the plaintext,” and
- “totally random,” in the sense that each number 0 through 25 is equally likely in each position of the key.
If you understand how the Vignère cipher works, you also understand the one-time pad works. The only noteworthy difference is that the key sequence must not be generated using a key words, because words won’t have the property that each letter is equally likely!
We’ll return to the one-time pad again. We’ll want to make precise what we mean by “unrelated to the plaintext” and “totally random” above, and we’ll see that it has a property called perfect secrecy, which means that the security of the one-time pad can be mathematically guaranteed!
Exercise While “perfect secrecy” sounds great, the one-time pad is generally not very practical. Why not? Hint. If Alice and Bob have a way of sharing a secret key that’s at least as long as the plaintext that they want to share…