Read The Code Book Online

Authors: Simon Singh



Table 4
A Vigenère square with the rows defined by the keyword WHITE highlighted. Encryption is achieved by switching between the five highlighted cipher alphabets, defined by W, H, I, T and E.

The great advantage of the Vigenère cipher is that it is impregnable to the frequency analysis described in Chapter 1. For example, a cryptanalyst applying frequency analysis to a piece of ciphertext would usually begin by identifying the most common letter in the ciphertext, which in this case is Z, and then assume that this represents the most common letter in English, e. In fact, the letter Z represents three different letters, d, r and s, but not e. This is clearly a problem for the cryptanalyst. The fact that a letter which appears several times in the ciphertext can represent a different plaintext letter on each occasion generates tremendous ambiguity for the cryptanalyst. Equally confusing is the fact that a letter which appears several times in the plaintext can be represented by different letters in the ciphertext. For example, the letter o is repeated in troops, but it is substituted by two different letters: the oo is enciphered as HS.
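The behaviour described above can be checked with a short sketch. It is a minimal illustration, not a historical implementation: the function name `vigenere_encrypt` is made up here, and the plaintext "divert troops to east ridge" is an assumption, chosen because under the keyword WHITE it reproduces the details quoted in the paragraph (Z standing for d, r and s, and the oo of troops becoming HS).

```python
from collections import defaultdict

def vigenere_encrypt(plaintext, keyword):
    """Shift each plaintext letter by the keyword letter at the same position."""
    letters = [c for c in plaintext.lower() if "a" <= c <= "z"]
    out = []
    for i, c in enumerate(letters):
        # The keyword repeats; W means shift by 22, H by 7, and so on.
        shift = ord(keyword[i % len(keyword)].upper()) - ord("A")
        out.append(chr((ord(c) - ord("a") + shift) % 26 + ord("A")))
    return "".join(out)

plaintext = "divert troops to east ridge"  # assumed example plaintext
ciphertext = vigenere_encrypt(plaintext, "WHITE")
print(ciphertext)  # ZPDXVPAZHSLZBHIWZBKMZNM

# Collect which plaintext letters each ciphertext letter stands for.
stands_for = defaultdict(set)
for p, c in zip(plaintext.replace(" ", ""), ciphertext):
    stands_for[c].add(p)
print(sorted(stands_for["Z"]))  # ['d', 'r', 's']
```

Because the shift changes with every letter of the keyword, counting how often Z occurs says nothing about any single plaintext letter.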

As well as being invulnerable to frequency analysis, the Vigenère cipher has an enormous number of keys. The sender and receiver can agree on any word in the dictionary, any combination of words, or even fabricate words. A cryptanalyst would be unable to crack the message by searching all possible keys because the number of options is simply too great.
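To give a rough sense of scale, here is a small sketch that counts keys. It assumes a 26-letter alphabet and treats every string of letters as a valid keyword, which the text allows since keywords may be fabricated. The function name `keyword_count` is illustrative.

```python
def keyword_count(length, alphabet_size=26):
    """Number of distinct keywords of exactly `length` letters."""
    return alphabet_size ** length

print(keyword_count(5))  # 11881376 five-letter keywords, WHITE among them

# Keywords of any length from 1 to 10 letters:
print(sum(keyword_count(n) for n in range(1, 11)))
```

Each extra letter multiplies the count by 26, so the total grows quickly with keyword length.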

Vigenère’s work culminated in his Traicté des Chiffres (“A Treatise on Secret Writing”), published in 1586. Ironically, this was the same year that Thomas Phelippes was breaking the cipher of Mary Queen of Scots. If only Mary’s secretary had read this treatise, he would have known about the Vigenère cipher, Mary’s messages to Babington would have baffled Phelippes, and her life might have been spared.

Because of its strength and its guarantee of security, it would seem natural that the Vigenère cipher would be rapidly adopted by cipher secretaries around Europe. Surely they would be relieved to have access, once again, to a secure form of encryption? On the contrary, cipher secretaries seem to have spurned the Vigenère cipher. This apparently flawless system would remain largely neglected for the next two centuries.

From Shunning Vigenère to the Man in the Iron Mask

The traditional forms of substitution cipher, those that existed before the Vigenère cipher, were called monoalphabetic substitution ciphers because they used only one cipher alphabet per message. In contrast, the Vigenère cipher belongs to a class known as polyalphabetic, because it employs several cipher alphabets per message. The polyalphabetic nature of the Vigenère cipher is what gives it its strength, but it also makes it much more complicated to use. The additional effort required in order to implement the Vigenère cipher discouraged many people from employing it.

For many seventeenth-century purposes, the monoalphabetic substitution cipher was perfectly adequate. If you wanted to ensure that your servant was unable to read your private correspondence, or if you wanted to protect your diary from the prying eyes of your spouse, then the old-fashioned type of cipher was ideal. Monoalphabetic substitution was quick, easy to use, and secure against people unschooled in cryptanalysis. In fact, the simple monoalphabetic substitution cipher endured in various forms for many centuries (see Appendix D). For more serious applications, such as military and government communications, where security was paramount, the straightforward monoalphabetic cipher was clearly inadequate. Professional cryptographers in combat with professional cryptanalysts needed something better, yet they were still reluctant to adopt the polyalphabetic cipher because of its complexity. Military communications, in particular, required speed and simplicity, and a diplomatic office might be sending and receiving hundreds of messages each day, so time was of the essence. Consequently, cryptographers searched for an intermediate cipher, one that was harder to crack than a straightforward monoalphabetic cipher, but one that was simpler to implement than a polyalphabetic cipher.

The various candidates included the remarkably effective homophonic substitution cipher. Here, each letter is replaced with a variety of substitutes, the number of potential substitutes being proportional to the frequency of the letter. For example, the letter a accounts for roughly 8 per cent of all letters in written English, and so we would assign eight symbols to represent it. Each time a appears in the plaintext it would be replaced in the ciphertext by one of the eight symbols chosen at random, so that by the end of the encipherment each symbol would constitute roughly 1 per cent of the enciphered text. By comparison, the letter b accounts for only 2 per cent of all letters, and so we would assign only two symbols to represent it. Each time b appears in the plaintext either of the two symbols could be chosen, and by the end of the encipherment each symbol would also constitute roughly 1 per cent of the enciphered text. This process of allotting varying numbers of symbols to act as substitutes for each letter continues throughout the alphabet, until we get to z, which is so rare that it has only one symbol to act as a substitute. In the example given in Table 5, the substitutes in the cipher alphabet happen to be two-digit numbers, and there are between one and twelve substitutes for each letter in the plain alphabet, depending on each letter’s relative abundance.
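The allotment of substitutes can be sketched as follows. This is a minimal illustration under stated assumptions: the per-letter counts are rounded English frequencies chosen to total 100 (they match the figures given above for a, b and z), and the assignment of two-digit numbers is a seeded shuffle rather than the actual contents of Table 5. The names `COUNTS`, `build_table`, `encrypt` and `decrypt` are hypothetical.

```python
import random

# Assumed number of substitutes per letter (rounded per cent frequencies).
COUNTS = {
    "a": 8, "b": 2, "c": 3, "d": 4, "e": 12, "f": 2, "g": 2, "h": 6, "i": 6,
    "j": 1, "k": 1, "l": 4, "m": 2, "n": 6, "o": 7, "p": 2, "q": 1, "r": 6,
    "s": 6, "t": 9, "u": 3, "v": 1, "w": 2, "x": 1, "y": 2, "z": 1,
}

def build_table(seed=0):
    """Deal the numbers 00-99 out to the letters, COUNTS[letter] to each."""
    rng = random.Random(seed)
    symbols = [f"{n:02d}" for n in range(100)]
    rng.shuffle(symbols)
    table, start = {}, 0
    for letter, count in COUNTS.items():
        table[letter] = symbols[start:start + count]
        start += count
    return table

def encrypt(plaintext, table, rng=None):
    """Replace each letter with one of its substitutes, chosen at random."""
    rng = rng or random.Random()
    return [rng.choice(table[c]) for c in plaintext.lower() if c in table]

def decrypt(symbols, table):
    """Each number belongs to exactly one letter, so lookup is unambiguous."""
    reverse = {s: letter for letter, group in table.items() for s in group}
    return "".join(reverse[s] for s in symbols)

table = build_table()
message = encrypt("attack at dawn", table)
print(" ".join(message))
print(decrypt(message, table))  # attackatdawn
```

Encrypting the same message twice will usually give different numbers, yet both decrypt to the same letters, because the table itself never changes.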

We can think of all the two-digit numbers that correspond to the plaintext letter a as effectively representing the same sound in the ciphertext, namely the sound of the letter a. Hence the origin of the term homophonic substitution, homos meaning “same” and phonos meaning “sound” in Greek. The point of offering several substitution options for popular letters is to balance out the frequencies of symbols in the ciphertext. If we enciphered a message using the cipher alphabet in Table 5, then every number would constitute roughly 1 per cent of the entire text. If no symbol appears more frequently than any other, then this would appear to defy any potential attack via frequency analysis. Perfect security? Not quite.

Table 5
An example of a homophonic substitution cipher. The top row represents the plain alphabet, while the numbers below represent the cipher alphabet, with several options for frequently occurring letters.

The ciphertext still contains many subtle clues for the clever cryptanalyst. As we saw in Chapter 1, each letter in the English language has its own personality, defined according to its relationship with all the other letters, and these traits can still be discerned even if the encryption is by homophonic substitution. In English, the most extreme example of a letter with a distinct personality is the letter q, which is only followed by one letter, namely u. If we were attempting to decipher a ciphertext, we might begin by noting that q is a rare letter, and is therefore likely to be represented by just one symbol, and we know that u, which accounts for roughly 3 per cent of all letters, is probably represented by three symbols. So, if we find a symbol in the ciphertext that is only ever followed by three particular symbols, then it would be sensible to assume that the first symbol represents q and the other three symbols represent u. Other letters are harder to spot, but are also betrayed by their relationships to one another. Although the homophonic cipher is breakable, it is much more secure than a straightforward monoalphabetic cipher.
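The q-and-u reasoning amounts to counting which symbols follow which. The sketch below is illustrative only: the function name `q_candidates`, its thresholds, and the tiny hand-made symbol sequence are all assumptions, and a real attack would need a much longer ciphertext before the pattern meant anything.

```python
from collections import defaultdict

def q_candidates(symbols, max_followers=3, min_count=4):
    """Find symbols that occur often but are followed by very few others."""
    followers = defaultdict(set)   # symbol -> set of symbols seen right after it
    counts = defaultdict(int)      # symbol -> how many times it had a follower
    for current, nxt in zip(symbols, symbols[1:]):
        followers[current].add(nxt)
        counts[current] += 1
    return {
        s: sorted(f)
        for s, f in followers.items()
        if counts[s] >= min_count and len(f) <= max_followers
    }

# Hand-made sequence in which 17 is only ever followed by 03, 44 or 91.
sample = ("17 03 55 20 17 44 61 20 33 17 91 08 55 12 "
          "17 03 70 20 61 55 44").split()
print(q_candidates(sample))  # {'17': ['03', '44', '91']}
```

Here 17 would be the candidate for q, and 03, 44 and 91 the candidates for u. The `min_count` threshold matters: a symbol seen only two or three times will trivially have few followers.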

A homophonic cipher might seem similar to a polyalphabetic cipher inasmuch as each plaintext letter can be enciphered in many ways, but there is a crucial difference, and the homophonic cipher is in fact a type of monoalphabetic cipher. In the table of homophones shown above, the letter a can be represented by eight numbers. Significantly, these eight numbers represent only the letter a. In other words, a plaintext letter can be represented by several symbols, but each symbol can only represent one letter. In a polyalphabetic cipher, a plaintext letter will also be represented by different symbols, but, even more confusingly, these symbols will represent different letters during the course of an encipherment.

Perhaps the fundamental reason why the homophonic cipher is considered monoalphabetic is that once the cipher alphabet has been established, it remains constant throughout the process of encryption. The fact that the cipher alphabet contains several options for encrypting each letter is irrelevant. However, a cryptographer who is using a polyalphabetic cipher must continually switch between distinctly different cipher alphabets during the process of encryption.

By tweaking the basic monoalphabetic cipher in various ways, such as adding homophones, it became possible to encrypt messages securely, without having to resort to the complexities of the polyalphabetic cipher. One of the strongest examples of an enhanced monoalphabetic cipher was the Great Cipher of Louis XIV. The Great Cipher was used to encrypt the king’s most secret messages, protecting details of his plans, plots and political schemings. One of these messages mentioned one of the most enigmatic characters in French history, the Man in the Iron Mask, but the strength of the Great Cipher meant that the message and its remarkable contents would remain undeciphered and unread for two centuries.

The Great Cipher was invented by the father-and-son team of Antoine and Bonaventure Rossignol. Antoine had first come to prominence in 1626 when he was given a coded letter captured from a messenger leaving the besieged city of Réalmont. Before the end of the day he had deciphered the letter, revealing that the Huguenot army which held the city was on the verge of collapse. The French, who had previously been unaware of the Huguenots’ desperate plight, returned the letter accompanied by a decipherment. The Huguenots, who now knew that their enemy would not back down, promptly surrendered. The decipherment had resulted in a painless French victory.

The power of codebreaking became obvious, and the Rossignols were appointed to senior positions in the court. After serving Louis XIII, they then acted as cryptanalysts for Louis XIV, who was so impressed that he moved their offices next to his own apartments so that Rossignol père et fils could play a central role in shaping French diplomatic policy. One of the greatest tributes to their abilities is that the word rossignol became French slang for a device that picks locks, a reflection of their ability to unlock ciphers.

The Rossignols’ prowess at cracking ciphers gave them an insight into how to create a stronger form of encryption, and they invented the so-called Great Cipher. The Great Cipher was so secure that it defied the efforts of all enemy cryptanalysts attempting to steal French secrets. Unfortunately, after the death of both father and son, the Great Cipher fell into disuse and its exact details were rapidly lost, which meant that enciphered papers in the French archives could no longer be read. The Great Cipher was so strong that it even defied the efforts of subsequent generations of codebreakers.

Historians knew that the papers encrypted by the Great Cipher would offer a unique insight into the intrigues of seventeenth-century France, but even by the end of the nineteenth century they were still unable to decipher them. Then, in 1890, Victor Gendron, a military historian researching the campaigns of Louis XIV, unearthed a new series of letters enciphered with the Great Cipher. Unable to make sense of them, he passed them on to Commandant Étienne Bazeries, a distinguished expert in the French Army’s Cryptographic Department. Bazeries viewed the letters as the ultimate challenge, and he spent the next three years of his life attempting to decipher them.

The encrypted pages contained thousands of numbers, but only 587 different ones. It was clear that the Great Cipher was more complicated than a straightforward substitution cipher, because this would require just 26 different numbers, one for each letter. Initially, Bazeries thought that the surplus of numbers represented homophones, and that several numbers represented the same letter. Exploring this avenue took months of painstaking effort, all to no avail. The Great Cipher was not a homophonic cipher.
