
Authors: Simon Singh


The Code Book

First published in France as La Disparition by Editions Denoël in 1969, and in Great Britain by Harvill in 1994. Copyright © by Editions Denoël 1969; in the English translation © Harvill 1994. Reproduced by permission of the Harvill Press.

Appendix B

Some Elementary Tips for Frequency Analysis

(1) Begin by counting up the frequencies of all the letters in the ciphertext. About five of the letters should have a frequency of less than 1 per cent, and these probably represent j, k, q, x and z. One of the letters should have a frequency greater than 10 per cent, and it probably represents e. If the ciphertext does not obey this distribution of frequencies, then consider the possibility that the original message was not written in English. You can identify the language by analyzing the distribution of frequencies in the ciphertext. For example, typically in Italian there are three letters with a frequency greater than 10 per cent, and nine letters have frequencies less than 1 per cent. In German, the letter e has the extraordinarily high frequency of 19 per cent, so any ciphertext containing one letter with such a high frequency is quite possibly German. Once you have identified the language you should use the appropriate table of frequencies for that language for your frequency analysis. It is often possible to unscramble ciphertexts in an unfamiliar language, as long as you have the appropriate frequency table.
(2) If the frequency distribution is consistent with English, but the plaintext does not reveal itself immediately, which is often the case, then focus on pairs of repeated letters. In English the most common repeated letters are ss, ee, tt, ff, ll, mm and oo. If the ciphertext contains any repeated characters, they probably represent one of these pairs.
(3) If the ciphertext contains spaces between words, then try to identify words containing just one, two or three letters. The only one-letter words in English are “a” and “I”. The commonest two-letter words are of, to, in, it, is, be, as, at, so, we, he, by, or, on, do, if, me, my, up, an, go, no, us, am. The most common three-letter words are “the” and “and”.
(4) If possible, tailor the table of frequencies to the message you are trying to decipher. For example, military messages tend to omit pronouns and articles, and the loss of words such as “I”, “he”, “a” and “the” will reduce the frequency of some of the commonest letters. If you know you are tackling a military message, you should use a frequency table generated from other military messages.
(5) One of the most useful skills for a cryptanalyst is the ability to identify words, or even entire phrases, based on experience or sheer guesswork. Al-Khalīl, an early Arabian cryptanalyst, demonstrated this talent when he cracked a Greek ciphertext. He guessed that the ciphertext began with the greeting “In the name of God.” Having established that these letters corresponded to a specific section of ciphertext, he could use them as a crowbar to prize open the rest of the ciphertext. This is known as a crib.
(6) On some occasions the commonest letter in the ciphertext might be E, the next commonest could be T, and so on. In other words, the frequency of letters in the ciphertext already matches those in the frequency table. The E in the ciphertext appears to be a genuine e, and the same seems to be true for all the other letters, yet the ciphertext looks like gibberish. In this case you are faced not with a substitution cipher, but with a transposition cipher. All the letters do represent themselves, but they are in the wrong positions.
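The counting steps in tips (1) to (3) can be sketched in Python. This is a minimal illustration, not a complete cryptanalysis tool: the function names and the sample ciphertext (a simple shift of a short English sentence) are assumptions made for this example.

```python
import re
from collections import Counter

def letter_frequencies(text):
    """Return {letter: percentage of all letters}, most common letter first."""
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    if not letters:
        return {}
    counts = Counter(letters)
    return {ch: 100 * n / len(letters) for ch, n in counts.most_common()}

def doubled_letters(text):
    """Count adjacent repeated letters such as 'ss' or 'ee' (tip 2)."""
    lowered = text.lower()
    return Counter(a + b for a, b in zip(lowered, lowered[1:])
                   if a == b and "a" <= a <= "z")

def short_words(text, max_len=3):
    """Count words of at most max_len letters (tip 3; needs word spaces)."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if len(w) <= max_len)

# Hypothetical sample ciphertext, used only to exercise the functions.
ciphertext = "XLI GEX WEX SR XLI QEX"

for letter, percent in letter_frequencies(ciphertext).items():
    print(f"{letter}: {percent:.1f}%")
print(doubled_letters(ciphertext))
print(short_words(ciphertext).most_common(3))
```

On a message this short the percentages are far from the figures quoted in tip (1); the thresholds only become meaningful for ciphertexts of a few hundred letters or more.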

Cryptanalysis by Helen Fouché Gaines (Dover) is a good introductory text. As well as giving tips, it also contains tables of letter frequencies in different languages, and provides lists of the most common words in English.

Appendix C

The So-called Bible Code

In 1997 The Bible Code by Michael Drosnin made headlines around the world. Drosnin claimed that the Bible contains hidden messages which could be discovered by searching for equidistant letter sequences (EDLSs). An EDLS is found by taking any text, picking a particular starting letter, then jumping forward a set number of letters at a time. So, for example, with this paragraph we could start with the “M” in Michael and jump, say, five spaces at a time. If we noted every fifth letter, we would generate the EDLS mesahirt.…
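The extraction of an EDLS can be sketched in a few lines of Python. The function name and its parameters are assumptions for illustration; the sketch ignores spaces and punctuation and counts letters only, which is how the example above arrives at mesahirt.

```python
def edls(text, start, skip, length=None):
    """Return the equidistant letter sequence taken from text.

    start  -- index of the first letter (counting letters only)
    skip   -- number of letters to jump forward each time
    length -- optional cap on how many letters to return
    """
    letters = [ch.lower() for ch in text if ch.isalpha()]
    sequence = letters[start::skip]
    if length is not None:
        sequence = sequence[:length]
    return "".join(sequence)

# Start at the "M" in Michael and note every fifth letter.
fragment = "Michael Drosnin caused headlines around the world"
print(edls(fragment, start=0, skip=5, length=8))  # mesahirt
```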

Although this particular EDLS does not contain any sensible words, Drosnin described the discovery of an astonishing number of Biblical EDLSs that not only form sensible words, but result in complete sentences. According to Drosnin, these sentences are biblical predictions. For example, he claims to have found references to the assassinations of John F. Kennedy, Robert Kennedy and Anwar Sadat. In one EDLS the name of Newton is mentioned next to gravity, and in another Edison is linked with the lightbulb. Although Drosnin’s book is based on a paper published by Doron Witztum, Eliyahu Rips and Yoav Rosenberg, it is far more ambitious in its claims, and has attracted a great deal of criticism. The main cause of concern is that the text being studied is enormous: in a large enough text, it is hardly surprising that by varying both the starting place and the size of the jump, sensible phrases can be made to appear.

Brendan McKay at the Australian National University tried to demonstrate the weakness of Drosnin’s approach by searching for EDLSs in Moby Dick, and discovered thirteen statements pertaining to assassinations of famous people, including Trotsky, Gandhi and Robert Kennedy. Furthermore, Hebrew texts are bound to be particularly rich in EDLSs, because they are largely devoid of vowels. This means that interpreters can insert vowels as they see fit, which makes it easier to extract predictions.

Appendix D

The Pigpen Cipher

The monoalphabetic substitution cipher persisted through the centuries in various forms. For example, the pigpen cipher was used by Freemasons in the 1700s to keep their records private, and is still used today by schoolchildren. The cipher does not substitute one letter for another; rather, it replaces each letter with a symbol according to the following pattern.

To encrypt a particular letter, find its position in one of the four grids, then sketch that portion of the grid to represent that letter. Hence:

If you know the key, then the pigpen cipher is easy to decipher. If not, then it is easily broken by:
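Because each letter always maps to the same symbol, pigpen behaves like any other monoalphabetic substitution. The sketch below models a symbol as a (grid, cell) pair rather than drawing it. The layout is an assumption made for this example: the alphabet is dealt out in order across four grids, with nine cells in the first and third and four cells in the second and fourth. The function names are likewise hypothetical.

```python
import string

# Assumed layout: grids 0 and 2 hold nine letters each, grids 1 and 3 hold four.
GRID_SIZES = [9, 4, 9, 4]

def build_key():
    """Map each letter a-z to a (grid, cell) pair under the assumed layout."""
    key = {}
    letters = iter(string.ascii_lowercase)
    for grid, size in enumerate(GRID_SIZES):
        for cell in range(size):
            key[next(letters)] = (grid, cell)
    return key

KEY = build_key()
REVERSE_KEY = {symbol: letter for letter, symbol in KEY.items()}

def encrypt(plaintext):
    """Replace every letter with its symbol; other characters are dropped."""
    return [KEY[ch] for ch in plaintext.lower() if ch in KEY]

def decrypt(symbols):
    """Look each symbol up in the reversed key."""
    return "".join(REVERSE_KEY[s] for s in symbols)

symbols = encrypt("pigpen")
print(symbols)
print(decrypt(symbols))
```

Repeated letters produce repeated symbols, so the symbol counts carry the same frequency pattern as the underlying letters.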

Appendix E

The Playfair Cipher

The Playfair cipher was popularized by Lyon Playfair, first Baron Playfair of St. Andrews, but it was invented by Sir Charles Wheatstone, one of the pioneers of the electric telegraph. The two men lived close to each other, either side of Hammersmith Bridge, and they often met to discuss their ideas on cryptography.

The cipher replaces each pair of letters in the plaintext with another pair of letters. In order to encrypt and transmit a message, the sender and receiver must first agree on a keyword. For example, we can use Wheatstone’s own name, CHARLES, as a keyword. Next, before encryption, the letters of the alphabet are written in a 5 × 5 square, beginning with the keyword, and combining the letters I and J into a single element:
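The construction of the square can be sketched as follows. The function name is an assumption; the procedure is the one described above: write the keyword first, skip any letter already used, then fill in the rest of the alphabet with J folded into I.

```python
def playfair_square(keyword):
    """Return the 5 x 5 Playfair square as a list of five 5-letter rows."""
    used = []
    for ch in (keyword + "abcdefghijklmnopqrstuvwxyz").lower():
        if ch == "j":  # I and J share a single cell
            ch = "i"
        if "a" <= ch <= "z" and ch not in used:
            used.append(ch)
    return ["".join(used[row * 5:(row + 1) * 5]).upper() for row in range(5)]

for row in playfair_square("CHARLES"):
    print(" ".join(row))
```

In the printed square the letter I stands for both I and J.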

Next, the message is broken up into pairs of letters, or digraphs. The two letters in any digraph should be different; in the following example this is achieved by inserting an extra x between the double m in hammersmith. An extra x is also added at the end to make a digraph from the single final letter:
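The splitting rule can be sketched in Python. The function name and the sample message are assumptions: the message was chosen only because it contains the double m in hammersmith and an odd final letter. The sketch does not treat the rare case of a doubled x specially.

```python
def digraphs(message, filler="x"):
    """Split message into two-letter groups whose letters always differ."""
    letters = ["i" if ch == "j" else ch
               for ch in message.lower() if "a" <= ch <= "z"]
    pairs = []
    i = 0
    while i < len(letters):
        first = letters[i]
        if i + 1 < len(letters) and letters[i + 1] != first:
            pairs.append(first + letters[i + 1])
            i += 2
        else:
            # Doubled letter, or a single letter left at the end: pad with x.
            pairs.append(first + filler)
            i += 1
    return pairs

# Hypothetical sample message.
print("-".join(digraphs("meet me at hammersmith bridge tonight")))
```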

Encryption can now begin. All the digraphs fall into one of three categories—both letters are in the same row, or the same column, or neither. If both letters are in the same row, then they are replaced by the letter to the immediate right of each one; thus mi becomes NK. If one of the letters is at the end of the row, it is replaced by the letter at the beginning; thus ni becomes GK. If both letters are in the same column, they are replaced by the letter immediately beneath each one; thus ge becomes OG. If one of the letters is at the bottom of the column, then it is replaced by the letter at the top; thus ve becomes CG.

If the letters of the digraph are neither in the same row nor the same column, the encipherer follows a different rule. To encipher the first letter, look along its row until you reach the column containing the second letter; the letter at this intersection then replaces the first letter. To encipher the second letter, look along its row until you reach the column containing the first letter; the letter at this intersection replaces the second letter. Hence, me becomes GD, and et becomes DO. The complete encryption is:
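The three rules can be sketched as a single function. The square below is the one produced by the keyword CHARLES under the construction described earlier; the names SQUARE, POSITION and encrypt_digraph are assumptions for this example, and any J in the input is expected to have been replaced by I beforehand.

```python
# Square for the keyword CHARLES, with I standing for both I and J.
SQUARE = ["CHARL", "ESBDF", "GIKMN", "OPQTU", "VWXYZ"]
POSITION = {ch: (r, c) for r, row in enumerate(SQUARE) for c, ch in enumerate(row)}

def encrypt_digraph(pair):
    """Encrypt one two-letter digraph with the Playfair rules."""
    first, second = pair.upper()
    (r1, c1), (r2, c2) = POSITION[first], POSITION[second]
    if r1 == r2:
        # Same row: take the letter to the right, wrapping to the row start.
        return SQUARE[r1][(c1 + 1) % 5] + SQUARE[r2][(c2 + 1) % 5]
    if c1 == c2:
        # Same column: take the letter beneath, wrapping to the column top.
        return SQUARE[(r1 + 1) % 5][c1] + SQUARE[(r2 + 1) % 5][c2]
    # Otherwise: keep each letter's row and take the other letter's column.
    return SQUARE[r1][c2] + SQUARE[r2][c1]

for pair in ["mi", "ni", "ge", "ve", "me", "et"]:
    print(pair, "becomes", encrypt_digraph(pair))
```

Decryption reverses the first two rules (move left or up instead of right or down), while the third rule is its own inverse.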
