The Information (53 page)

Read The Information Online

Authors: James Gleick

Tags: #Non-Fiction

 
 

If Alice (A) and Bob (B) both say they generated their strings by flipping a coin, no one will ever believe Alice. The strings are surely not equally random. Classical probability theory offers no solid reason for claiming that B is more random than A, because a random process
could
produce either string. Probability is about ensembles, not individual events. Probability theory treats events statistically. It does not like questions in the form “How likely was that to happen?” If it happened, it happened.

To Claude Shannon, these strings would look like messages. He would ask,
How much information
does each string contain? On their face, they both contain fifty bits. A telegraph operator charging by the digit would measure the length of the messages and give Alice and Bob the same bill. Then again, the two messages seem to differ profoundly. Message A immediately becomes boring: once you see the pattern, further repetitions provide no new information. In message B, every bit is as valuable as every other. Shannon’s first formulation of information theory treated messages statistically, as choices from the ensemble of all possible messages—in the case of A and B, 2
50
of them. But Shannon also considered redundancy within a message: the pattern, the regularity, the order that makes a message compressible. The more regularity in a message, the more predictable it is. The more predictable, the more redundant. The more redundant a message is, the less information it contains.

The telegraph operator sending message A has a shortcut: he can transmit something like “Repeat ‘01’ twenty-five times.” For longer messages with easy patterns, the savings in keystrokes becomes enormous. Once the pattern is clear, the extra characters are free. The operator for message B must soldier on the hard way, sending every character, because every character is a complete surprise; every character costs one bit. This pair of questions—
how random
and
how much information
—turn out to be one and the same. They have a single answer.

Chaitin was not thinking about telegraphs. The device he could not get out of his head was the Turing machine—that impossibly elegant abstraction, marching back and forth along its infinite paper tape, reading and writing symbols. Free from all the real world’s messiness, free
from creaking wheel-work and finical electricity, free from any need for speed, the Turing machine was the ideal computer. Von Neumann, too, had kept coming back to Turing machines. They were the ever-handy lab mice of computer theory. Turing’s
U
had a transcendent power: a universal Turing machine can simulate any other digital computer, so computer scientists can disregard the messy details of any particular make or model. This is liberating.

Claude Shannon, having moved from Bell Labs to MIT, reanalyzed the Turing machine in 1956. He stripped it down to the smallest possible skeleton, proving that the universal computer could be constructed with just two internal states, or with just two symbols, 0 and 1, or blank and nonblank. He wrote his proof in words more pragmatic than mathematical: he described exactly how the two-state Turing machine would step left and right, “bouncing” back and forth to keep track of the larger numbers of states in a more complex computer. It was all very intricate and specific, redolent of Babbage. For example:

When the reading head moves, the state information must be transferred to the next cell of the tape to be visited using only two internal states in machine B. If the next state in machine A is to be (say) state 17 (according to some arbitrary numbering system) this is transferred in machine B by “bouncing” the reading head back and forth between the old cell and the new one 17 times (actually 18 trips to the new cell and 17 back to the old one).

 
 

The “bouncing operation” carries the information from cell to cell, and the cells act as “transmitters” and “controllers.”

Turing had titled his great paper “On Computable Numbers,” but of course the real focus was on
un
computable numbers. Could uncomputable numbers and random numbers be related? In 1965 Chaitin was an undergraduate at the City College of New York, writing up a discovery he hoped to submit to a journal; it would be his first publication. He began, “In this paper the Turing machine is regarded as a general purpose
computer and some practical questions are asked about programming it.” Chaitin, as a high-school student in the Columbia Science Honors Program, had the opportunity to practice programming in machine language on giant IBM mainframes, using decks of punched cards—one card for each line of a program. He would leave his card deck in the computer center and come back the next day for the program’s output. He could run Turing machines in his head, too:
write 0, write 1, write blank, shift tape left, shift tape right.
… The universal computer gave him a nice way to distinguish between numbers like Alice and Bob’s A and B. He could write a program to make a Turing machine print out “010101 …” a million times, and he could write down the length of that program—quite short. But given a million random digits—no pattern, no regularity, nothing special at all—there could be no shortcut. The computer program would have to incorporate the entire number. To make the IBM mainframe print out those million digits, he would have to put the whole million digits into the punched cards. To make the Turing machine do it, he would still need the million digits for input.

Here is another number (in decimal this time):

C: 3.141
592
653
589
793
238
462
643
383
279
502
884
197
169
399

Other books

Rose of Betrayal by Elizabeth Lowe
The Piper by Lynn Hightower
Beck: Hollywood Hitman by Maggie Marr
Dandelion Clocks by Rebecca Westcott
Save the Last Vamp for Me by Gayla Drummond
Sleeper Cell by Alan Porter