Read Mathematics and the Real World Online
Authors: Zvi Artstein
Years before the KSA system was published, the mathematician Michael Rabin of the Hebrew University of Jerusalem proposed an algorithm whose result was a declaration that either a number was not prime, or it was prime with a probability that could be predetermined. The inference of the approximation proposed by Rabin is of the type that Archimedes proposed for approximate calculations. That is to say, you, the consumer, choose the level of probability you would like, and the computer will give you the answer at that level of probability. The computation will take longer the smaller you want the probability of error to be. The system is practical, however, also for imaginary levels of probability. For example, you can determine that a number shall not be declared prime if the probability that it is not prime is greater than
to the power of 200. As we have noted, the human brain has difficulty in imagining an occurrence with such a low probability, and in everyday decisions people ignore such events. So if the computer tells you that a number is prime unless something with a probability of less than
to the power 200 happens, for all purposes it is reasonable to accept that the number is prime. The system is not applicable if someone wants absolute mathematical certainty. But in that case there will generally be no reasonable way of providing a solution at all.
Rabin was the winner of the Turing Award in 1976 for his work on automata theory and later became the pioneer of probabilistic algorithms. He based his primeness algorithm on one proposed in 1976 by Gary Miller of the University of California, Berkeley, that checks deterministically
whether a number is prime. The algorithm is known as the Miller-Rabin algorithm. For those interested, we will describe one instance of it in detail and then briefly explain the general case (skipping the paragraph will not detract from the clarity of the text that follows).
Assume we want to discover whether a number of the form
n
= 2
m
+ 1 is prime, where
m
is not an even number. A well-known theorem in number theory states that for every integer
k
between 1 and
n
, the following holds. If
n
is prime, then either the remainder when
k
m
is divided by
n
is 1, or the remainder when
k
2
m
is divided by
n
is
n
– 1. Hence, if we choose such
k
and the two remainders above are not upheld, we are certain that
n
is not prime. It could happen, however, that the number is not prime and both the above requirements are fulfilled, so that the fact that the requirements are satisfied does not enable the conclusion to be drawn that the number is prime. In such a situation another well-known mathematical result is used, and that is if the number is not prime, at least half of the numbers between 1 and
n
will not fulfill the remainder conditions. Thus, if
k
is chosen randomly and the two conditions are satisfied, this means that the probability that the number is not prime is less than a half. If we randomly choose another number between 1 and
n
and again the two conditions are satisfied, the probability that the number
n
is not prime reduces to a quarter. If this is repeated 200 times and the two remainder conditions are fulfilled each time, the probability that
n
is not prime is less than
to the power 200. Thus, in no more than 200 trials, we either know with certainty that the number is not prime, or we declare that the number is prime, with the chance of an error no more than
to the power 200. Performing two hundred trials is trivial for a computer. The general case is somewhat more complex, but not very much so. Every odd number can be written in the form
n
= 2
a
m
+ 1, where
m
is an odd number. The mathematical theorem mentioned above states in the general case that for every integer
k
between 1 and
n
, either the remainder when
k
m
is divided by
n
is 1, or the remainder when
k
to the power 2
b
m
is divided by
n
is
n
– 1 for at least one number
b
between 1 and
a
. Hence the two remainder conditions in the previous case become
a
+ 1 conditions in the general case. This too is a simple operation for a computer to perform.
Probabilistic algorithms led to another interesting conceptual development, and that is the integration of interactive proofs and zero-knowledge proofs. The idea was proposed and developed in an article written in 1982 by Shafi Goldwasser, Silvio Micali, and Charles Rackoff, all then at the Massachusetts Institute of Technology (MIT). Goldwasser, a faculty member of the Weizmann Institute, and Micali were jointly awarded the Turing Award in 2012. Also here the goal is reached subject to a probabilistic error, and here too the probability of error can be extremely small, and the estimate of the
error is an absolute estimate defined mathematically. We will learn about the method from an example.
The United States is divided into about 3,000 counties (the exact number changes). Assume that you know how to color the map of the United States using only three colors so that every county will be in one color, and no two counties with a common border (not just a corner) will be in the same color. Assume further that you want to prove that you are capable of coloring the map of the United States in this way, but without giving the least hint of how you are going to do it. Can it be done? Even if you show the colors of two nonadjacent counties, you have divulged some information, because we will then know if you have colored them with the same color or different colors. Here is the interactive method we referred to. We decide on the three colors you are to use, say yellow, green, and red. You color the map of the United States but do not show us the result. We select two neighboring counties, say Los Angeles and San Bernardino. You show us the colors in which you have colored them, say yellow and green. As they are not the same color, we have not caught you in an error. This just means that we did not catch you coloring two contiguous counties with the same color, yet you still may not be able to color the whole map in accordance with the conditions. If, however, we chose the two adjacent counties randomly, we obtained a certain measure of confidence that you can in fact color the map as required, because the probability that we will find an error in the two counties you showed us (that is, they are in the same color) is greater than say one in 20,000 (as there are not more than 20,000 pairs of counties with common borders). At the same time, you have not revealed anything about the way you colored the map, as we knew in advance that two neighboring counties will be in different colors. In the second stage, you repeat the coloring of the map, but you change the colors. Again we randomly select two adjacent counties, and you show us the colors you have used for them. If they are not the same color, the degree of our confidence in your ability to carry out the task as required will rise. Yet again we have learned nothing about the method you used to color the map, as the colors were changed. For example, even if Los Angeles was drawn by chance again, this time you may have colored it red.
We can repeat this procedure many times until we will have as much confidence as we want in your ability to complete the job properly, and you still will have divulged nothing about how you do it.
Let us review another example. You wish to convince me that you have completed the sudoku shown above, without giving me any indication of the solution. For each digit from 1 to 9 you substitute a color, without telling me what color represents what digit, and without showing me the completed, colored sudoku. I select one of the nine columns or one of the nine rows or one of the 3 × 3 sub-squares at random, and you then show me your colored version of the column, row, or square I have chosen. If you have not completed the sudoku properly, I may not see nine different colors in my selection, and that will have a probability of at least
. Repeating this procedure two hundred times, with your choosing different colors each time and with me seeing nine different colors in the column, row, or square I select, will lead me to the conclusion that the probability that you did not complete the sudoku properly is less than
to the power of 200. Yet the solution is not complete. You have to convince me that in the coloring process you did not change the numbers revealed in the sudoku at the outset. That can be achieved if at any stage, at random, I can ask that you show me the numbers represented by the colors in the squares where the revealed numbers were situated. If the numbers match that constraint, the chance that you cheated remains as low as it was.