The Journey of Man: A Genetic Odyssey

Authors: Spencer Wells

The evolutionary term for a species like the albatross is panmictic – meaning that each individual has the potential to mate with any other individual in the species. While the albatross may fly over a significant part of the world’s oceans during its lifetime, it doesn’t put down roots anywhere but in its own home town. Humans aren’t like this. When we move, we tend to mate with people living in the new neighbourhood. If we plot the distance between birthplaces of married couples over time, we see that until quite recently – the past hundred years or so – this distance was pretty small. My wife and I were born about as far apart as you can get – Atlanta, Georgia, and Hong Kong – but this would have been virtually unheard of a few generations ago. She would have ended up with someone living on Kowloon or the Mid-Levels, while I would have gotten hitched to a Southern belle.

The effect of this localization of mating habits is to make people living in the same region more similar to each other over time, and to increase the divergence between localities. If you met your third cousin, would you recognize him or her as a relative? If you didn’t, and you hit it off and had a child together, what would that mean? Genetically, it would mean that your son or daughter would have slightly less than two unrelated parents, since you would share some of your genome with your mate. This means that the multiplier in our ancestor calculation would be less than two – providing us with the answer to our mathematical conundrum. Because people historically have tended to choose their mates from those living close by, they have inevitably ended up with someone they are related to – however distantly. This has the effect of making people living in the same region more similar to each other.
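
A minimal Python sketch can make the ancestor arithmetic concrete. The 30-generation span and the multiplier of 1.9 are illustrative assumptions, not figures from the text; the function name `ancestor_count` is likewise invented for the example.

```python
def ancestor_count(generations: int, multiplier: float = 2.0) -> float:
    """Ancestors expected `generations` back when each step multiplies the count by `multiplier`."""
    return multiplier ** generations


GENERATIONS = 30  # assumed span, chosen only for illustration

naive = ancestor_count(GENERATIONS)        # every pair of parents completely unrelated
inbred = ancestor_count(GENERATIONS, 1.9)  # assumed multiplier slightly below two

print(f"multiplier 2.0: {naive:,.0f} ancestors")
print(f"multiplier 1.9: {inbred:,.0f} ancestors")
print(f"ratio: {inbred / naive:.2f}")
```

Under these assumptions, even a small reduction in the multiplier cuts the count thirty generations back to roughly a fifth of the naive doubling figure, and the gap keeps widening with every further generation.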

In some regions, of course, the degree of relatedness is quite high, with first-cousin marriages fairly common – we all have our favourite scapegoats for anecdotes about ‘inbreeding’. But even if the degree of relatedness isn’t high, over time the slight degree of inbreeding that has occurred in all traditional societies will tend to produce a distinctive pattern in the frequency of polymorphisms in that region. So, in the same way that you are uniquely defined by your polymorphisms as being the child of your parents, so too are people from a particular part of the world carrying a genetic signal of their geographic origin. It is these signals that we study as population geneticists – not simply the species unity of our common ancestors, Adam and Eve, shared by all of us, but the additional ‘regional unities’ that make up the patchwork quilt that is modern humanity. As we saw from Dick Lewontin’s analysis, these signals are quite weak – but they are there. The trick is to find the polymorphisms that do unite us into regional groups, and to do this we need to spend a bit more time in the lab.

… nor any drop to drink

Zuckerkandl and Pauling’s insight into diverging molecules as the timekeepers of evolution, and their utility for peering back into the past to see the common ancestor, gave us a clue about how to interpret the mass of mitochondrial data and infer the existence of Eve. Of course, since the Y-chromosome is also free from recombination, the same applies to it. By following the pathway defined by Y polymorphisms, we can reach Adam easily and quickly as well – all we need are the polymorphisms. And here the Y plays a trump card, because until quite recently it looked like there just weren’t that many.

In 1994 Rob Dorit, Hiroshi Akashi and Walter Gilbert (the same person who co-discovered DNA sequencing in the 1970s) published an odd paper in the prestigious scientific journal Science. It was odd not because of what they had found, but because of what they hadn’t. Titled ‘Absence of polymorphism at the ZFY locus on the human Y-chromosome’, it described an analysis of thirty-eight men from around the world as part of a focused effort to discover polymorphisms on their Y-chromosomes. Although a few polymorphisms had been identified on the Y – the first were discovered independently by Myriam Casanova and Gerard Lucotte in 1985 – there were far fewer than were known for any other chromosome. The surprising result of the Dorit survey was that there was no variation on the human Y-chromosome in the region examined. There was not a single DNA sequence variant detected, which implied that all of the men shared a very recent common ancestor. But since there was no variation detected, it was impossible to say when this person may have lived. On the face of it, they all could have had the same father – a Casanova of a man who had sown his oats all over the world. However, owing to the relatively small amount of DNA they studied – around 700 nucleotides in length – and the small number of men, it was also possible that they had simply been unlucky and chosen a region that didn’t vary in those particular Y-chromosomes. For this reason, the estimate of the date of the most recent common ancestor of the men – in other words, Adam – was between 0 and 800,000 years ago. This provided no new insights into human origins and migrations, other than to serve as a deterrent to researchers who wanted to study the population genetics of the Y.

A few polymorphisms did turn up over the next few years, and Michael Hammer of the University of Arizona was able to find enough diversity to place Adam in Africa within the past 200,000 years – confirming the mitochondrial results and, tantalizingly, setting the stage for an ancestral tryst on the veldt. The total number of informative Y polymorphisms was still quite small, however. The time had come for a scaling-up of the search for diversity, and again, the San Francisco Bay area of California was to provide the right setting.

Under pressure

Peter Underhill started his scientific career studying marine biology in California in the late 1960s, ultimately obtaining a PhD from the University of Delaware in 1981. He then returned to California, taking a leap into the emerging field of biotechnology, doing things like designing enzymes for use in molecular biology research. Most importantly, he was absorbing the dizzying array of emerging technologies that geneticists were developing at the time. This was a heady time for the fledgling biotech industry, and the San Francisco area was the epicentre of the revolution promised by recombinant DNA. Cutting and splicing genes became the biological counterpart to the expanding computer industry in Silicon Valley and the surrounding towns.

In 1991, tired of the commercial world, he applied for a position as a research associate in Luca Cavalli-Sforza’s laboratory at Stanford University. After convincing Luca that he would fit into the close-knit and collaborative group, he was hired. Peter started off in the lab by sequencing mtDNA, but he soon became interested in the Y-chromosome. The Cavalli-Sforza laboratory at that time was a very exciting place to be, with a real sense of ‘blazing a new path’ in the field – I count myself lucky to have been a postdoctoral fellow there at the time. New methods of statistical and genetic analysis were being developed almost weekly, and the intellectual climate was impeccable. Nearly all of the major figures in human population genetics spent some time at Stanford during the 1990s – among them students and postdoctoral research fellows such as David Goldstein, Mark Seielstad and Li Jin, all of whom we will encounter later in the book. But it was an analytical chemist, oddly enough, who was to have the greatest impact on our story. To explain why, we need to know a little bit about the molecule that makes up our genome.

One of the main tools in the geneticist’s technical arsenal is the ability to separate fragments of DNA on the basis of size. The DNA inside your cells, like the proteins, is a linear chain of building blocks, arranged in sequence rather like the amino acids that make up a protein. Unlike proteins, however, DNA has only four building blocks, called nucleotide bases: adenine (A), cytosine (C), guanine (G) and thymine (T). The information they encode – the instruction manual to build you – is contained in the particular sequence of these four nucleotides. In the same way that Morse code can convey a huge amount of information with only dots and dashes, so too can DNA encode the biological essence of an organism in the pattern of nucleotides. With 3 billion of them to work with, that’s a lot of data.
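
To put a rough number on "a lot of data", here is a small Python sketch. It assumes the simplest possible encoding, two bits per base (since four symbols need log2(4) = 2 bits), and takes the 3 billion figure from the text; the helper names `encode` and `decode` are hypothetical.

```python
BASES = "ACGT"                  # the four nucleotide bases
BITS_PER_BASE = 2               # four symbols fit in two bits
GENOME_LENGTH = 3_000_000_000   # figure quoted in the text


def encode(sequence: str) -> int:
    """Pack a DNA string into one integer, two bits per base."""
    value = 0
    for base in sequence:
        value = (value << BITS_PER_BASE) | BASES.index(base)
    return value


def decode(value: int, length: int) -> str:
    """Unpack `length` bases from an integer produced by `encode`."""
    bases = []
    for _ in range(length):
        bases.append(BASES[value & 0b11])
        value >>= BITS_PER_BASE
    return "".join(reversed(bases))


total_bits = GENOME_LENGTH * BITS_PER_BASE
total_megabytes = total_bits / 8 / 1_000_000

print(decode(encode("GATTACA"), 7))
print(f"{total_megabytes:.0f} MB")
```

Under this naive encoding the whole sequence comes to about 750 megabytes, before any compression.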

Techniques that separate a mixture of molecules on the basis of their size can actually be used as a method of inferring the sequence of nucleotides in a DNA molecule. This is because biochemical techniques can generate DNA fragments of a particular length based on their sequence. After the fragments are generated, they can be separated by passing them through a gelatine-like matrix in the presence of an electric field. Because DNA is negatively charged, the fragments migrate toward the positively charged end of the matrix – at the molecular level, opposites really do attract. Interestingly, by doing this in a gel matrix the fragments will be retarded in their movement, because they have to navigate through the maze of tiny channels in the gel. The extent to which they are retarded depends on their length – long molecules are retarded to a greater degree than short ones, since they have more material to squeeze through the matrix channels. All very complicated in theory, but it works beautifully in practice. This technique, known as sequencing, is the basis of almost every important genetic discovery that has been made in the past thirty years. The sequencing of the human genome, for instance, involved the application of this technique tens of millions of times – not a terribly exciting task, but effective.
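
How sorting by size can reveal a sequence is easier to see in a toy model. The Python sketch below assumes a chain-termination style reaction, in which every fragment starts at the same point and its final base is known; the text does not name the chemistry, so this is an assumption, and the function names are hypothetical. Real reactions read a synthesized copy of the template and are far messier.

```python
import random


def make_fragments(template: str) -> list[tuple[int, str]]:
    """Toy reaction: one fragment per position, recorded as (length, final base)."""
    return [(i, template[i - 1]) for i in range(1, len(template) + 1)]


def run_gel(fragments: list[tuple[int, str]]) -> list[tuple[int, str]]:
    """Short fragments are retarded least, so order them shortest first."""
    return sorted(fragments, key=lambda fragment: fragment[0])


def read_sequence(fragments: list[tuple[int, str]]) -> str:
    """Read the final base of each fragment in order of increasing length."""
    return "".join(base for _, base in run_gel(fragments))


template = "GATTACA"
mixture = make_fragments(template)
random.Random(0).shuffle(mixture)  # the reaction tube holds fragments in no particular order

print(read_sequence(mixture))
```

The only information the gel supplies is the ordering by length; combined with the identity of each fragment's last base, that ordering is enough to recover the sequence in this simplified model.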

One problem with sequencing is that it is quite slow, and the biochemical reactions that allow you to determine the sequence of the DNA molecule you are studying can be very expensive. For this reason, geneticists try to use quicker and cheaper methods to examine DNA sequences, often looking for differences between a tested individual and one whose sequence has already been determined laboriously by the biochemistry and gel methods. The differences between the DNA sequences are our polymorphisms, and they help to determine individual susceptibility to disease, hair colour (assuming you haven’t modified it) and all of the other inherited differences between people. But most of them have no effect on the person carrying them – they are inherited baggage, markers of your ancestry. These are the markers of greatest interest to anthropologists and historians.

Peter Oefner, our chemist, is a serious, driven Austrian from the Tyrol region near Innsbruck. In the 1990s he was conducting research at Stanford on the separation of DNA molecules using a technique known as High Pressure Liquid Chromatography (HPLC for short). In particular, he was trying to develop a method of identifying the sequence of a DNA molecule using HPLC, which separates molecules much more quickly than gel methods. Peter Underhill saw Oefner’s presentation on the technique at a noontime seminar in the Genetics department. Underhill was immediately struck by its applicability to the problem of finding Y-chromosome polymorphisms, and approached Oefner to ask if he would be interested in collaborating. The pair were soon in a frenzy of work that would see both of them give up their weekends for the next eighteen months.

The partnership between the two Peters would eventually produce a technique known as denaturing HPLC, or dHPLC for short. It makes use of a fortuitous property of DNA molecules: they are double-stranded, paired nucleotide chains held together by a mutual attraction between their constituent nucleotide bases. In the world of DNA, adenine always pairs with thymine, and cytosine always pairs with guanine, owing to the nature of their molecular structure. This means that if you know the sequence of nucleotides in one strand, then you automatically know that of the other strand as well. This has two knock-on effects. First, it stabilizes the DNA molecule, rendering it less susceptible to destruction by enzymes and environmental stress. DNA has been recovered from 50,000-year-old bones, but the single-stranded equivalent also found in our cells, known as RNA, is simply too unstable to last that long. The second benefit of being double-stranded is that it provides a way of backing up the data contained in the nucleotide sequence. If a change (i.e. a mutation) does occur on one strand of the DNA molecule, the mirror-image nucleotide on the opposite strand will no longer pair with it perfectly. There will be a slight ‘kink’ in the strand at this point, due to the mismatched base pairs. The kinks are easily detected by proofreading machinery in the cell, and the damage is repaired.
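
The pairing rules and the idea of a mismatch "kink" can be sketched in a few lines of Python. This is a simplified illustration: it lines the two strands up position by position and ignores the fact that real strands run in opposite directions, and the function names are hypothetical.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}  # adenine-thymine, cytosine-guanine


def complement(strand: str) -> str:
    """Return the partner strand implied by the pairing rules."""
    return "".join(PAIRS[base] for base in strand)


def mismatches(strand: str, partner: str) -> list[int]:
    """Positions (0-based) where the two strands fail to pair, i.e. the 'kinks'."""
    if len(strand) != len(partner):
        raise ValueError("strands must be the same length")
    return [i for i, (a, b) in enumerate(zip(strand, partner)) if PAIRS[a] != b]


original = "GATTACA"
partner = complement(original)
mutated = "GATCACA"  # a single change at position 3 on one strand

print(partner)
print(mismatches(original, partner))
print(mismatches(mutated, partner))
```

Knowing one strand fixes the other, so a change on one strand shows up as a position that no longer pairs with the intact partner.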

The technique of dHPLC uses the incredibly sensitive separation technique of HPLC as a substitute for the cellular proofreading machinery. It does this by passing the mismatched DNA molecules through a matrix that retards their movement based on the structure (but not the length) of the molecule. If there is a kink in the strand, the movement is altered, and the mismatched fragments can be detected by a different pattern of migration. This allows you to scan an entire DNA fragment – hundreds of nucleotides in length – for any differences between it and another DNA fragment of known sequence, quickly and cheaply. A fantastic time-saver and a critical leap forward in our ability to ‘sequence’ our genes.

The medical applications of this fancy bit of physical chemistry seem obvious, and the technique has been applied to determine the genetic mutations at the root of several human diseases. But what does it add to the study of ancient migrations? The answer is that, by applying this technique to the same region of DNA in many individuals, we can detect the genetic differences between them. This allows us to assay the level of genetic diversity in the human species rapidly and efficiently, providing a variety of polymorphisms to study. Before this technique was developed, there were perhaps a dozen polymorphisms identified on the Y-chromosome. At last count there were around 400, and the number is increasing weekly. If Rob Dorit and his colleagues had been able to perform their study of Y diversity with dHPLC, they would have found some variation. As often happens in science, technology has opened up a field to new ways of solving old riddles – often providing startling answers.
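
In computational terms, comparing the same region across many individuals amounts to finding the positions at which aligned sequences differ. The Python sketch below illustrates that idea only; it is not a model of dHPLC itself, and the sample names and sequences are invented for the example.

```python
def polymorphic_sites(sequences: list[str]) -> list[int]:
    """Return the 0-based positions at which aligned sequences are not all identical."""
    if not sequences:
        return []
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, column in enumerate(zip(*sequences)) if len(set(column)) > 1]


# Hypothetical samples of the same short region from four men.
samples = {
    "man_1": "ACGTACGT",
    "man_2": "ACGTACGT",
    "man_3": "ACGAACGT",
    "man_4": "ACGTACGC",
}

print(polymorphic_sites(list(samples.values())))
```

A region with no variation, like the one in the Dorit survey, would simply return an empty list; the more individuals and the longer the region scanned, the more variable positions are likely to turn up.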
