Regenesis (45 page)

Authors: George M. Church

Hapgood, Fred. Up the Infinite Corridor: MIT and the Technical Imagination. Cambridge, MA: Perseus, 1993.

Knight, Tom. Email to Ed Regis, September 10, 2010.

___. “Idempotent Vector Design for Standard Assembly of Biobricks.” MIT Artificial Intelligence Laboratory, 2003. web.mit.edu/synbio/release/docs/biobricks.pdf.

Levskaya, A. “Synthetic Biology: Engineering Escherichia coli to See Light.” Nature, November 24, 2005, 441–442.

Mooallem, Jon. “Do-It-Yourself Genetic Engineering.” New York Times Magazine, February 10, 2010.

Morton, Oliver. “Life, Reinvented.” Wired 13 (2005).

Surowiecki, James. “Turn of the Century.” Wired 10 (2002).

Web

2010.igem.org/Team:The_Citadel-Charleston/TeamPage

2010.igem.org/Main_Page

2010.igem.org/Team:Hong_Kong-CUHK

biobricks.org and partsregistry.org

edge.org/3rd_culture/endy08/endy08_index.html

CHAPTER 9

Angrist, Misha. Here Is a Human Being: At the Dawn of Personal Genomics. New York: HarperCollins, 2010.

Church, George M. “Genomes for All.” Scientific American, January 2006.

Darcé, Keith. “Study: Genetic Tests Do Little to Change Habits.” San Diego Union-Tribune, January 17, 2011.

Hall, Stephen S. “The Genome's Dark Matter.” Technology Review, January-February 2011.

Harmon, Amy. “My Genome, Myself: Seeking Clues in DNA.” New York Times, November 17, 2007.

Judson, Olivia. “The Human Phenome Project.” New York Times, June 8, 2010.

Kolata, Gina. “Staphs Trail Points to Human Susceptibilities.” New York Times, December 15, 2010.

Lunshof, J. E., Bobe, J., Aach, J., Angrist, M., Thakuria, J. V., Vorhaus, D. B., Hoehe, M. R., Church, G. M. “Personal genomes in progress: from the human genome project to the personal genome project.” Dialogues Clin Neurosci. 2010; 12(1):47–60.

Pinker, Steven. “My Genome, My Self.” New York Times, January 11, 2009.

Shannon, C. E. “A Mathematical Theory of Communication.” Bell System Technical Journal 27 (1948): 379–423, 623–656.

EPILOGUE

Bailey, Ronald. “The Case for Enhancing People.” New Atlantis, Summer 2011.

Carlson, Robert H. Biology Is Technology. Cambridge: Harvard University Press, 2010.

Church, George. “Let Us Go Forth and Safely Multiply.” Nature, November 24, 2005, 423.

_________. “A Synthetic Biohazard Nonproliferation Proposal.” 2004. arep.med.harvard.edu/SBP/Church_Biohazard04c.htm.

Crichton, Michael. The Andromeda Strain. New York: Knopf, 1969.

Donnell, David. Email to Ed Regis, October 1, 2010.

Enriquez, Juan, and Steve Gullans. Homo Evolutis. Ted Books, 2011.

Fleming, Diane O., and Debra Long Hunt, eds. Biological Safety: Principles and Practices. American Society for Microbiology, 2006.

Fukuyama, Francis. “Transhumanism.” Foreign Policy, September 1, 2004.

“Garage Biology.” Nature, October 7, 2010, 634.

Garfinkel, Michele S., et al. “Synthetic Genomics: Options for Governance.” J. Craig Venter Institute/Center for Strategic and International Studies/MIT, October, 2007.

Gentry, Eri. Email to Ed Regis, October 28, 2010.

Ledford, Heidi. “Life Hackers.” Nature, October 7, 2010, 650–652.

Pais, Abraham. Inward Bound: Of Matter and Forces in the Physical World. Oxford: Clarendon, 1986.

Parker, E. S., et al. “A Case of Unusual Autobiographical Remembering.” Neurocase, February 2006, 35–49.

Popkin, Jim. “Authorities in Awe of Drug Runners' Jungle-Built, Kevlar-Coated Supersubs.” Wired, April 2011.

Price, Jill. The Woman Who Can't Forget. New York: Free Press, 2008.

Regis, Ed. Great Mambo Chicken and the Transhuman Condition. Reading, MA: Addison-Wesley, 1990.

Schmidt, Markus. “Diffusion of Synthetic Biology: A Challenge to Biosafety.” Systems and Synthetic Biology 2 (2008): 1–6. doi:10.1007/s11693-008-9018-z.

Specter, Michael. “A Life of Its Own: The Future of Synthetic Biology.” New Yorker, September 28, 2009, 56–65.

Taleb, Nassim Nicholas. The Black Swan: The Impact of the Highly Improbable. New York: Random House, 2007.

Tucker, Jonathan B., and Raymond A. Zilinskas. “The Promise and Perils of Synthetic Biology.” New Atlantis, Spring 2006, 25–45.

Wolfram, Stephen. A New Kind of Science. Champaign, IL: Wolfram Media, 2002.

Web

biocurious.org

biohack.sourceforge.net

diybio.org

ILLUSTRATION SOURCES

NOTES

Bancroft, C., et al. “Long-term Storage of Information in DNA.” Science 293 (2001): 1763–1765.

Church, G. M., Y. Gao, and S. Sri Kosuri. “Next-Generation Digital Storage in DNA.” Submitted 2012.

Davis, J. “Microvenus.” Art Journal 55, no. 1 (1996). www.jstor.org/pss/777811.

_____________. “Romance, Supercodes, and the Milky Way DNA.” Ars Electronica 2000.

FIGURE CREDITS

Figure 1.1 Hands Make Anti-hands... Sculpture. George M. Church (GMC) and Marie Wu.

Figure 1.2 Handedness. From Hands Make Anti-hands. Rasmol image from coordinates of an amino acid. umass.edu/microbio/rasmol.

Figure 1.3 Chiral crystals. en.wikipedia.org/wiki/File:Pcrystals.svg.

Figure 1.4 Three base pairs. GMC.

Figure 2.1 Liposome and protein pore. GMC, based on RCSB 7AHL.pdb coordinates and users.humboldt.edu/rpaselk/C438.S11/C438Notes/C438nLec06.htm.

Figure 3.1 Transfer RNA. GMC, combining Kim et al., “The General Structure of Transfer RNA Molecules,” PNAS 71 (1974): 4970–4974 (Figure 2); and Xiao et al., “Structural Basis of Specific tRNA Aminoacylation by a Small In Vitro Selected Ribozyme,” Nature 454 (2008): 358–362.

Figure 3.2 Genetic code clock. GMC, inspired by http://en.wikipedia.org/wiki/File:GeneticCode21-version-2.svg.

Figure 3.3 MAGE and CAGE. GMC, adapted from Isaacs et al., “Precise Manipulation of Chromosomes In Vivo Enables Genome-wide Codon Replacement,” Science 333 (2011): 348–353.

Figure 3.4 DNA clock bondage. GMC, inspired by Salvador Dali's Persistence of Memory (1931).

Figure 3.5 DNA log pile. GMC, adapted from Shawn M. Douglas, et al., “Rapid Prototyping of 3D DNA-origami Shapes with caDNAno,” Nucleic Acids Research, August 2009, doi:10.1093/nar/gkp436.

Figure 5.1 Antibody. en.wikipedia.org/wiki/Antibody.

Figure 5.2 Six Arg codons. See Figure 3.2.

Figure 5.3 Six Leu codons. See Figure 3.2.

Figure 5.4 The smallest viral genome. GMC, using NCBI NC_001417.

Figure 6.1 Frozen mammoth. en.wikipedia.org/wiki/File:Jeune_mammouth_IRSNB.JPG.

Figure 7.1 Exponentials. GMC; see also Carr and Church, “Genome Engineering,” Nature Biotechnology 27 (2009): 1151–1162.

Figure 8.1 Biobrick assembly. GMC, inspired by partsregistry.org/Assembly:Standard_assembly.

Figure 8.2 Hello World. openwetware.org/wiki/IGEM:MIT/Sponsorship or en.wikipedia.org/wiki/File:UT_HelloWorld.jpg, from the UT Austin/UCSF iGEM team.

Figure Epilogue Prohibition plot. ER adapted from Kevin Kelly, What Technology Wants (New York: Viking, 2010), p. 241. Used by permission.

Cover Based on The Creation of the World, by Eustache Le Sueur (1617–1655), Musee des Beaux-Arts, Tourcoing, France.

NOTES: ON ENCODING THIS BOOK INTO DNA

As I explain in greater detail below, to test DNA as a super-compact storage system, this book was encoded into DNA, and the resulting sequence was amplified until 40 billion copies of the DNA book had been produced. In what follows I discuss some of the legal, policy, biosafety, and other issues and opportunities pertaining to this process.

First, we'd like a compact, platform-independent file format that allows images and is robust to errors (mutations). Word processing formats are not really standardized. PDF and many compression formats are brittle: a small file error can corrupt the whole document. HTML is easily human-readable and has an option for inline images (e.g., encoded in base-64 format).
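As an illustration of the inline-image option, here is a minimal Python sketch that wraps image bytes in an HTML tag using a base-64 data URI. The function name and the placeholder bytes are hypothetical and are not taken from the book's actual HTML file.

```python
import base64


def inline_image_tag(image_bytes, mime_type="image/jpeg"):
    """Return an <img> tag whose pixels travel inside the HTML text itself."""
    # Base-64 turns arbitrary bytes into plain ASCII characters.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f'<img src="data:{mime_type};base64,{encoded}">'


# Placeholder bytes for illustration only; a real file would be read from disk.
fake_jpeg = b"\xff\xd8\xff\xe0" + b"placeholder bytes, not a real image"
tag = inline_image_tag(fake_jpeg)
print(tag[:32])
```

Because the image becomes ordinary text, the whole document remains a single file that can be split into segments like any other text.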

Second, we consider unintended consequences and the ethical, legal, social, policy, security, and safety issues, and we consult diverse thought leaders.

(A) DNA could be a form of cryptography, and hence could be subject to legal restrictions and import/export regulations. The DNA could encode computer viruses (I embedded a mouse tracking program to symbolize a spyware threat), or the computer code could contain human viruses (see item C), or the whole thing could go viral in a social networking sense.

(B) NIH guideline: “If the synthetic DNA segment is not expressed in vivo as a biologically active polynucleotide or polypeptide product, it is exempt from the NIH Guidelines.” Even though these 159 bp long fragments are unlikely to replicate on their own or encode anything biologically active, if released into the wild they could get incorporated into a living organism. EPA, FDA, OSHA, USDA, and DHS guidelines might play out similarly to scenario A above.

(C) Certain apparently innocuous digital documents (e.g., images), once converted to DNA by the methods described here, could result in infectious DNA molecules, so rules governing DNA surveillance could apply (e.g., “Screening Framework Guidance for Providers of Synthetic Double-Stranded DNA,” published by DHHS in the Federal Register, November 2009). This document's HTML code, when converted to DNA form, could (but doesn't) encode parts of one or more select agents, which would have set off alarms at the facility manufacturing the DNA (unless prior explicit justification is on file). However, the current guidelines are only recommendations and apply only to sequences “longer than 200 base pairs (bps)” or “66 amino acid sequences.”

(D) Incorporating DNA into our daily lives and at large scale could make us more habituated or apathetic with respect to safety and security. Yet constant exposure to cars has brought about better safety engineering (shoulder harnesses, airbags, infant carriers, etc.). Ditto for child- and tamper-resistant drug bottles, street lighting, and so on.

Third, how much does such printing cost? We print the initial DNA onto a 2-inch DNA chip for about $1,000. After sequencing to see if any sequences are underrepresented, an additional chip can be made, using the redundancy of the coding to make those segments in many ways. From the combined original I made copies via polymerase chain reaction (PCR). Fifty dollars is enough to make 40 billion copies of the book, which, if printed onto the first 20,000 book jackets at 200,000 (barely visible) dots per cover (each dot containing 10 copies of the book), would exceed the combined number of copies of the top 150 printed volumes of all time, including A Tale of Two Cities, Le Petit Prince, Hong lou meng, the Bible, the Qur'an, Webster's Dictionary, Xinhua Zidian, Boy Scout Manual, Guinness Book of World Records, Don Quixote, and the full works of Tolkien, J.K. Rowling, Mao, Agatha Christie, and Shakespeare. The point is simply that this printing method is inexpensive and 100 million times more compact than Blu-ray disc data.
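The 40 billion figure follows directly from the jacket numbers quoted above. A short Python check of that arithmetic (the variable names are ours):

```python
jackets = 20_000            # first book jackets printed
dots_per_jacket = 200_000   # barely visible dots per cover
copies_per_dot = 10         # copies of the book in each dot

total_copies = jackets * dots_per_jacket * copies_per_dot
print(f"{total_copies:,}")  # 40,000,000,000
```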

Fourth, we need to handle various issues with DNA synthesis and amplification, for example, underrepresentation of sequences with extremely high or low G+C content, inverted repeats, and runs of bases (problematic for synthesis and/or sequencing, e.g., GGGG . . . ). We want to minimize missing sections of DNA and, if some are missing, have the ability to order the rest. The early draft of this book that I encoded consisted of 53,418 words, 11 JPG images, and 1 JavaScript program. The 644 Kbytes fit into 9 M base pairs in 91,401 DNA segments, which I synthesized at our Agilent facility and amplified using PCR. Each of these DNA segments is 96 bp long (encoding 12 bytes). The segment length is set by practical limits of oligo synthesis (Chapter 8). Each oligo begins on the left with an amplification/sequencing 22-mer primer and then a segment number L bits long. For this book, we chose L = 19 (allowing numbering up to 524,288 oligos = 6 Mbytes of text). Next is the payload 96-mer, then finally a 22-mer primer on the far right. So the full oligos are 22 + 19 + 96 + 22 = 159 bp long.

The use of segment numbers means that we don't depend on the overlapping sequences typical of genome assembly, a very problematic practice due to sequence repeats. In principle, the segment number could have a synthesis and/or sequencing error, but we require that the (consensus) sequence read between the primers be the correct length (115 bp) and that each 115-bit sequence be observed multiple times to be taken seriously. We can have several encodings of each segment to minimize the impact of any particular encoding that might be intrinsically hard to synthesize, amplify, or sequence.

To enable multiple encodings, the zero bit can be either A or C and the one bit G or T. This can be done randomly or in a manner that minimizes problematic sequences. In the 2-bit/bp code described in Chapter 8, we observe (and underline) the unfortunate CCCCCCC four times in the example below, the inevitable consequence of the last two letters of the word OUT (0101010101010100 = ASCII “UT”). But these same three instances of those 16 bits were encoded as aTcGaGcTcTcGaGac, cGaGaGaGaTcTaTac, and aGcTcTaGaTcTaTac in the 1-bit/bp degenerate code (lowercase a or c for zero, uppercase G or T for one). DNA supercoding (1.5 bits/bp) from Joe Davis in 2000 was a step in a similar direction, enabling a degenerate base-20 encoding with triplets and exceptions; nevertheless, a 3,867 bp DNA, encoding a photo of the Milky Way, still had five instances of CCCCCCC. In 2004 his DNA manifolds pointed to a potential way to stabilize coded messages in living cells by embedding them in natural protein coding regions at 0.3-bit/bp. We chose the 1-bit/bp code for this book.
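The 1-bit/bp scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program actually used for the book: the run limit MAX_RUN, the function names, and the rule for breaking runs are ours, and the two flanking primers are left out because their sequences are not given here.

```python
import random

ZERO_BASES = "AC"    # a 0 bit may be written as A or C (from the text)
ONE_BASES = "GT"     # a 1 bit may be written as G or T (from the text)
ADDRESS_BITS = 19    # segment number length L (from the text)
PAYLOAD_BITS = 96    # 12 bytes of data per segment (from the text)
MAX_RUN = 3          # assumption: longest run of one base we tolerate


def bits_to_dna(bits, rng):
    """Write each bit as one base, switching bases to break long runs."""
    bases = []
    for bit in bits:
        pair = ZERO_BASES if bit == "0" else ONE_BASES
        base = rng.choice(pair)
        # If the last MAX_RUN bases already equal this base, use its partner.
        if len(bases) >= MAX_RUN and all(b == base for b in bases[-MAX_RUN:]):
            base = pair[1] if base == pair[0] else pair[0]
        bases.append(base)
    return "".join(bases)


def dna_to_bits(dna):
    """Read bases back as bits; case is ignored."""
    bits = []
    for base in dna.upper():
        if base in ZERO_BASES:
            bits.append("0")
        elif base in ONE_BASES:
            bits.append("1")
        else:
            raise ValueError(f"unexpected base: {base!r}")
    return "".join(bits)


def encode_segment(index, payload, rng):
    """Return the 115 bases (segment number + payload) between the primers."""
    if len(payload) != PAYLOAD_BITS // 8:
        raise ValueError("payload must be exactly 12 bytes")
    if not 0 <= index < 2 ** ADDRESS_BITS:
        raise ValueError("segment number does not fit in 19 bits")
    address = f"{index:0{ADDRESS_BITS}b}"
    data = "".join(f"{byte:08b}" for byte in payload)
    return bits_to_dna(address + data, rng)


def decode_segment(dna):
    """Recover (segment number, payload bytes) from a 115-base read."""
    bits = dna_to_bits(dna)
    if len(bits) != ADDRESS_BITS + PAYLOAD_BITS:
        raise ValueError("read between the primers must be 115 bases long")
    index = int(bits[:ADDRESS_BITS], 2)
    payload = int(bits[ADDRESS_BITS:], 2).to_bytes(PAYLOAD_BITS // 8, "big")
    return index, payload


rng = random.Random(2012)  # fixed seed so the sketch is repeatable
oligo_body = encode_segment(7, b"OUT of sight", rng)
index, payload = decode_segment(oligo_body)
print(len(oligo_body), index, payload)   # 115 7 b'OUT of sight'
print(dna_to_bits("aTcGaGcTcTcGaGac"))   # 0101010101010100
print(2 ** ADDRESS_BITS * 12)            # 6291456 bytes, about 6 Mbytes
```

Because each bit has two possible bases, the same 115 bits can be written in many different ways, which is what allows runs such as CCCCCCC to be avoided and hard-to-make segments to be re-ordered in an alternative encoding.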
