Sanger Dideoxy DNA Sequencing: The First Enzymatic Sequencing Method
Great strides have been made in the last three decades to make DNA sequencing a routine and fairly trivial procedure. Today, an entire human genome can be sequenced for around $5000 in about a day. New technologies such as nanopore sequencing are set to bring that price down to $1000 per genome. In the 1990s, such a feat was unthinkable, and even a decade ago it was not nearly so easy or cheap to sequence a genome. The Human Genome Project, completed in 2003, compiled a composite sequence of the first human genome for a whopping $300 million over more than a decade (the entire project cost $2.7 billion, but the raw sequence data was generated for $300 million). The drastic reduction in cost can be linked to the development of so-called next-generation sequencing technologies (future post), but how exactly was DNA sequenced using “first-generation” technologies?
Sanger DNA sequencing (named for its primary developer Frederick Sanger) relies on the chemistry behind DNA chain elongation. When the growing DNA strand is being synthesized, the 3’-OH group of the terminal nucleotide attacks the α phosphate of the incoming deoxynucleoside triphosphate (dNTP), resulting in the release of pyrophosphate (a good leaving group) and incorporation of the nucleotide into the growing strand (shown above middle). If a dideoxynucleotide lacking the 3’-OH group is incorporated instead, however, there is no 3’-OH left to attack the next incoming nucleotide, and the chain is terminated (above right).
In Sanger sequencing, four separate sequencing reactions are carried out in test tubes, each containing template DNA, primers, DNA polymerase, all four normal nucleotides (A, T, G, and C), and a small concentration of one of the four dideoxynucleotides, a different one in each tube. The chain is extended by DNA polymerase and randomly terminated whenever a dideoxynucleotide is incorporated. The radiolabelled DNA is then loaded onto a gel, each reaction in a separate lane, separated by size using denaturing gel electrophoresis, and imaged by autoradiography. Each band on the gel gives the relative length of a strand, and its lane gives the nucleotide in the terminal position, so by reading the bands from bottom to top (shortest to longest), one can determine the sequence of the newly synthesized strand from 5’ to 3’. The image on the far left above shows the sequencing gel from the original paper published by Frederick Sanger in 1977. The downside of this method is that it is labor intensive, and it becomes difficult to resolve single-nucleotide differences in length past about 500 nucleotides, so many sequencing reactions must be carried out to sequence longer stretches of DNA. It also requires radioactive labeling of the DNA. However, this method and the automated fluorescent method developed in 1986 were the primary methods of DNA sequencing for more than 25 years. Tomorrow I’ll talk about automated fluorescent Sanger sequencing and after that I’ll talk about next-generation sequencing.
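To make the four-tube logic concrete, here is a rough Python sketch of the idea (a toy model, not a lab protocol). It assumes the template is written 5’ to 3’, that the primer sits right next to the region shown, and that every fragment length can be resolved on the gel. The function names, the 20% termination chance (dd_fraction), and the 2000 molecules per tube are all made up for illustration.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def run_reaction(template, dd_base, dd_fraction=0.2, molecules=2000, rng=None):
    """Simulate one tube and return the set of terminated fragment lengths.

    dd_fraction is an illustrative chance that the polymerase picks the
    dideoxy version of dd_base instead of the normal nucleotide.
    """
    if rng is None:
        rng = random.Random(0)
    # The new strand is the reverse complement of the template, built 5'->3'.
    new_strand = [COMPLEMENT[base] for base in reversed(template)]
    lengths = set()
    for _ in range(molecules):
        for position, base in enumerate(new_strand, start=1):
            # Only this tube's ddNTP can terminate, and only opposite its partner.
            if base == dd_base and rng.random() < dd_fraction:
                lengths.add(position)  # chain ends here: one band at this length
                break
        # Chains that never pick up a ddNTP run to full length and are ignored.
    return lengths


def read_gel(lanes):
    """Read the bands from shortest (bottom of the gel) to longest (top)."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


def sanger_sequence(template, seed=0):
    """Run the four reactions (one ddNTP each) and read the resulting gel."""
    rng = random.Random(seed)
    lanes = {base: run_reaction(template, base, rng=rng) for base in "ACGT"}
    return read_gel(lanes)


if __name__ == "__main__":
    print(sanger_sequence("GATTACAGGCTA"))
```

With enough molecules per tube, every possible stopping point gets sampled, so reading the lanes from bottom to top should give back the reverse complement of the template (TAGCCTGTAATC for the example above). Dropping molecules to a handful, or pushing dd_fraction close to 1, makes the longer bands go missing, which is the intuition for why only a small concentration of dideoxynucleotide is used.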
Sanger, F.; Nicklen, S.; Coulson, A. R. Proc. Natl. Acad. Sci. U.S.A. 1977, 74, 5463-5467.