Julianne Zedalis; John Eggebrecht

14.2 DNA Structure and Sequencing

Learning Objectives

In this section, you will explore the following questions:

What is the molecular structure of DNA?
What is the Sanger method of DNA sequencing? What is an application of DNA sequencing?
What are the similarities and differences between eukaryotic and prokaryotic DNA?

Connection for AP® Courses

The currently accepted model of the structure of DNA was proposed in 1953 by Watson and Crick, who made their model after seeing a photograph of DNA that Franklin had taken using X-ray crystallography. The photo showed the molecule’s double-helix shape and dimensions. The two strands that make up the double helix are complementary and anti-parallel in nature. That is, one strand runs in the 5' to 3' direction, whereas the complementary strand runs in the 3' to 5' direction. (The significance of directionality will be important when we explore how DNA copies itself.) DNA is a polymer of nucleotides that consists of deoxyribose sugar, a phosphate group, and one of four nitrogenous bases—A, T, C, and G—with a purine always pairing with a pyrimidine (as Chargaff found). The genetic “language” of DNA is found in sequences of the nucleotides. During cell division each daughter cell receives a copy of DNA in a process called replication. In the years since the discovery of the structure of DNA, many technologies, including DNA sequencing, have been developed that enable us to better understand DNA and its role in our genomes.

Information presented and the examples highlighted in the section support concepts outlined in Big Idea 3 of the AP® Biology Curriculum Framework. The Learning Objectives listed in the Curriculum Framework provide a transparent foundation for the AP® Biology course, an inquiry-based laboratory experience, instructional activities, and AP® exam questions. A Learning Objective merges required content with one or more of the seven science practices.

Big Idea 3	Living systems store, retrieve, transmit and respond to information essential to life processes.
Enduring Understanding 3.A	Heritable information provides for continuity of life.
Essential Knowledge	3.A.1 DNA, and in some cases RNA, is the primary source of heritable information.
Science Practice	6.5 The student can evaluate alternative scientific explanations.
Learning Objective	3.1 The student is able to construct scientific explanations that use the structures and mechanisms of DNA to support the claim that DNA is the primary source of heritable information.
Essential Knowledge	3.A.1 DNA, and in some cases RNA, is the primary source of heritable information.
Science Practice	4.1 The student can justify the selection of the kind of data needed to answer a particular scientific question.
Learning Objective	3.2 The student is able to justify the selection of data from historical investigations that support the claim that DNA is the source of heritable information.
Essential Knowledge	3.A.1 DNA, and in some cases RNA, is the primary source of heritable information.
Science Practice	6.4 The student can make claims and predictions about natural phenomena based on scientific theories and models.
Learning Objective	3.5 The student can justify the claim that humans can manipulate heritable information by identifying at least two commonly used technologies.

Teacher Support

Franklin’s X-ray diffraction pictures helped lead to the discovery of the structure of DNA, but Watson and Crick did not mention Franklin in their seminal 1953 paper, which can be found here. This paper includes annotations that help place the work in historical context. Students might be interested to learn how Watson and Crick discovered the structure of DNA. Details can be found at this PBS website. If possible, find a copy of the announcement of the discovery as it appeared in The New York Times. The wording is interesting and the significance of the discovery is understated.

The Science Practice Challenge Questions contain additional test questions for this section that will help you prepare for the AP exam. These questions address the following standards:
[APLO 3.3][APLO 3.5][APLO 3.13]

The building blocks of DNA are nucleotides. Each nucleotide has three components: a nitrogenous base, deoxyribose (a 5-carbon sugar), and a phosphate group (Figure 14.5). The nucleotide is named according to its nitrogenous base. The nitrogenous base can be a purine, such as adenine (A) or guanine (G), or a pyrimidine, such as cytosine (C) or thymine (T).

Illustration depicts the structure of a nucleoside, which is made up of a pentose with a nitrogenous base attached at the 1 prime position. There are two kinds of nitrogenous bases: pyrimidines, which have one six-membered ring, and purines, which have a six-membered ring fused to a five-membered ring. Cytosine, thymine, and uracil are pyrimidines, and adenine and guanine are purines. A nucleoside with one phosphate attached at the 5 prime position is called a nucleoside monophosphate. A nucleoside with two or three phosphates attached is called a nucleoside diphosphate or nucleoside triphosphate, respectively.

Figure 14.5 Each nucleotide is made up of a sugar, a phosphate group, and a nitrogenous base. The sugar is deoxyribose in DNA and ribose in RNA.

The nucleotides combine with each other by covalent bonds known as phosphodiester bonds or linkages. The purines have a double ring structure with a six-membered ring fused to a five-membered ring. Pyrimidines are smaller in size; they have a single six-membered ring structure. The carbon atoms of the five-carbon sugar are numbered 1', 2', 3', 4', and 5' (1' is read as “one prime”). The phosphate residue is attached to the hydroxyl group of the 5' carbon of one sugar of one nucleotide and the hydroxyl group of the 3' carbon of the sugar of the next nucleotide, thereby forming a 5'-3' phosphodiester bond.

In the 1950s, Francis Crick and James Watson worked together to determine the structure of DNA at the University of Cambridge, England. Other scientists, such as Linus Pauling and Maurice Wilkins, were also actively exploring this field. Pauling had discovered the secondary structure of proteins using X-ray crystallography. In Wilkins’ lab, researcher Rosalind Franklin was using X-ray diffraction methods to understand the structure of DNA. Watson and Crick were able to piece together the puzzle of the DNA molecule on the basis of Franklin's data because Crick had also studied X-ray diffraction (Figure 14.6). In 1962, James Watson, Francis Crick, and Maurice Wilkins were awarded the Nobel Prize in Physiology or Medicine. Unfortunately, by then Franklin had died, and Nobel prizes are not awarded posthumously.

The photo in part A shows James Watson, Francis Crick, and Maclyn McCarty. The X-ray diffraction pattern in part B is symmetrical, with dots in an X-shape.

Figure 14.6 The work of pioneering scientists (a) James Watson, Francis Crick, and Maclyn McCarty led to our present day understanding of DNA. Scientist Rosalind Franklin discovered (b) the X-ray diffraction pattern of DNA, which helped to elucidate its double helix structure. (credit a: modification of work by Marjorie McCarty, Public Library of Science)

Watson and Crick proposed that DNA is made up of two strands that are twisted around each other to form a right-handed helix. Base pairing takes place between a purine and pyrimidine; namely, A pairs with T and G pairs with C. Adenine and thymine are complementary base pairs, and cytosine and guanine are also complementary base pairs. The base pairs are stabilized by hydrogen bonds; adenine and thymine form two hydrogen bonds and cytosine and guanine form three hydrogen bonds. The two strands are anti-parallel in nature; that is, the 3' end of one strand faces the 5' end of the other strand. The sugar and phosphate of the nucleotides form the backbone of the structure, whereas the nitrogenous bases are stacked inside. Each base pair is separated from the other base pair by a distance of 0.34 nm, and each turn of the helix measures 3.4 nm. Therefore, ten base pairs are present per turn of the helix. The diameter of the DNA double helix is 2 nm, and it is uniform throughout. Only the pairing between a purine and pyrimidine can explain the uniform diameter. The twisting of the two strands around each other results in the formation of uniformly spaced major and minor grooves (Figure 14.7).

Part A shows an illustration of a DNA double helix, which has a sugar-phosphate backbone on the outside and nitrogenous base pairs on the inside. Part B shows base pairing between thymine and adenine, which form two hydrogen bonds, and between guanine and cytosine, which form three hydrogen bonds. Part C shows a molecular model of the DNA double helix. The outside of the helix alternates between wide gaps, called major grooves, and narrow gaps, called minor grooves.

Figure 14.7 DNA has (a) a double helix structure and (b) phosphodiester bonds. The (c) major and minor grooves are binding sites for DNA binding proteins during processes such as transcription (the copying of RNA from DNA) and replication.
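The pairing rules and helix dimensions described above lend themselves to a short worked example. The following Python sketch is illustrative only: the function names are hypothetical, and the constants are simply the values quoted in this section (A pairs with T, G pairs with C, two or three hydrogen bonds per pair, 0.34 nm between base pairs, ten base pairs per turn).

```python
# Watson-Crick pairing rules from the text: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hydrogen bonds in the base pair formed by each base (A-T: 2, G-C: 3).
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}

RISE_PER_BP_NM = 0.34  # distance between adjacent base pairs, in nm
BP_PER_TURN = 10       # base pairs per helical turn (3.4 nm / 0.34 nm)


def complementary_strand(strand_5_to_3):
    """Return the partner strand, also written 5' to 3'.

    The two strands are anti-parallel, so the partner is the
    complement of the input read in reverse order.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand_5_to_3.upper()))


def hydrogen_bonds(strand):
    """Total hydrogen bonds holding this strand to its partner."""
    return sum(H_BONDS[base] for base in strand.upper())


def helix_length_nm(n_base_pairs):
    """Length of a relaxed double helix with the given number of base pairs."""
    return n_base_pairs * RISE_PER_BP_NM


def helix_turns(n_base_pairs):
    """Number of helical turns in a relaxed double helix."""
    return n_base_pairs / BP_PER_TURN
```

For example, `complementary_strand("ATGC")` returns `"GCAT"`, and `hydrogen_bonds("ATGC")` returns 10 (two A-T pairs at two bonds each plus two G-C pairs at three bonds each).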

Science Practice Connection for AP® Courses

Activity

Read Watson and Crick’s original Nature article, “Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid.” How did Watson and Crick’s model build on the findings of Rosalind Franklin? How did their model of DNA build on the findings of Hershey and Chase, and others, showing that DNA can encode and pass information on to the next generation?

Think About It

Watson and Crick’s work determined the structure of DNA. However, it was still relatively unknown how DNA encoded information into genes. Select one modern form of biotechnology and research its basic methods online. Examples include gene sequencing, DNA fingerprinting, PCR (polymerase chain reaction), genetically-modified food, etc. Briefly describe your chosen technology, and what benefits it provides us. Then describe how Watson and Crick’s findings were vital to the development of your chosen technology.

Teacher Support

The activity is an application of Learning Objective 3.1 and Science Practice 6.5 because students are analyzing Watson and Crick’s model of DNA relative to the findings of other DNA researchers who determined that DNA is the molecule of heredity. The activity is also an application of Learning Objective 3.2 and Science Practice 4.1 because students are analyzing the historic published results of Watson and Crick and selecting evidence that Watson and Crick used to create their model of DNA and further show that DNA is the molecule of heredity.

Possible answer:

Watson and Crick’s model built on Franklin’s findings that DNA has the structure of a double helix. The finding that DNA can pass information on to the next generation by Hershey and Chase was further evidenced by Watson and Crick’s model, which showed that DNA could encode information using the sequence of its four nucleotides.

The Think About It question is an application of Learning Objective 3.5 and Science Practice 6.4 because students are researching the methods by which humans can manipulate heritable information and describing how those methods were based on the scientific theories and models of Watson and Crick.

Possible answer:

PCR allows us to make many copies of DNA for research or other applications. PCR involves separating out the two strands of DNA and adding nucleotides to the specific regions one wishes to amplify. Attaching nucleotide primers allows one to create many copies of only the desired sequences of the DNA. The ability to separate DNA and amplify select regions depends on the knowledge of nucleotide bonding within the DNA molecule described in the Watson and Crick model.

DNA Sequencing Techniques

Until the 1990s, the sequencing of DNA (reading the sequence of DNA) was a relatively expensive and long process. The use of radiolabeled nucleotides compounded the problem by raising safety concerns. With currently available technology and automated machines, the process is cheaper, safer, and can be completed in a matter of hours. Fred Sanger developed the sequencing method that was used for the human genome sequencing project and is still widely used today (Figure 14.8).

Link to Learning

Visit this site to watch a video explaining the DNA sequence reading technique that resulted from Sanger’s work.

Describe one advantage and a possible limitation to Sanger’s method.

Sanger’s method can be used to sequence more than one strand at a time, which is less time consuming. A challenge of Sanger’s method is its decreased accuracy in sequencing DNA strands.
Sanger’s method is a reliable and accurate way of sequencing DNA strands. However, only one strand can be sequenced at a time. Also, it can look for only one base at a time, which can be time consuming.
Sanger’s method is highly inexpensive and less accurate. However, it is not readily adaptable to commercial kits.
Sanger’s method is less time consuming and highly accurate. However, it is more expensive than other methods available for sequencing.

The method is known as the dideoxy chain termination method. It is based on the use of chain terminators, the dideoxynucleotides (ddNTPs). The ddNTPs differ from the deoxynucleotides by the lack of a free 3' OH group on the five-carbon sugar. If a ddNTP is added to a growing DNA strand, the chain is not extended any further because the free 3' OH group needed to add another nucleotide is not available. By using a predetermined ratio of deoxynucleotides to dideoxynucleotides, it is possible to generate DNA fragments of different sizes.

Part A shows a template DNA strand and newly synthesized strands that were generated in the presence of dideoxynucleotides that terminate the chain at different points to generate fragments of different sizes. Each dideoxynucleotide is labeled a different color. Part B shows a sequence readout that was generated after the DNA fragments were separated on the basis of size. The color of the fragment indicates the identity of the nucleotide at the end of a given fragment. By reading the colors in order, the DNA sequence can be determined.

Figure 14.8 In Frederick Sanger's dideoxy chain termination method, dye-labeled dideoxynucleotides are used to generate DNA fragments that terminate at different points. The DNA is separated by capillary electrophoresis on the basis of size, and from the order of fragments formed, the DNA sequence can be read. The DNA sequence readout is shown on an electropherogram that is generated by a laser scanner.

The DNA sample to be sequenced is denatured, or separated into two strands, by heating it to high temperatures. The DNA is divided into four tubes, to each of which a primer, DNA polymerase, and all four nucleotides (A, T, G, and C) are added. In addition, a limited quantity of one of the four dideoxynucleotides is added to each tube. The tubes are labeled A, T, G, and C according to the ddNTP added. For detection purposes, each of the four dideoxynucleotides carries a different fluorescent label. Chain elongation continues until a fluorescent dideoxynucleotide is incorporated, after which no further elongation takes place. After the reaction is over, electrophoresis is performed to separate the fragments by size; even a difference in length of a single base can be detected. The sequence is read by a laser scanner. For his work on DNA sequencing, Sanger received a Nobel Prize in Chemistry in 1980.
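The logic of reading a sequence from chain-terminated fragments can be sketched in a few lines of Python. This is a simplified illustration, not a model of a real sequencing run: the function names are hypothetical, the primer is ignored, and the sketch assumes that a ddNTP ends up terminating at least some copies at every position, which is what a suitable ratio of deoxynucleotides to dideoxynucleotides is meant to achieve.

```python
# Pairing rules used by DNA polymerase when copying the template.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize(template_3_to_5):
    """Return the new strand (5' to 3') made by copying a template read 3' to 5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5.upper())


def terminated_fragments(new_strand):
    """List every chain-terminated product as (length, label of terminal ddNTP).

    Assumption: termination occurs at every position in at least some copies,
    so there is one fragment for each length from 1 to len(new_strand).
    """
    return [(position + 1, base) for position, base in enumerate(new_strand)]


def read_sequence(fragments):
    """Order fragments by length, as electrophoresis does, and read the labels.

    The shortest fragment gives the first base of the new strand, the next
    shortest gives the second base, and so on.
    """
    return "".join(label for _, label in sorted(fragments))
```

In this sketch, a template of `"TACGGCAT"` yields fragments ending in A, T, G, C, C, G, T, and A at lengths 1 through 8, so `read_sequence` recovers `"ATGCCGTA"` regardless of the order in which the fragments are supplied.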

Link to Learning

Sanger’s genome sequencing has led to a race to sequence human genomes at a rapid speed and low cost, often referred to as the $1000 in one day sequence. Learn more by selecting the Sequencing at Speed animation here.

Explain how fast DNA sequencing can change the way doctors treat disease.

Faster genetic sequencing will help in quick analysis of the genetic makeup of bacteria that can cause diseases in humans for better and more efficient treatments. Also, sequencing of a cancerous cell’s DNA can provide better ways to treat or prevent cancer.
Fast DNA sequencing can help us quickly analyze the genetic information of only existing bacteria (not new strains) that cause disease in humans, which may lead to more efficient treatments.
Fast DNA sequencing can help doctors to treat and diagnose diseases which are not rare in populations.
Faster genetic sequencing can be used to treat and prevent a few types of cancers and thus increase the life expectancy of patients suffering from the diseases.

Gel electrophoresis is a technique used to separate DNA fragments of different sizes. Usually the gel is made of a chemical called agarose. Agarose powder is added to a buffer and heated. After cooling, the gel solution is poured into a casting tray. Once the gel has solidified, the DNA is loaded on the gel and electric current is applied. The DNA has a net negative charge and moves from the negative electrode toward the positive electrode. The electric current is applied for sufficient time to let the DNA separate according to size; the smallest fragments will be farthest from the well (where the DNA was loaded), and the heavier molecular weight fragments will be closest to the well. Once the DNA is separated, the gel is stained with a DNA-specific dye for viewing it (Figure 14.9).

Photo shows an agarose gel illuminated under UV light. The gel is nine lanes across. Each lane was loaded with a sample containing DNA fragments of differing size that have separated as they travel through the gel, from top to bottom. The DNA appears as thin, white bands on a black background. Lanes one and nine contain many bands from a DNA standard. These bands are closely spaced toward the top, and spaced farther apart further down the gel. Lanes two through eight contain one or two bands each. Some of these bands are identical in size and run the same distance into the gel. Others run a slightly different distance, indicating a small difference in size.

Figure 14.9 DNA can be separated on the basis of size using gel electrophoresis. (credit: James Jacob, Tompkins Cortland Community College)
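The relationship between fragment size and position on the gel can be illustrated with a short Python sketch. The first function only restates the rule given above (larger fragments stay closer to the well). The second goes a step beyond the text: it estimates an unknown fragment's size from a DNA standard by assuming that the logarithm of fragment size falls off linearly with migration distance between neighboring standard bands, a common working approximation rather than an exact law. The function names and the ladder values in the usage note are hypothetical.

```python
import math


def migration_order(fragment_sizes_bp):
    """Return sizes ordered from closest to the well to farthest from it.

    Smaller fragments travel farther, so the largest fragment comes first.
    """
    return sorted(fragment_sizes_bp, reverse=True)


def estimate_size_bp(distance_mm, ladder):
    """Estimate fragment size from its migration distance.

    ladder is a list of (size_bp, distance_mm) pairs for the DNA standard,
    each band at a distinct distance.
    Assumption: log10(size) changes linearly with distance between the two
    standard bands that flank the unknown band.
    """
    bands = sorted(ladder, key=lambda band: band[1])  # order by distance
    for (size_a, dist_a), (size_b, dist_b) in zip(bands, bands[1:]):
        if dist_a <= distance_mm <= dist_b:
            fraction = (distance_mm - dist_a) / (dist_b - dist_a)
            log_size = math.log10(size_a) + fraction * (
                math.log10(size_b) - math.log10(size_a)
            )
            return 10 ** log_size
    raise ValueError("distance lies outside the range covered by the ladder")
```

With a made-up ladder of `[(1000, 10.0), (500, 20.0), (100, 40.0)]`, a band that ran 15 mm falls halfway between the 1000 and 500 base pair standards and is estimated at about 707 base pairs.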

Evolution Connection

Neanderthal Genome: How Are We Related?

The first draft sequence of the Neanderthal genome was published by Richard E. Green et al. in 2010.¹ Neanderthals are the closest evolutionary relatives of present-day humans. They were known to have lived in Europe and Western Asia before they disappeared from fossil records approximately 30,000 years ago. Green’s team studied almost 40,000-year-old fossil remains that were selected from sites across the world. Extremely sophisticated means of sample preparation and DNA sequencing were employed because of the fragile nature of the bones and heavy microbial contamination. In their study, the scientists were able to sequence some four billion base pairs. The Neanderthal sequence was compared with that of present-day humans from across the world. After comparing the sequences, the researchers found that the Neanderthal genome had 2 to 3 percent greater similarity to people living outside Africa than to people in Africa. While current theories have suggested that all present-day humans can be traced to a small ancestral population in Africa, the data from the Neanderthal genome may contradict this view. Green and his colleagues also discovered DNA segments among people in Europe and Asia that are more similar to Neanderthal sequences than to other contemporary human sequences. Another interesting observation was that Neanderthals are as closely related to people from Papua New Guinea as to those from China or France. This is surprising because Neanderthal fossil remains have been located only in Europe and West Asia. Most likely, genetic exchange took place between Neanderthals and modern humans as modern humans emerged out of Africa, before the divergence of Europeans, East Asians, and Papua New Guineans.

Several genes seem to have undergone changes from Neanderthals during the evolution of present-day humans. These genes are involved in cranial structure, metabolism, skin morphology, and cognitive development. One of the genes that is of particular interest is RUNX2, which is different in modern day humans and Neanderthals. This gene is responsible for the prominent frontal bone, bell-shaped rib cage, and dental differences seen in Neanderthals. It is speculated that an evolutionary change in RUNX2 was important in the origin of modern-day humans, and this affected the cranium and the upper body.

(credit: modification of work by M. D. Golubovsky/ResearchGate)

The table shows the relative sizes of the genomes of various organisms. What conclusion can be drawn from this image?

Viruses have genome sizes that are larger than bacteria.
Simple eukaryotes have genome sizes similar to bacteria.
The genome sizes of different animals are roughly similar.
Mammals have much larger genomes than simpler animals.

Link to Learning

Watch Svante Pääbo’s talk explaining the Neanderthal genome research at the 2011 annual TED (Technology, Entertainment, Design) conference.

Which of the statements gives the best explanation for the wider genetic variation in the human population in Africa than the rest of the world?

It has been suggested that all humans most likely descended from Africa. This is supported by the research that genetic variance in Africa was also found in the rest of the world.
The theory that humans descended from Africa was supported by the research that some of the human genomes tested outside of Africa had close ties to the genomes of people in Africa but a genetic variance in Africa was not found in the rest of the world.
Humans have most likely descended from Africa. This research is supported by the fact that all the human genomes tested outside of Africa had close ties to the genomes of people in Africa. Also, there is a genetic variance in Africa that was not found in the rest of the world.
The transition to modern humans occurred within Africa which was sudden. Thus, human genomes tested outside of Africa had close ties to the genomes of people in Africa.

DNA Packaging in Cells

When comparing prokaryotic cells to eukaryotic cells, prokaryotes are much simpler than eukaryotes in many of their features (Figure 14.10). Most prokaryotes contain a single, circular chromosome that is found in an area of the cytoplasm called the nucleoid.

Visual Connection

Illustration shows a eukaryotic cell, which has a membrane-bound nucleus containing chromatin and a nucleolus, and a prokaryotic cell, which has DNA contained in an area of the cytoplasm called the nucleoid. The prokaryotic cell is much smaller than the eukaryotic cell.

Figure 14.10 A eukaryote contains a well-defined nucleus, whereas in prokaryotes, the chromosome lies in the cytoplasm in an area called the nucleoid.

In eukaryotic cells, DNA and RNA synthesis occur in a separate compartment from protein synthesis. In prokaryotic cells, both processes occur together. What advantages might there be to separating the processes? What advantages might there be to having them occur together?

Compartmentalization in eukaryotic cells enables the building of more complex proteins and RNA products. In prokaryotes, the advantage is that RNA and protein synthesis occurs much more quickly because it occurs in a single compartment.
Compartmentalization in prokaryotic cells enables the building of more complex proteins and RNA products. In eukaryotes, the advantage is that RNA and protein synthesis occurs much more quickly because they occur in a single compartment.
Compartmentalization in eukaryotic cells enables the building of simpler proteins and RNA products. In prokaryotes, the advantage is only simpler proteins and RNA products because complex ones are not needed.
Compartmentalization in eukaryotic cells enables the building of more complex proteins and RNA products. In prokaryotes, the advantage is that RNA and protein synthesis takes more time because it occurs in a single compartment.

The size of the genome in one of the most well-studied prokaryotes, E. coli, is 4.6 million base pairs (approximately 1.6 mm, if cut and stretched out, at 0.34 nm per base pair). So how does this fit inside a small bacterial cell? The DNA is twisted by what is known as supercoiling. Supercoiling means that DNA is either under-wound (less than one turn of the helix per 10 base pairs) or over-wound (more than one turn per 10 base pairs) from its normal relaxed state. Some proteins are known to be involved in the supercoiling; other proteins and enzymes such as DNA gyrase help in maintaining the supercoiled structure.

Eukaryotes, whose chromosomes each consist of a linear DNA molecule, employ a different type of packing strategy to fit their DNA inside the nucleus (Figure 14.11). At the most basic level, DNA is wrapped around proteins known as histones to form structures called nucleosomes. The histones are evolutionarily conserved proteins that are rich in basic amino acids and form an octamer. The DNA (which is negatively charged because of the phosphate groups) is wrapped tightly around the histone core. This nucleosome is linked to the next one with the help of a linker DNA. This is also known as the “beads on a string” structure. This is further compacted into a 30 nm fiber, which is the diameter of the structure. At the metaphase stage, the chromosomes are at their most compact, are approximately 700 nm in width, and are found in association with scaffold proteins.

In interphase, eukaryotic chromosomes have two distinct regions that can be distinguished by staining. The tightly packaged region is known as heterochromatin, and the less dense region is known as euchromatin. Heterochromatin usually contains genes that are not expressed, and is found in the regions of the centromere and telomeres. The euchromatin usually contains genes that are transcribed, with DNA packaged around nucleosomes but not further compacted.

Illustration shows the levels of organization of eukaryotic chromosomes, starting with the DNA double helix, which wraps around histone proteins. The entire DNA molecule wraps around many clusters of histone proteins, forming a structure that looks like beads on a string. The chromatin is further condensed by wrapping around a protein core. The result is a compact chromosome, shown in duplicated form.

Figure 14.11 These figures illustrate the compaction of the eukaryotic chromosome.

Footnotes

¹ Richard E. Green et al., “A Draft Sequence of the Neandertal Genome,” Science 328 (2010): 710-22.