Mary Ann Clark; Matthew Douglas; Jung Choi

21.1 Viral Evolution, Morphology, and Classification

Learning Objectives

By the end of this section, you will be able to do the following:

Describe how viruses were first discovered and how they are detected
Discuss three hypotheses about how viruses evolved
Describe the general structure of a virus
Recognize the basic shapes of viruses
Understand past and emerging classification systems for viruses
Describe the basis for the Baltimore classification system

Viruses are diverse entities: They vary in structure, methods of replication, and the hosts they infect. Nearly all forms of life—from prokaryotic bacteria and archaeans, to eukaryotes such as plants, animals, and fungi—have viruses that infect them. While most biological diversity can be understood through evolutionary history (such as how species have adapted to changing environmental conditions and how different species are related to one another through common descent), much about virus origins and evolution remains unknown.

Discovery and Detection

Viruses were first discovered after the development of a porcelain filter, the Chamberland-Pasteur filter, that could remove all bacteria visible in the microscope from any liquid sample. In 1886, Adolf Mayer demonstrated that a disease of tobacco plants, tobacco mosaic disease, could be transferred from a diseased plant to a healthy one via liquid plant extracts. In 1892, Dmitri Ivanowski showed that this disease could be transmitted in this way even after the Chamberland-Pasteur filter had removed all viable bacteria from the extract. Still, it was many years before it was proved that these “filterable” infectious agents were not simply very small bacteria but a new type of very small, disease-causing particle.

Most virions, or single virus particles, are very small, about 20 to 250 nanometers in diameter. However, some recently discovered viruses from amoebae range up to 1000 nm in diameter. With the exception of large virions, like the poxvirus and other large DNA viruses, viruses cannot be seen with a light microscope. It was not until the development of the electron microscope in the late 1930s that scientists got their first good view of the structure of the tobacco mosaic virus (TMV) (Figure 21.1), discussed above, and other viruses (Figure 21.2). The surface structure of virions can be observed by both scanning and transmission electron microscopy, whereas the internal structures of the virus can only be observed in images from a transmission electron microscope. The use of electron microscopy and other technologies has allowed for the discovery of many viruses of all types of living organisms.

Micrograph a shows a virus with a hexagonal head that stands on thin, bent legs. The virus sits on the surface of a cell that is so large that only a small fraction of its surface is visible. Micrograph b shows small bacterial cells that are about the size of the organelles in the adjacent colon cells.

Figure 21.2 Most virus particles are visible only by electron microscopy. In these transmission electron micrographs, (a) a virus is dwarfed by the bacterial cell it infects, just as (b) these E. coli cells are dwarfed by cultured colon cells. (credit a: modification of work by U.S. Dept. of Energy, Office of Science, LBL, PBD; credit b: modification of work by J.P. Nataro and S. Sears, unpub. data, CDC; scale-bar data from Matt Russell)

Evolution of Viruses

Although biologists have a significant amount of knowledge about how present-day viruses mutate and adapt, much less is known about how viruses originated in the first place. When exploring the evolutionary history of most organisms, scientists can look at fossil records and similar historic evidence. However, viruses do not fossilize, as far as we know, so researchers must extrapolate from investigations of how today’s viruses evolve and use biochemical and genetic information to create speculative virus histories.

Most scholars agree that viruses do not have a single common ancestor, nor is there a single reasonable hypothesis about virus origins. Several current evolutionary scenarios may explain the origin of viruses. One such hypothesis, the “devolution” or regressive hypothesis, suggests that viruses evolved from free-living cells or from intracellular prokaryotic parasites. However, many details of how this process might have occurred remain a mystery. A second hypothesis, the escapist or progressive hypothesis, suggests that viruses originated from RNA and DNA molecules, or from self-replicating entities similar to transposons or other mobile genetic elements, that escaped from a host cell with the ability to enter another. A third hypothesis, the virus-first hypothesis, suggests that viruses may have been the first self-replicating entities, arising before the first cells. In all cases, viruses are probably continuing to evolve along with the cells on which they rely as hosts.

As technology advances, scientists may develop and refine additional hypotheses to explain the origins of viruses. The emerging field called virus molecular systematics attempts to do just that through comparisons of sequenced genetic material. These researchers hope one day to better understand the origin of viruses—a discovery that could lead to advances in the treatments for the ailments they produce.

Viral Morphology

Viruses are noncellular, meaning they are biological entities that do not have a cellular structure. They therefore lack most of the components of cells, such as organelles, ribosomes, and the plasma membrane. A virion consists of a nucleic acid core, an outer protein coating or capsid, and sometimes an outer envelope made of protein and phospholipid membranes derived from the host cell. Viruses may also contain additional proteins, such as enzymes, within the capsid or attached to the viral genome. The most obvious difference between members of different viral families is the variation in their morphology, which is quite diverse. An interesting feature of viral complexity is that the complexity of the host does not necessarily correlate with the complexity of the virion. In fact, some of the most complex virion structures are found in the bacteriophages—viruses that infect the simplest living organisms, bacteria.

Morphology

Viruses come in many shapes and sizes, but these features are consistent for each viral family. As we have seen, all virions have a nucleic acid genome covered by a protective capsid. The proteins of the capsid are encoded in the viral genome, and are called capsomeres. Some viral capsids are simple helices or polyhedral “spheres,” whereas others are quite complex in structure (Figure 21.3).

Figure a is a helical virus, which has a long linear structure. The outer proteins are small spheres arranged into a long, hollow tube. Inside the tube is the genetic material. Tobacco mosaic virus is an example of a helical virus. Figure b is an icosahedral virus, which has a polyhedron structure. The example shown is human rhinovirus, which has a pentagon structure. Figure c is a complex virus, which has a more complex structure. The example is variola, which has an ovoid structure.

Figure 21.3 Viral capsids can be (a) helical, (b) polyhedral, or (c) have a complex shape. (credit a “micrograph”: modification of work by USDA ARS; credit b “micrograph”: modification of work by U.S. Department of Energy)

In general, the capsids of viruses are classified into four groups: helical, icosahedral, enveloped, and head-and-tail. Helical capsids are long and cylindrical. Many plant viruses are helical, including TMV. Icosahedral viruses have shapes that are roughly spherical, such as those of poliovirus or herpesviruses. Enveloped viruses have membranes derived from the host cell that surround the capsids. Animal viruses, such as HIV, are frequently enveloped. Head-and-tail viruses infect bacteria and have a head that is similar to icosahedral viruses and a tail shaped like helical viruses.

Many viruses use some sort of glycoprotein to attach to their host cells via molecules on the cell called viral receptors. For these viruses, attachment is required for later penetration of the cell membrane; only after penetration takes place can the virus complete its replication inside the cell. The receptors that viruses use are molecules that are normally found on cell surfaces and have their own physiological functions. It appears that viruses have simply evolved to make use of these molecules for their own replication. For example, HIV uses the CD4 molecule on T lymphocytes as one of its receptors (Figure 21.4). CD4 is a type of molecule called a cell adhesion molecule, which functions to keep different types of immune cells in close proximity to each other during the generation of a T lymphocyte immune response.

In the illustration an H I V virus attaches to a C D 4 molecule embedded in the plasma membrane of a host immune cell.

Figure 21.4 A virus and its host receptor protein. HIV binds the CD4 receptor on the surface of human cells. CD4 receptors help white blood cells to communicate with other cells of the immune system when producing an immune response. (credit: modification of work by NIAID, NIH)

One of the most complex virions known, the T4 bacteriophage (which infects the Escherichia coli bacterium), has a tail structure that the virus uses to attach to host cells and a head structure that houses its DNA.

Adenovirus, a non-enveloped animal virus that causes respiratory illnesses in humans, uses glycoprotein spikes protruding from its capsomeres to attach to host cells. Non-enveloped viruses also include those that cause polio (poliovirus), plantar warts (papillomavirus), and hepatitis A (hepatitis A virus).

Enveloped virions, such as the influenza virus, consist of nucleic acid (RNA in the case of influenza) and capsid proteins surrounded by a phospholipid bilayer envelope that contains virus-encoded proteins. Glycoproteins embedded in the viral envelope are used to attach to host cells. Other envelope proteins are the matrix proteins that stabilize the envelope and often play a role in the assembly of progeny virions. Chickenpox, HIV infection, and mumps are other examples of diseases caused by viruses with envelopes. Because the envelope is fragile, non-enveloped viruses are more resistant to changes in temperature, pH, and some disinfectants than enveloped viruses.

Overall, the shape of the virion and the presence or absence of an envelope tell us little about what disease the virus may cause or what species it might infect, but they are still useful means to begin viral classification (Figure 21.5).

Visual Connection

Illustration a shows bacteriophage T 4, which houses its D N A genome in a hexagonal head. A long, straight tail extends from the bottom of the head. Tail fibers attached to the base of the tail are bent, like spider legs. In b, adenovirus houses its D N A genome in a round capsid made from many small capsomere subunits. Glycoproteins extend from the capsomere, like pins from a pincushion. In c, the influenza virus houses its R N A genome and a bullet-shaped capsid. A spherical viral envelope, lined with matrix proteins, surrounds the capsid. Two different varieties of glycoprotein spike are embedded in the envelope. Approximately 80 percent of the spikes are hemagglutinin. The remaining 20 percent or so of the glycoprotein spikes consist of neuraminidase.

Figure 21.5 Complex Viruses. Viruses can be either complex or relatively simple in shape. This figure shows three relatively complex virions: the bacteriophage T4, with its DNA-containing head group and tail fibers that attach to host cells; adenovirus, which uses spikes from its capsid to bind to host cells; and the influenza virus, which uses glycoproteins embedded in its envelope to bind to host cells. The influenza virus also has matrix proteins, internal to the envelope, which help stabilize the virion’s shape. (credit “bacteriophage, adenovirus”: modification of work by NCBI, NIH; credit "influenza virus": modification of work by Dan Higgins, Centers for Disease Control and Prevention)

Which of the following statements about virus structure is true?

All viruses are encased in a viral membrane.
The capsomere is made up of small protein subunits called capsids.
DNA is the genetic material in all viruses.
Glycoproteins help the virus attach to the host cell.

Types of Nucleic Acid

Unlike nearly all living organisms that use DNA as their genetic material, viruses may use either DNA or RNA. The virus core contains the genome—the total genetic content of the virus. Viral genomes tend to be small, containing only those genes that encode proteins which the virus cannot get from the host cell. This genetic material may be single- or double-stranded. It may also be linear or circular. While most viruses contain a single nucleic acid, others have genomes divided into several segments. The RNA genome of the influenza virus is segmented, which contributes to its variability and continuous evolution, and explains why it is difficult to develop a vaccine against it.
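One way to see why a segmented genome can contribute to variability is to count how many segment combinations are possible when two strains infect the same cell and progeny virions draw each segment from either parent. The following is a minimal sketch of that arithmetic, not a model of real viral genetics: the function names are hypothetical, the eight-segment figure is an assumption based on influenza A, and the sketch ignores the fact that many combinations may not be viable.

```python
from itertools import product


def segment_combinations(n_segments: int, n_strains: int = 2) -> int:
    """Count the ways to choose one parental strain for each genome segment."""
    return n_strains ** n_segments


def enumerate_combinations(segment_names, strains=("A", "B")):
    """List every assignment of a parental strain to each named segment."""
    return [
        dict(zip(segment_names, choice))
        for choice in product(strains, repeat=len(segment_names))
    ]


# Assumption: eight genome segments, as in influenza A.
print(segment_combinations(8))  # 256, including the two parental genotypes

# A smaller, made-up three-segment genome is easy to list in full.
for combo in enumerate_combinations(["seg1", "seg2", "seg3"]):
    print(combo)
```

A non-segmented genome corresponds to `n_segments = 1`, for which only the two parental genotypes are possible in this simplified count.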

In DNA viruses, the viral DNA directs the host cell’s replication proteins to synthesize new copies of the viral genome and to transcribe and translate that genome into viral proteins. Human diseases caused by DNA viruses include chickenpox, hepatitis B, and adenovirus infections. Sexually transmitted DNA viruses include the herpes virus and the human papilloma virus (HPV), which has been associated with cervical cancer and genital warts.

RNA viruses contain only RNA as their genetic material. To replicate their genomes in the host cell, RNA viruses must encode their own enzymes that can copy RNA into RNA or, in the retroviruses, into DNA. These RNA polymerase enzymes are more likely to make copying errors than DNA polymerases. For this reason, mutations in RNA viruses occur more frequently than in DNA viruses, which causes them to change and adapt more rapidly to their host. Human diseases caused by RNA viruses include influenza, hepatitis C, measles, and rabies. HIV, which is sexually transmitted, is an RNA retrovirus.

The Challenge of Virus Classification

Because most viruses probably evolved from different ancestors, the systematic methods that scientists have used to classify prokaryotic and eukaryotic cells are not very useful. If viruses represent “remnants” of different organisms, then even genomic or protein analysis is not useful. Why? Because viruses have no common genomic sequence that they all share. For example, the 16S rRNA sequence so useful for constructing prokaryote phylogenies is of no use for a creature with no ribosomes! Biologists have used several classification systems in the past. Viruses were initially grouped by shared morphology. Later, groups of viruses were classified by the type of nucleic acid they contained, DNA or RNA, and whether their nucleic acid was single- or double-stranded. However, these earlier classification methods grouped viruses differently, because they were based on different sets of characters of the virus. The most commonly used classification method today is called the Baltimore classification scheme, and is based on how messenger RNA (mRNA) is generated in each particular type of virus.

Past Systems of Classification

Viruses contain only a few elements by which they can be classified: the viral genome, the type of capsid, and the envelope structure for the enveloped viruses. All of these elements have been used in the past for viral classification (Table 21.1 and Figure 21.6). Viral genomes may vary in the type of genetic material (DNA or RNA) and its organization (single- or double-stranded, linear or circular, and segmented or non-segmented). In some viruses, additional proteins needed for replication are associated directly with the genome or contained within the viral capsid.

Virus Classification by Genome Structure

Genome Structure	Examples
RNA	Rabies virus, retroviruses
DNA	Herpesviruses, smallpox virus
Single-stranded	Rabies virus, retroviruses
Double-stranded	Herpesviruses, smallpox virus
Linear	Rabies virus, retroviruses, herpesviruses, smallpox virus
Circular	Papillomaviruses, many bacteriophages
Non-segmented: genome consists of a single segment of genetic material	Parainfluenza viruses
Segmented: genome is divided into multiple segments	Influenza viruses

Table 21.1

Part a is an illustration of the rabies virus, which is bullet shaped. R N A is coiled inside a capsid, which is encased in a matrix protein lined viral envelope studded with glycoproteins. Part a below is a micrograph of a cluster of bullet shaped rabies viruses. Part b is a micrograph of variola virus, which has D N A encased in a bow shaped capsid. An oval matrix protein-lined envelope surrounds the capsid. Part b below shows irregular, bumpy lesions on the arms and legs of a person with smallpox.

Figure 21.6 Viruses can be classified according to their core genetic material and capsid design. (a) Rabies virus has a single-stranded RNA (ssRNA) core and an enveloped helical capsid, whereas (b) variola virus, the causative agent of smallpox, has a double-stranded DNA (dsDNA) core and a complex capsid. Rabies transmission occurs when saliva from an infected mammal enters a wound. The virus travels through neurons in the peripheral nervous system to the central nervous system, where it impairs brain function, and then travels to other tissues. The virus can infect any mammal, and most infected animals die within weeks of infection. Smallpox is a human disease transmitted by inhalation of the variola virus, which localizes in the skin, mouth, and throat and causes a characteristic rash. Before its eradication in 1979, infection resulted in a 30 to 35 percent mortality rate. (credit “rabies diagram”: modification of work by CDC; “rabies micrograph”: modification of work by Dr. Fred Murphy, CDC; credit “smallpox micrograph”: modification of work by Dr. Fred Murphy, Sylvia Whitfield, CDC; credit “smallpox photo”: modification of work by CDC; scale-bar data from Matt Russell)

Viruses can also be classified by the design of their capsids (Table 21.2 and Figure 21.7). Capsids are classified as naked icosahedral, enveloped icosahedral, enveloped helical, naked helical, and complex. The type of genetic material (DNA or RNA) and its structure (single- or double-stranded, linear or circular, and segmented or non-segmented) are used to classify the virus core structures (Table 21.1).

Virus Classification by Capsid Structure

Capsid Classification	Examples
Naked icosahedral	Hepatitis A virus, polioviruses
Enveloped icosahedral	Epstein-Barr virus, herpes simplex virus, rubella virus, yellow fever virus, HIV-1
Enveloped helical	Influenza viruses, mumps virus, measles virus, rabies virus
Naked helical	Tobacco mosaic virus
Complex with many proteins; some have combinations of icosahedral and helical capsid structures	Herpesviruses, smallpox virus, hepatitis B virus, T4 bacteriophage

Table 21.2
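The capsid categories in Table 21.2 combine two observations: the capsid symmetry and whether an envelope is present. The sketch below expresses that rule as a small function. It is only an illustration under stated assumptions: the function name `classify_capsid`, its parameters, and the `EXAMPLES` mapping are hypothetical, and the example viruses are taken from Table 21.2.

```python
def classify_capsid(symmetry: str, enveloped: bool) -> str:
    """Return a Table 21.2 capsid category from symmetry and envelope status.

    symmetry: "icosahedral", "helical", or "complex" (hypothetical labels).
    enveloped: True if the capsid is surrounded by a membrane envelope.
    """
    symmetry = symmetry.strip().lower()
    if symmetry == "complex":
        # Complex capsids form their own category regardless of envelope.
        return "complex"
    if symmetry not in ("icosahedral", "helical"):
        raise ValueError(f"unknown capsid symmetry: {symmetry!r}")
    prefix = "enveloped" if enveloped else "naked"
    return f"{prefix} {symmetry}"


# Example viruses from Table 21.2, given as (symmetry, enveloped).
EXAMPLES = {
    "poliovirus": ("icosahedral", False),
    "Epstein-Barr virus": ("icosahedral", True),
    "mumps virus": ("helical", True),
    "tobacco mosaic virus": ("helical", False),
    "T4 bacteriophage": ("complex", False),
}

for name, (symmetry, enveloped) in EXAMPLES.items():
    print(f"{name}: {classify_capsid(symmetry, enveloped)}")
```

Treating "complex" as a category that overrides envelope status mirrors how Table 21.2 lists it as a single row; other groupings are possible.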

Micrograph a shows icosahedral polioviruses arranged in a grid; micrograph b shows two Epstein-Barr viruses with icosahedral capsids encased in an oval membrane; micrograph c shows a mumps virus capsid encased in an irregular membrane; micrograph d shows rectangular tobacco mosaic virus capsids; and micrograph e shows a spherical herpesvirus envelope studded with glycoproteins.

Figure 21.7 Transmission electron micrographs of various viruses show their capsid structures. The capsid of the (a) polio virus is naked icosahedral; (b) the Epstein-Barr virus capsid is enveloped icosahedral; (c) the mumps virus capsid is an enveloped helix; (d) the tobacco mosaic virus capsid is naked helical; and (e) the herpesvirus capsid is complex. (credit a: modification of work by Dr. Fred Murphy, Sylvia Whitfield; credit b: modification of work by Liza Gross; credit c: modification of work by Dr. F. A. Murphy, CDC; credit d: modification of work by USDA ARS; credit e: modification of work by Linda Stannard, Department of Medical Microbiology, University of Cape Town, South Africa, NASA; scale-bar data from Matt Russell)

Baltimore Classification

The most commonly used system of virus classification today was first developed by Nobel Prize-winning biologist David Baltimore in the early 1970s. In addition to the differences in morphology and genetics mentioned above, the Baltimore classification scheme groups viruses according to how the mRNA is produced during the replicative cycle of the virus.

Group I viruses contain double-stranded DNA (dsDNA) as their genome. Their mRNA is produced by transcription in much the same way as with cellular DNA, using the enzymes of the host cell.

Group II viruses have single-stranded DNA (ssDNA) as their genome. They convert their single-stranded genomes into a dsDNA intermediate before transcription to mRNA can occur.

Group III viruses use dsRNA as their genome. The strands separate, and one of them is used as a template for the generation of mRNA using the RNA-dependent RNA polymerase encoded by the virus.

Group IV viruses have ssRNA as their genome with a positive polarity, which means that the genomic RNA can serve directly as mRNA. Intermediates of dsRNA, called replicative intermediates, are made in the process of copying the genomic RNA. Multiple, full-length RNA strands of negative polarity (complementary to the positive-stranded genomic RNA) are formed from these intermediates, which may then serve as templates for the production of RNA with positive polarity, including both full-length genomic RNA and shorter viral mRNAs.

Group V viruses contain ssRNA genomes with a negative polarity, meaning that their sequence is complementary to the mRNA. As with Group IV viruses, dsRNA intermediates are used to make copies of the genome and produce mRNA. In this case, the negative-stranded genome can be converted directly to mRNA. Additionally, full-length positive RNA strands are made to serve as templates for the production of the negative-stranded genome.

Group VI viruses have diploid (two copies) ssRNA genomes that must be converted, using the enzyme reverse transcriptase, to dsDNA; the dsDNA is then transported to the nucleus of the host cell and inserted into the host genome. Then, mRNA can be produced by transcription of the viral DNA that was integrated into the host genome.

Group VII viruses have partially double-stranded DNA genomes. They make ssRNA intermediates that act as mRNA but are also converted back into dsDNA genomes by reverse transcriptase, a step necessary for genome replication.
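The polarity terms used for Groups IV and V can be made concrete with base pairing: a negative-polarity strand is the complement of the positive-polarity (mRNA-sense) strand. The following minimal sketch computes that complement for a short RNA string. The sequence is made up for illustration, the function name is hypothetical, and both strands are written 5' to 3', which is why the complemented string is reversed.

```python
# Watson-Crick pairing for RNA: A pairs with U, and G pairs with C.
_PAIRING = str.maketrans("ACGU", "UGCA")


def complementary_strand(rna: str) -> str:
    """Return the complementary RNA strand, written 5' to 3'."""
    rna = rna.upper()
    unexpected = set(rna) - set("ACGU")
    if unexpected:
        raise ValueError(f"not an RNA sequence, found: {sorted(unexpected)}")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return rna.translate(_PAIRING)[::-1]


# Made-up positive-polarity sequence, for illustration only.
positive = "AUGGCUUAA"
negative = complementary_strand(positive)

print(negative)                        # UUAAGCCAU
print(complementary_strand(negative))  # AUGGCUUAA
```

Taking the complement twice returns the original sequence, which matches the description above: a positive strand is copied into a negative strand, and that negative strand then serves as the template for new positive strands.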

The characteristics of each group in the Baltimore classification are summarized in Table 21.3 with examples of each group.

Baltimore Classification

Group	Characteristics	Mode of mRNA Production	Example
I	Double-stranded DNA	mRNA is transcribed directly from the DNA template	Herpes simplex (herpesvirus)
II	Single-stranded DNA	DNA is converted to double-stranded form before RNA is transcribed	Canine parvovirus (parvovirus)
III	Double-stranded RNA	mRNA is transcribed from the RNA genome	Childhood gastroenteritis (rotavirus)
IV	Single-stranded RNA (+)	Genome functions as mRNA	Common cold (picornavirus)
V	Single-stranded RNA (-)	mRNA is transcribed from the RNA genome	Rabies (rhabdovirus)
VI	Single-stranded RNA viruses with reverse transcriptase	Reverse transcriptase makes DNA from the RNA genome; DNA is then incorporated in the host genome; mRNA is transcribed from the incorporated DNA	Human immunodeficiency virus (HIV)
VII	Double-stranded DNA viruses with reverse transcriptase	The viral genome is double-stranded DNA, but viral DNA is replicated through an RNA intermediate; the RNA may serve directly as mRNA or as a template to make mRNA	Hepatitis B virus (hepadnavirus)

Table 21.3
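The grouping in Table 21.3 can be read as a decision rule over three properties of a virus: the nucleic acid in its genome, whether that genome is single- or double-stranded, and whether it replicates using reverse transcriptase (with polarity deciding between Groups IV and V). The sketch below is one possible encoding of that rule, not an official implementation; the function name, parameter names, and string codes such as "ds" and "+" are assumptions chosen for illustration.

```python
def baltimore_group(nucleic_acid, strands, polarity=None,
                    reverse_transcriptase=False):
    """Return the Baltimore group (I to VII) as a Roman numeral string.

    nucleic_acid: "DNA" or "RNA"
    strands: "ds" (double-stranded) or "ss" (single-stranded)
    polarity: "+" or "-", needed only for ssRNA without reverse transcriptase
    reverse_transcriptase: True if replication uses reverse transcriptase
    """
    na = nucleic_acid.upper()
    st = strands.lower()

    if na == "DNA" and st == "ds":
        return "VII" if reverse_transcriptase else "I"
    if na == "DNA" and st == "ss":
        return "II"
    if na == "RNA" and st == "ds":
        return "III"
    if na == "RNA" and st == "ss":
        if reverse_transcriptase:
            return "VI"
        if polarity == "+":
            return "IV"
        if polarity == "-":
            return "V"
        raise ValueError("ssRNA genomes need polarity '+' or '-'")
    raise ValueError(f"unrecognized genome: {nucleic_acid!r}, {strands!r}")


# Examples from Table 21.3.
print(baltimore_group("DNA", "ds"))                              # I   (herpes simplex)
print(baltimore_group("RNA", "ss", polarity="+"))                # IV  (picornavirus)
print(baltimore_group("RNA", "ss", reverse_transcriptase=True))  # VI  (HIV)
print(baltimore_group("DNA", "ds", reverse_transcriptase=True))  # VII (hepatitis B virus)
```

Checking for reverse transcriptase before polarity reflects the table: Groups VI and VII are defined by that enzyme, whereas polarity only separates the remaining ssRNA viruses.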