key: cord-0701773-vgku5m5r authors: Louten, Jennifer title: Virus Structure and Classification date: 2016-05-06 journal: Essential Human Virology DOI: 10.1016/b978-0-12-800947-5.00002-8 sha: 0bf94abcb4b8888debea24c313ed1a329e7a0bb7 doc_id: 701773 cord_uid: vgku5m5r Viruses have several common characteristics: they are small, have DNA or RNA genomes, and are obligate intracellular parasites. The virus capsid functions to protect the nucleic acid from the environment, and some viruses surround their capsid with a membrane envelope. Most viruses have icosahedral or helical capsid structure, although a few have complex virion architecture. An icosahedron is a geometric shape with 20 sides, each composed of an equilateral triangle, and icosahedral viruses increase the number of structural units in each face to expand capsid size. The classification of viruses is very useful, and the International Committee on Taxonomy of Viruses is the official body that classifies viruses into order, family, genus, and species taxa. There are currently seven orders of viruses. The smallest of viruses are about 20 nm in diameter, although influenza and the human immunodeficiency virus have a more typical size, about 100 nm in diameter. Average human cells are 10-30 μm (microns) in diameter, which means that they are generally 100 to 1000 times larger than the viruses that are infecting them. However, some viruses are significantly larger than 100 nm. Poxviruses, such as the variola virus that causes smallpox, can approach 400 nm in length, and filoviruses, such as the dangerous Ebola virus and Marburg virus, are only 80 nm in diameter but extend into long threads that can reach lengths of over 1000 nm. Several very large viruses that infect amoebas have recently been discovered: megavirus is 400 nm in diameter, and pandoraviruses have an elliptical or ovoid structure approaching 1000 nm in length. 
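The size comparisons above mix nanometers and micrometers, so a short unit-conversion sketch can make the scale easier to check. The sizes are the ones quoted in the text; the names `NM_PER_UM`, `virus_sizes_nm`, and `fold_difference` are illustrative choices, not part of the chapter.

```python
# Sizes quoted in the text, all expressed in nanometers (nm).
NM_PER_UM = 1000  # 1 micrometer (micron) = 1000 nanometers

virus_sizes_nm = {
    "smallest viruses (diameter)": 20,
    "influenza / HIV (diameter)": 100,
    "poxvirus (length)": 400,
    "pandoravirus (length)": 1000,
}

# Range of diameters given in the text for an average human cell, in micrometers.
human_cell_diameter_um = (10, 30)


def fold_difference(cell_um: float, virus_nm: float) -> float:
    """Return how many times larger a cell is than a virus (linear dimension)."""
    return cell_um * NM_PER_UM / virus_nm


if __name__ == "__main__":
    low_um, high_um = human_cell_diameter_um
    for name, size_nm in virus_sizes_nm.items():
        smallest_ratio = fold_difference(low_um, size_nm)
        largest_ratio = fold_difference(high_um, size_nm)
        print(f"{name}: a human cell is {smallest_ratio:.0f}x to "
              f"{largest_ratio:.0f}x larger")
```

With these numbers, a 10 μm cell is 100 times the diameter of a 100 nm virion, which is the lower end of the range stated in the text.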
It is a common mistake to think that all viruses are smaller than bacteria; most bacteria are typically 2000-3000 nm in size, but certain bacteria of the genus Mycoplasma can be 10 times smaller than this, putting them in the range of these large viruses. So although a characteristic of viruses is that they are all small in size, this ranges from about 20 nm to larger than some bacteria (Fig. 2.1).

Viruses are obligate intracellular parasites, meaning that they are completely dependent upon the internal environment of the cell to create new infectious virus particles, or virions. All viruses make contact with and bind the surface of a cell to gain entry into the cell. The virus disassembles and its genetic material (made of nucleic acid) encodes the instructions for the proteins that will spontaneously assemble into the new virions. This is known as de novo replication, from the Latin for "from new." In contrast to cells, which grow in size and divide equally in two to replicate, viruses use the cell's energy and machinery to create and assemble new virions piece by piece, completely from scratch.

The genetic material of viruses can be composed of DNA or RNA. All living cells, whether human, animal, plant, or bacterial, have double-stranded DNA (dsDNA) as their genetic material. Viruses, on the other hand, have genomes, or genetic material, that can be composed of DNA or RNA (but not both). Genomes are not necessarily double-stranded, either; different virus types can also have single-stranded DNA (ssDNA) genomes, and viruses with RNA genomes can be single-stranded or double-stranded. Any particular virus will only have one type of nucleic acid genome, however, and so viruses are not encountered that have both ssDNA and ssRNA genomes, for example. Just as the size of the virus particle varies significantly, the genome size can also vary greatly from virus to virus.
A typical virus genome falls in the range of 7000-20,000 base pairs (bp), or 7-20 kilobase pairs (kb). Smaller-sized virions will naturally be able to hold less nucleic acid than larger virions, but large viruses do not necessarily have large genomes. While most viruses do not contain much nucleic acid, some dsDNA viruses have very large genomes: herpesviruses have genomes that are 120-200 kb in total, and the very large pandoraviruses mentioned previously have the largest genomes: up to 2.5 million base pairs, rivaling the genome size of many bacteria! In comparison, eukaryotic cells have much larger genomes: a red alga has the smallest known eukaryotic genome, at 8 million base pairs; a human cell contains over 3 billion nucleotides in its hereditary material; the largest genome yet sequenced, at over 22 billion base pairs, is that of the loblolly pine tree.

Virion size: Getting Smaller
1000 millimeters (mm) in a meter (m): 1 mm = 10^-3 m
1000 micrometers (μm, or microns) in a millimeter: 1 μm = 10^-6 m
1000 nanometers (nm) in a micrometer: 1 nm = 10^-9 m

Virus genome size: Getting Bigger
1000 base pairs (nucleotide pairs, bp) in a kilobase pair (kb): 1 kb = 10^3 bp
1000 kb in a megabase pair (Mb): 1 Mb = 10^6 bp
1000 Mb in a gigabase pair (Gb): 1 Gb = 10^9 bp

The infectious virus particle must be released from the host cell to infect other cells and individuals. Whether dsDNA, ssDNA, dsRNA, or ssRNA, the nucleic acid genome of the virus must be protected in the process. In the extracellular environment, the virus will be exposed to enzymes that could break down or degrade nucleic acid. Physical stresses, such as the flow of air or liquid, could also shear the nucleic acid strands into pieces. In addition, viral genomes are susceptible to damage by ultraviolet radiation or radioactivity, much in the same way that our DNA is. If the nucleic acid genome of the virus is damaged, then it will be unable to produce progeny virions.
In order to protect the fragile nucleic acid from this harsh environment, the virus surrounds its nucleic acid with a protein shell, called the capsid, from the Latin capsa, meaning "box." The capsid is composed of one or more different types of proteins that repeat over and over again to create the entire capsid, in the same way that many bricks fit together to form a wall. This repeating structure forms a strong but slightly flexible capsid. Because the capsid is also very small, it is physically very difficult to break open and sufficiently protects the nucleic acid inside of it. Together, the nucleic acid and the capsid form the nucleocapsid of the virion (Fig. 2.2).

Remember that the genomes of most viruses are very small. Genes encode the instructions to make proteins, so small genomes cannot encode many proteins. It is for this reason that the capsid of the virion is composed of one or only a few proteins that repeat over and over again to form the structure. If the capsid were built from more than just a few different proteins, the nucleic acid needed to encode them would be physically too large to fit inside the capsid.

In the same way that magnets will spontaneously assemble together, capsid proteins also exhibit self-assembly. The first to show this were H. Fraenkel-Conrat and Robley Williams in 1955. They separated the RNA genome from the protein subunits of tobacco mosaic virus, and when they put them back together in a test tube, infectious virions formed automatically. This indicated that no additional information is necessary to assemble a virus: the physical components will assemble spontaneously, primarily held together by electrostatic and hydrophobic forces.

Many viruses also have an envelope surrounding the capsid.
The envelope is a lipid membrane that is derived from one of the cell's membranes, most often the plasma membrane, although the envelope can also come from the cell's endoplasmic reticulum, Golgi complex, or even the nuclear membrane, depending upon the virus. These viruses often have proteins, called matrix proteins, that function to connect the envelope to the capsid inside. A virus that lacks an envelope is known as a nonenveloped or naked virus (Fig. 2.3). Each virus also possesses a virus attachment protein embedded in its outermost layer. This will be found in the capsid, in the case of a naked virus, or the envelope, in the case of an enveloped virus. The virus attachment protein is the viral protein that facilitates the docking of the virus to the plasma membrane of the host cell, the first step in gaining entry into a cell.

Describe the common characteristics of viruses.

Each virus possesses a protein capsid to protect its nucleic acid genome from the harsh environment. Virus capsids predominantly come in two shapes: helical and icosahedral. The helix (plural: helices) is a spiral shape that curves cylindrically around an axis. It is also a common biological structure: many proteins have sections that have a helical shape, and DNA is a double-helix of nucleotides. In the case of a helical virus, the viral nucleic acid coils into a helical shape and the capsid proteins wind around the inside or outside of the nucleic acid, forming a long tube or rod-like structure (Fig. 2.4). The nucleic acid and capsid constitute the nucleocapsid. In fact, the protein that winds around the nucleic acid is often called the nucleocapsid protein. Once in the cell, the helical nucleocapsid uncoils and the nucleic acid becomes accessible.

Virus capsids are held together by some of the same bonds that are found in living organisms. Covalent bonds, the strongest of bonds, are formed when atoms share electrons with each other; they are rarely found in capsids.
Electrostatic interactions, which are weaker than covalent bonds, occur when two atoms are oppositely charged and therefore attracted to each other, hence the saying "opposites attract." Ionic bonds are electrostatic interactions because they occur between oppositely charged atoms, or ions. Hydrogen bonds are also weak electrostatic forces that occur between slightly charged atoms, usually between hydrogen (slightly positively charged) and another atom that is partially negatively charged, such as oxygen. Van der Waals forces are weak interactions that occur when an atom becomes slightly charged due to random asymmetry of its electrons.

The properties of water also contribute to virus assembly and attachment to cells. Water is a polar molecule, meaning that the molecule has two distinct ends, much like a battery or magnet has a positive and a negative end. Molecules that do not have distinct ends are termed nonpolar. Other polar molecules are attracted to water, since water is polar too. These molecules are hydrophilic, meaning "water loving." Molecules that are nonpolar are not attracted to water and are hydrophobic, or "water avoiding." Hydrophobic molecules will therefore aggregate with each other in an aqueous solution. This explains the phenomenon of oil (nonpolar) not mixing with water (polar).

[Figure: naked virus and enveloped virus, with the matrix protein and envelope labeled. The capsid of an enveloped virion is wrapped with a lipid membrane derived from the cell. Virus attachment proteins located in the capsid or envelope facilitate binding of the virus to its host cell.]

There are several perceived advantages to forming a helical capsid. First, only one type of capsid protein is required. This protein subunit is repeated over and over again to form the capsid. This structure is simple and requires less free energy to assemble than a capsid composed of multiple proteins.
In addition, having only one nucleocapsid protein means that only one gene is required instead of several, thereby reducing the length of nucleic acid required. Because the helical structure can continue indefinitely, there are also no constraints on how much nucleic acid can be packaged into the virion: the capsid length will be the size of the coiled nucleic acid.

Helical viruses can be enveloped or naked. The first virus described, tobacco mosaic virus, is a naked helical virus. In fact, most plant viruses are helical, and it is very uncommon that a helical plant virus is enveloped. In contrast, all helical animal viruses are enveloped. These include well-known viruses such as influenza virus, measles virus, mumps virus, rabies virus, and Ebola virus (Fig. 2.5).

A helix is mathematically defined by two parameters, the amplitude and the pitch, that are also applied to helical capsid structures. The amplitude is simply the diameter of the helix and tells us the width of the capsid. The pitch is the height or distance of one complete turn of the helix. In the same way that we can determine the height of a one-story staircase by adding up the height of the stairs, we can figure out the pitch of the helix by determining the rise, or distance gained by each capsid subunit. A staircase with 20 stairs that are each 6 inches tall results in a staircase of 10 feet in height; a virus with 16.3 subunits per turn and a rise of 0.14 nm for each subunit results in a pitch of 2.28 nm. This is the architecture of tobacco mosaic virus.

Of the two major capsid structures, the icosahedron is by far more prevalent than the helical architecture. In comparison to a helical virus where the capsid proteins wind around the nucleic acid, the genomes of icosahedral viruses are packaged completely within an icosahedral capsid that acts as a protein shell. Initially these viruses were thought to be spherical, but advances in electron microscopy and X-ray crystallography revealed these were actually icosahedral in structure.

An icosahedron is a geometric shape with 20 sides (or faces), each composed of an equilateral triangle. An icosahedron has what is referred to as 2-3-5 symmetry, which is used to describe the possible ways that an icosahedron can rotate around an axis. If you hold an icosahedral die in your hand, you will notice there are different ways of rotating it (Fig. 2.6). Let's say you looked straight on at one of the edges of the icosahedron and poked an imaginary pencil through the middle of that edge. Your pencil would be right in the middle of a triangle facing up and a triangle facing down. If you rotate the icosahedron clockwise, you will find that in 180 degrees you encounter the same arrangement (symmetry): a triangle facing up and a triangle facing down. Continuing to rotate the icosahedron brings you back to where you began. This is known as the twofold axis of symmetry, because as you rotate the shape along this axis (your pencil), you encounter your starting structure twice in one revolution: once when you begin, and again when rotated 180 degrees. On the other hand, if you put your pencil axis directly through the center of one of the small triangle faces of the icosahedron, you will encounter the initial view two additional times as you rotate the shape, for a total of three times. This is the threefold axis. Similarly, if your pencil axis goes through a vertex (or tip) of the icosahedron, you will find symmetry five times in one rotation, forming the fivefold axis. It is for this reason that an icosahedron is known to have 2-3-5 symmetry, because it has twofold, threefold, and fivefold axes of symmetry. This terminology is useful when dealing with an icosahedral virus because it can be used to indicate specific locations on the virus or where the virion has interactions with the cell surface.
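Two pieces of arithmetic in this part of the chapter, the helix pitch and the rotation axes of an icosahedron, can be checked with a short sketch. The pitch numbers come from the text. The edge and vertex counts (30 and 12) are standard geometry rather than something stated in the passage, and the function names are illustrative assumptions.

```python
import math


def helix_pitch(subunits_per_turn: float, rise_per_subunit: float) -> float:
    """Pitch = number of subunits in one turn x rise contributed by each subunit."""
    return subunits_per_turn * rise_per_subunit


# Icosahedron: 20 faces (from the text); 30 edges and 12 vertices are
# standard geometric facts assumed here, not quoted from the chapter.
FACES, EDGES, VERTICES = 20, 30, 12


def rotation_axes() -> dict:
    """Count axes by pairing opposite features: each axis passes through two
    opposite edge midpoints (twofold), face centers (threefold), or
    vertices (fivefold)."""
    return {2: EDGES // 2, 3: FACES // 2, 5: VERTICES // 2}


def rotation_step_degrees(fold: int) -> float:
    """Angle between successive identical views around an n-fold axis."""
    return 360 / fold


if __name__ == "__main__":
    # Staircase analogy: 20 stairs x 6 inches, converted to feet.
    print("staircase height (ft):", helix_pitch(20, 6) / 12)
    # Tobacco mosaic virus values from the text; about 2.28 nm.
    print("TMV pitch (nm):", round(helix_pitch(16.3, 0.14), 2))
    # Euler's formula (V - E + F = 2) as a sanity check on the assumed counts.
    print("Euler characteristic:", VERTICES - EDGES + FACES)
    for fold, count in rotation_axes().items():
        print(f"{count} {fold}-fold axes, repeating every "
              f"{rotation_step_degrees(fold):.0f} degrees")
```

The 180-degree step for the twofold axis matches the pencil-through-an-edge description in the text.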
For instance, if a virus interacts with a cell surface receptor at the threefold axis, then you know this interaction occurs at one of the faces of the icosahedron. A protein protruding from the capsid at the fivefold axis will be found at one of the vertices (tips) of the icosahedron. All of the illustrations of viruses in Fig. 2.7 are viewed on the twofold axis of symmetry.

How many twofold axes of symmetry are found in one icosahedron? How about the number of threefold or fivefold axes? How many faces, edges, and vertices are found in an icosahedron?

Viral proteins form each face (small triangle) of the icosahedral capsid. Viral proteins are not triangular, however, and so one protein subunit alone is not sufficient to form the entire face. Therefore, a face is formed from at least three viral protein subunits fitted together (Fig. 2.8). These can all be the same protein, or they can be three different proteins. The subunits together form what is called the structural unit. The structural unit repeats to form the capsid of the virion.

But how can some viruses form very large icosahedral capsids? The answer is repetition. The structural unit can be repeated over and over again to form a larger icosahedron side. The number of structural units that creates each side is called the triangulation number (T), because the structural units form the triangle face of the icosahedron. In a T = 1 virus, only one structural unit forms each icosahedron face (Fig. 2.8). In a T = 4 virus, four structural units form the face. Sometimes the structural unit overlaps from one face to another: in a T = 3 virus, three total structural units form the face, although this occurs as six half-units (half of each structural unit forms part of an adjacent face). Similarly, the structural units of a T = 7 virus are also slightly skewed, compared to the triangle face. The geometry and math involved with icosahedral capsid structure can be complex, and only the very basics are described here.
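The relationship between the triangulation number and capsid size can be sketched as simple multiplication. The 20 faces, the T structural units per face, and the three subunits per structural unit come from the text. The rule that T must equal h² + hk + k² for non-negative integers h and k is the standard Caspar-Klug result, which the passage does not state and is included here as an assumption. All function names are illustrative.

```python
FACES = 20  # an icosahedron has 20 triangular faces


def total_structural_units(t_number: int) -> int:
    """T structural units form each of the 20 faces."""
    return FACES * t_number


def total_subunits(t_number: int, subunits_per_unit: int = 3) -> int:
    """Total protein subunits in the capsid, assuming each structural unit
    is made of `subunits_per_unit` proteins (three in the text's example)."""
    return total_structural_units(t_number) * subunits_per_unit


def valid_triangulation_numbers(limit: int) -> list:
    """Triangulation numbers up to `limit` allowed by the Caspar-Klug rule
    T = h*h + h*k + k*k (an assumption not stated in the chapter)."""
    values = set()
    for h in range(limit + 1):
        for k in range(limit + 1):
            t = h * h + h * k + k * k
            if 1 <= t <= limit:
                values.add(t)
    return sorted(values)


if __name__ == "__main__":
    for t in (1, 3, 4, 7):
        print(f"T = {t}: {total_structural_units(t)} structural units, "
              f"{total_subunits(t)} subunits")
    print("allowed T up to 25:", valid_triangulation_numbers(25))
```

Under these assumptions a T = 1 capsid contains 60 subunits and a T = 3 capsid contains 180, which shows how a capsid grows without any new proteins being encoded.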
In any case, by increasing the number of identical structural units on each face, the icosahedron can become progressively larger without requiring additional novel proteins to be produced. Some viruses have triangulation numbers over 25, even! The proteins that compose the structural unit may form three dimensional structures known as capsomeres that are visible in an electron micrograph. In icosahedral viruses, capsomeres generally take the form of pentons (containing five units) or hexons (containing six units) that form a visible pattern on the surface of the icosahedron (See Fig. 13.11 for an example). Capsomeres are morphological units that arise from the interaction of the proteins within the repeated structural units. Why does the icosahedral virus structure appear so often? Research has shown that proteins forming icosahedral symmetry require lesser amounts of energy, compared to other structures, and so this structure is evolutionarily favored. Many viruses that infect animals are icosahedral, including human papillomavirus, rhinovirus, hepatitis B virus, and herpesviruses (Fig. 2.9 ). Like their helical counterparts, icosahedral viruses can be naked or enveloped, as well. The type of viral nucleic acid (dsDNA, ssDNA, dsRNA, and ssRNA) does not correlate with the structure of the capsid; icosahedral viral capsids can contain any of the nucleic acid types, depending upon the virus. The majority of viruses can be categorized as having helical or icosahedral structure. A few viruses, however, have a complex architecture that does not strictly conform to a simple helical or icosahedral shape. Poxviruses, geminiviruses, and many bacteriophages are examples of viruses with complex structure (Fig. 2.10) . Poxviruses, including the viruses that cause smallpox or cowpox, are large oval or brick-shaped particles 200-400 nm long. 
Inside the complex virion, a dumbbell-shaped core encloses the viral DNA and is surrounded by two "lateral bodies," the function of which is currently unknown. The geminiviruses also exhibit complex structure. As their name suggests, these plant-infecting viruses are composed of two icosahedral heads joined together. Bacteriophages, also known as bacterial viruses or prokaryotic viruses, are viruses that infect and replicate within bacteria. Many bacteriophages also have complex structure, such as bacteriophage P2, which has an icosahedral head, containing the nucleic acid, attached to a cylindrical tail sheath that facilitates binding of the bacteriophage to the bacterial cell.

The classification of viruses is useful for many reasons. It allows scientists to contrast viruses and to reveal information on newly discovered viruses by comparing them to similar viruses. It also allows scientists to study the origin of viruses and how they have evolved over time. The classification of viruses is not simple, however: there are currently over 2800 different viral species with very different properties!

One classification scheme was developed in the 1970s by Nobel laureate David Baltimore. The Baltimore classification system categorizes viruses based on the type of nucleic acid genome and replication strategy of the virus. The system also breaks down single-stranded RNA viruses into those that are positive strand (+) and negative strand (−). As will be further discussed in the next chapter, positive-strand (also positive-sense or plus-strand) RNA is able to be immediately translated into proteins; as such, messenger RNA (mRNA) in the cell is positive strand. Negative-strand (also negative-sense or minus-strand) RNA is not translatable into proteins; it first has to be transcribed into positive-strand RNA. Baltimore also took into account viruses that are able to reverse transcribe, or create DNA from an RNA template, which is something that cells are not capable of doing.
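The decision logic of the Baltimore system described above can be sketched as a small function that takes the genome type, strandedness, strand sense, and whether the virus reverse transcribes. The Roman numerals I-VII are the conventional class labels and are an addition not spelled out in this passage. The `Genome` class and `baltimore_class` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Genome:
    nucleic_acid: str                  # "DNA" or "RNA"
    strands: str                       # "ss" (single) or "ds" (double)
    sense: Optional[str] = None        # "+" or "-" (only needed for ssRNA)
    reverse_transcribes: bool = False  # replicates via reverse transcription


def baltimore_class(genome: Genome) -> str:
    """Return the conventional Baltimore class label (I-VII) for a genome."""
    key = (genome.nucleic_acid, genome.strands)
    if genome.reverse_transcribes:
        if key == ("RNA", "ss"):
            return "VI"   # RNA viruses that reverse transcribe
        if key == ("DNA", "ds"):
            return "VII"  # DNA viruses that reverse transcribe
        raise ValueError("no reverse-transcribing class for this genome type")
    if key == ("DNA", "ds"):
        return "I"
    if key == ("DNA", "ss"):
        return "II"
    if key == ("RNA", "ds"):
        return "III"
    if key == ("RNA", "ss"):
        if genome.sense == "+":
            return "IV"   # positive strand: directly translatable
        if genome.sense == "-":
            return "V"    # negative strand: must be transcribed first
        raise ValueError("ssRNA genomes need a sense of '+' or '-'")
    raise ValueError("unrecognized genome description")


if __name__ == "__main__":
    print(baltimore_class(Genome("DNA", "ds")))             # dsDNA genome
    print(baltimore_class(Genome("RNA", "ss", sense="-")))  # negative-strand ssRNA
    print(baltimore_class(Genome("RNA", "ss", sense="+", reverse_transcribes=True)))
```

Checking reverse transcription first reflects the fact that it overrides the plain genome-type categories in this scheme.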
Together, the seven classes are dsDNA viruses, ssDNA viruses, dsRNA viruses, positive-strand ssRNA viruses, negative-strand ssRNA viruses, RNA viruses that reverse transcribe, and DNA viruses that reverse transcribe.

There are a variety of ways by which viruses could be classified, however, including virion size, capsid structure, type of nucleic acid, physical properties, host species, or disease caused. Because of this formidable challenge, the International Committee on Taxonomy of Viruses (ICTV) was formed and has been the sole body charged with classifying viruses since 1966. Taxonomy is the science of categorizing and assigning names (nomenclature) to organisms based on similar characteristics, and the ICTV utilizes the same taxonomical hierarchy that is used to classify living things. It is important to note that viruses, since they are not alive, belong to a completely separate system that does not fall under the tree of life. Whereas a living organism is classified using domain, kingdom, phylum, class, order, family, genus, and species taxa (singular: taxon), or categories, viruses are only classified using order, family, genus, and species (Table 2.1).

The ICTV classifies viruses based upon a variety of different characteristics with the intention of categorizing the most similar viruses with each other. The chemical and physical properties of the virus are considered, such as the type of nucleic acid or number of different proteins encoded by the virus. DNA technologies now allow us to sequence viral genomes relatively quickly and easily, allowing scientists to compare the nucleic acid sequences of two viruses to determine how closely related they are. Other virion properties are also taken into account, including virion size, capsid shape, and whether or not an envelope is present. The taxa of viruses that infect vertebrates are shown in Fig. 2.11; notice that some families are not yet classified into orders (refer to Table 2.1 for a refresher on how to distinguish the taxa by their suffixes). Also note the size difference between viruses of different families.
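Distinguishing taxa by their suffixes, as mentioned above, can be sketched as a simple lookup. Table 2.1 is not reproduced in this text, so the suffixes below (-virales for orders, -viridae for families, -virus for genera) are assumed from standard ICTV naming conventions, and `taxon_rank` is a hypothetical helper. Species names have no single suffix, so multiword names are reported separately.

```python
# Suffix-to-rank pairs assumed from standard ICTV naming conventions;
# they are not taken from the chapter's Table 2.1.
SUFFIX_TO_RANK = [
    ("virales", "order"),
    ("viridae", "family"),
    ("virus", "genus"),
]


def taxon_rank(name: str) -> str:
    """Guess the taxonomic rank of a virus taxon name from its suffix."""
    cleaned = name.strip()
    if " " in cleaned:
        # Multiword names (e.g. "Tobacco mosaic virus") are species or
        # common names, which do not follow a single-suffix rule.
        return "species or common name"
    lowered = cleaned.lower()
    for suffix, rank in SUFFIX_TO_RANK:
        if lowered.endswith(suffix):
            return rank
    return "unknown"


if __name__ == "__main__":
    for name in ("Herpesvirales", "Herpesviridae", "Simplexvirus",
                 "Tobacco mosaic virus"):
        print(f"{name}: {taxon_rank(name)}")
```

This is only a naming heuristic; actual classification relies on the virion and genome properties described in the text.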
Currently, the ICTV has categorized seven orders of viruses (Table 2.2). Of the 103 recognized virus families, 26 are classified within these orders; the remaining seventy-seven families have yet to be assigned to an order, including notable viruses such as the retroviruses, papillomaviruses, and poxviruses. New orders have been proposed, and it is likely that more will be created as the taxonomical process continues.

Herpesvirales: dsDNA viruses of vertebrates and invertebrates; from Greek herpes, meaning "creeping" or "spreading" (describing the rashes of these viruses).
Ligamenvirales: dsDNA viruses that infect the domain Archaea; from Latin ligamen, meaning "thread" or "string" (describing the linear structure of the viruses). Newest order, created in 2012.
Mononegavirales: "Negative-strand" ssRNA viruses of vertebrates, invertebrates, and plants; name derives from Latin for "one negative," referring to the single negative-strand RNA genome. Was the first order created, in 1990.
Nidovirales: "Positive-strand" ssRNA viruses of vertebrates and invertebrates; from Latin nidus meaning "nest" because they encode several proteins nested within one piece of mRNA.
Picornavirales: "Positive-strand" ssRNA viruses of vertebrates, invertebrates, and plants; from pico (small) + RNA + virales (viruses).
Tymovirales: "Positive-strand" ssRNA viruses of plants and invertebrates; Tymo is an acronym standing for Turnip Yellow Mosaic virus, found within this order.
ds, double-stranded; ss, single-stranded.

Section 2.1 Common Characteristics of Viruses
• Viruses are small. Most viruses are in the range of 20-200 nm, although some viruses can exceed 1000 nm in length. A typical bacterium is 2-3 μm in length; a typical eukaryotic cell is 10-30 μm in diameter.
• Viruses are obligate intracellular parasites and are completely dependent upon the cell for replication.
Unlike cells that undergo mitosis and split in two, viruses completely disassemble within the cell and new virions (infectious particles) are assembled de novo from newly made components.
• While living things have dsDNA genomes, the genetic material of viruses can be composed of DNA or RNA, and can be single- or double-stranded. Most virus genomes fall within the range of 7-20 kb, but they range from 3 kb to over 2 Mb.

Section 2.2 Structure of Viruses
• The simplest viruses are composed of a protein capsid that protects the viral nucleic acid from the harsh environment outside the cell.
• Virus capsids are predominantly one of two shapes, helical or icosahedral, although a few viruses have a complex architecture. In addition, some viruses also have a lipid membrane envelope, derived from the cell. All helical animal viruses are enveloped.
• Helical capsid proteins wind around the viral nucleic acid to form the nucleocapsid. A helix is mathematically defined by amplitude and pitch.
• An icosahedron is a geometric shape with 20 sides, each composed of an equilateral triangle. The sides are composed of viral protein subunits that create a structural unit, which is repeated to form a larger side and the other sides of the icosahedron.
The triangulation number refers to the number of structural units per side.
• The Baltimore classification system categorizes viruses based upon the type and replication strategy of the nucleic acid genome of the virus. There are seven classes.
• The ICTV was formed to assign viruses to a taxonomical hierarchy. The taxa used for classifying viruses are order, family, genus, and species. Because they are not alive, viruses are not categorized within the same taxonomical tree as living organisms.

4. Explain what 2-3-5 symmetry is, pertaining to an icosahedron.
5. What is a structural unit? In a T = 3 virus that has three subunits per structural unit, how many total subunits form the capsid?
6. List the seven groups of the Baltimore classification system.
7. What taxa are used to classify viruses? How does this differ from the classification of a living organism?
8. What viral properties are used to classify viruses?