key: cord-303265-v6ci69n0 authors: Domingo, Esteban title: Introduction to virus origins and their role in biological evolution date: 2019-11-08 journal: Virus as Populations DOI: 10.1016/b978-0-12-816331-3.00001-5 sha: doc_id: 303265 cord_uid: v6ci69n0

Viruses are diverse parasites of cells and extremely abundant. They might have arisen during an early phase of the evolution of life on Earth dominated by ribonucleic acid or RNA-like macromolecules, or when a cellular world was already well established. The theories of the origin of life on Earth shed light on the possible origin of primitive viruses or virus-like genetic elements in our biosphere. Some features of present-day viruses, notably error-prone replication, might be a consequence of the selective forces that mediated their ancestral origin. Two views on the role of viruses in our biosphere predominate: viruses considered as opportunistic, selfish elements, and viruses considered as active participants in the construction of the cellular world via the lateral transfer of genes. These two models have a bearing on viruses being considered predominantly as disease agents or predominantly as cooperators in the shaping of differentiated cellular organisms. To approach the behavior of viruses acting as populations, we must first examine the diversity of the present-day biosphere and the physical and biological context in which primitive viral forms might have arisen.

Evolution pervades nature. Thanks to new theories and to the availability of powerful instruments, new experimental procedures, and increasing computing power (which together constitute the very roots of scientific progress), we know that the physical and biological worlds are constantly evolving. Several classes of energy have gradually shaped matter and living entities, basically as the outcome of random events and Darwinian natural selection in its broadest sense.
The identification of DNA as the genetic material and the advent of genomics in the second half of the twentieth century unveiled an astonishing degree of diversity within the living world that derives mainly from combinations of four classes of nucleotides. Biodiversity, a term coined by O. Wilson in 1984 and emphasized by T. Lovejoy and others, is a feature of all living beings, be they differentiated multicellular organisms, single-cell organisms, or subcellular genetic elements, among them the viruses. Next-generation sequencing methods developed at the beginning of the 21st century allow thousands of sequences from the same biological sample (a microbial community in a soil or ocean sample, a tumor, or an infected host) to be determined. These procedures have documented the presence of myriads of variants in a "single biological entity" or in "communities of biological entities." Differences extend to individuals that belong to the same biological group, be it Homo sapiens, Drosophila melanogaster, Escherichia coli, or human immunodeficiency virus type 1. No exceptions have been described. Diversity is extensive and not restricted to the genotypic level; it also affects phenotypic traits. For decades, in the first half of the twentieth century, population genetics had as one of its tenets that genetic variation due to mutation had for the most part originated in a remote past. It was generally thought that present-day diversity was essentially brought about by the reassortment of chromosomes during sexual reproduction. This view was weakened by the discovery of extensive genetic polymorphisms, first in Drosophila and humans, through secondary analyses of the electrophoretic mobility of enzymes, detected by in situ activity assays to yield zymograms that were displayed as electromorphs. These early studies on allozymes were soon extended to other organisms.
Assuming that no protein modifications had specifically occurred in some individuals, the results suggested the presence of several different (allelic) forms of a given gene among individuals of the same species, be it humans, insects, or bacteria. In the absence of information on DNA nucleotide sequences, the first estimates of heterogeneity from the numbers of electromorphs were collated with the protein sequence information available. An excellent review of these developments (Selander, 1976) ended with the following premonitory sentence on the role of molecular biology in unveiling evolutionarily relevant information: "Considering the magnitude of this effect, we may not be overfanciful to think that future historians will see molecular biology more as the salvation for than, as it first seemed, the nemesis of evolutionary biology." The conceptual break was confirmed and accentuated when molecular cloning and nucleotide sequencing techniques produced genomic nucleotide sequences from multiple individuals of the same biological species. Variety has shaken our classification schemes, opening a debate on how to define and delimit biological "species" in the microbial world. From a medical perspective, it has opened the way to "personalized" medicine, so different are the individual contexts in which disease processes (infectious or other) unfold. Diversity is a general feature of the biological world, with multiple implications for interactions in the environment, and also for human health and disease (Bernstein, 2014). Viruses (from the Latin "virus," poison) are no exception regarding diversity. The number of different viruses and their dissimilarity in shape and behavior is astounding. Current estimates indicate that the total number of virus particles in our biosphere reaches 10^32, exceeding by one order of magnitude the total number of cells.
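The order-of-magnitude figures quoted in this chapter (the particle count above, and the length of a hypothetical string of all viruses given in Box 1.1) can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original text; the variable names are illustrative, and the light-year conversion factor is the standard value rather than a figure taken from the chapter.

```python
# Cross-check of order-of-magnitude figures quoted in the chapter (Box 1.1).
# Variable names are illustrative; only the numeric inputs come from the text.
LIGHT_YEAR_M = 9.4607e15      # metres per light-year (standard conversion, an added assumption)

total_particles = 1e32        # estimated virus particles in the biosphere
particle_to_cell_ratio = 10   # particles exceed cells by one order of magnitude
string_length_ly = 2e8        # quoted length of a string of all viruses, in light-years

# Number of cells implied by the particle count and the quoted ratio.
total_cells = total_particles / particle_to_cell_ratio

# Convert the quoted string length from light-years to metres.
string_length_m = string_length_ly * LIGHT_YEAR_M

print(f"implied number of cells: {total_cells:.0e}")
print(f"string length: {string_length_m:.2e} m")
```

The converted length comes out close to the value of about 1.9 × 10^24 m quoted in the box, so the two units given there are mutually consistent.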
Viruses are found in surface and deep-sea and lake waters, below the Earth surface, in any type of soil, in deserts, and in most environments designated as extreme regarding ionic conditions (i.e., hypersaline) and temperature (thermophilic) (Breitbart et al., 2004; Villarreal, 2005; Lopez-Bueno et al., 2009; Box 1.1). The viruses that have been studied are probably a minimal and biased representation of those that exist, with at least a hundred thousand mammalian viruses awaiting discovery (Anthony et al., 2013; Epstein and Anthony, 2017). This is because high-throughput screening procedures have only recently become available, and also because prevention of disease has provided the main incentive to study viruses. Disease-associated viruses are those most described in the scientific literature. Current virology poses some general and fascinating questions, which are not easily approachable experimentally. Here are some:

• What is the origin of viruses?
• Did they originate before or after a cellular world was in place?
• What selective forces have maintained multiple viruses as parasites of unicellular and multicellular organisms?
• Do viruses exist essentially as selfish parasites, or have they played constructive roles in the biosphere?
• Why have a few viral forms not outcompeted most other forms? Or have they?
• Have viruses been maintained as modulators of the population numbers of their host species?
• Does virus variation play a role in the unfolding of viral disease processes?
• Is the quasispecies dynamics that characterizes RNA and many DNA viruses a remnant of an adaptive strategy that presided over all life forms in the remote past?
• Is the behavior of present-day viruses at the population level only an inheritance of their origins or a present-day necessity?

This book deals with some of these issues, mainly those that are amenable to experimental testing.
Topics covered include molecular mechanisms of genetic variation, with emphasis on high mutation rates, Darwinian principles acting on viruses, quasispecies dynamics and its implications, consequences for virus-host interactions, fitness as a relevant parameter, experimental model systems in cell culture, ex vivo and in vivo, long-term virus evolution, the current situation of antiviral strategies to confront quasispecies swarms, and conceptual extensions of quasispecies to nonviral systems. These subjects have as a common thread that Darwinian natural selection has an immediate imprint on them, observable in the time scale of days or even hours. The capacity for rapid evolution displayed by viruses represents an unprecedented and often underappreciated development in biology: the direct observation of Darwinian principles at play within short times.

Box 1.1
• Total number of viral particles: ~10^32. This is 10 times more than cells, and they are equivalent to 2 × 10^8 tons of carbon.
• Virus particles in 1 cm^3 of seawater: ~10^8.
• Virus particles in 1 m^3 of air: ~2 × 10^6 to 40 × 10^6.
• Rate of viral infections in the oceans: ~1 × 10^23/s.
• A string with the viruses on Earth would be about 2 × 10^8 light-years long (~1.9 × 10^24 m). This is the distance from Earth of the galaxy clusters Centaurus, Hydra, and Virgo.

Evolution is defined as a change in the genetic composition of a population over time. In this book evolution will be used in its broader sense to mean any change in the genetic composition of a virus over time, irrespective of the time frame involved, and the transience of the change. We treat as evolution both the genome variation that poliovirus has undergone from the middle of the 20th century to present days and the changes that the same virus undergoes within an infected individual.
It is remarkable that only a few decades ago virus evolution (or for that matter microbial evolution in general) was not considered a significant factor in viral pathogenesis. Evolution was largely overlooked in the planning of strategies for microbial disease control. A lucid historical account of the different perceptions of virus evolution, including early evidence of phenotypic variation of viruses, with emphasis on the impact of the complexity of RNA virus populations, was written by J.J. Holland (2006). The present book was partly stimulated by the conviction that the concept of complexity, despite having been largely ignored by virologists, is pertinent to the understanding of viruses at the population level, having direct connections with viral disease and disease control. Despite pleomorphism in cells and viruses (presence or not of envelopes, and viruses being spherical or even displaying a lemon-like shape, or being elongated), the size of viral particles and their host cells tends to be commensurate with the amount of genetic material that they contain and transmit to progeny (Fig. 1.1). The 2017 report from the International Committee on Taxonomy of Viruses (ICTV) divides viruses into 10 Orders, and each of them is subdivided into several families, subfamilies, genera, species, and isolates (https://talk.ictvonline.org/), and each isolate includes a multitude of variants. The task of classifying viruses meets with considerable hurdles and requires periodic revisions by the ICTV, an organization whose role has been essential to provide conceptual order in the vast viral world. One of its objectives is the assignment of newly discovered viruses to the adequate group. A remarkable number of isolates remain unclassified, an echo of the natural diversity of viruses, even among the limited subset that has been isolated and characterized.

FIGURE 1.1 Representative average diameter values and genome complexity of viruses and some cell types. Diameters are expressed in microns (μm), length of DNA in base pairs (bp), and of RNA in nucleotides (nt). Viral genomes can be linear, circular, diploid, segmented, or bipartite (multipartite in general; genome segments encapsidated in separate particles); in the latter case at least two particles, each with one kind of genomic segment, must infect the same cell for progeny production. The bottom boxes describe four groups of viruses according to the type of nucleic acid that acts as replicative intermediate.

Viruses can be divided into two broad groups: those that have RNA as genetic material, termed the RNA viruses, and those that have DNA as genetic material, termed the DNA viruses. They can have linear, circular, or segmented genomes of single-stranded or double-stranded nucleic acid (Fig. 1.1). All evidence suggests that the polynucleotide chain (or chains) that constitute the viral genome has all the information to generate infectious progeny in a cell, as evidenced by the production of infectious poliovirus from synthetic DNA copies assembled to represent the genomic nucleotide sequence (Cello et al., 2002). With regard to the concepts of genome stability versus variation addressed in this book, it is helpful to divide viruses into four groups, depending on whether DNA or RNA is the genetic material and which of the two acts as a replicative intermediate in the infected cell (bottom gray shaded boxes in Fig. 1.1). The nucleic acids written in the four schemes are those involved in the flow of genetic information (indicated by arrows), not in gene expression, since all of them use messenger RNA (mRNA) for virus-specific protein synthesis. Mistakes in the form of misincorporation of nucleotides during the replication steps indicated by arrows are transmitted to progeny genomes. RNAs produced by transcription to serve solely as mRNAs are essential for gene expression and virus multiplication, but misincorporations in such transcripts are not transmitted to progeny.
It could be considered that some mRNA molecules when synthesized from the corresponding RNA or DNA template may acquire mutations and that these mutated molecules (e.g., an mRNA encoding a viral polymerase), when translated, may produce a polymerase with lower copying fidelity that will evoke additional mutations; we will ignore this possibility since a single mRNA molecule should have a rather limited contribution to the overall genetic variation of a replicating virus population with thousands of polymerase and template molecules in the replication complexes (or replication factories). Group 1 (with a replicative scheme abbreviated as RNA/RNA) includes RNA viruses whose genomic replication cycle involves only RNA. They are sometimes called riboviruses. Examples are the influenza viruses, hepatitis A and C viruses, poliovirus, coronaviruses, Ebola virus, foot-and-mouth disease virus, or tobacco mosaic virus, among many other important human, animal, and plant pathogens. Their replication is catalyzed by an RNA-dependent RNA polymerase (RdRp) encoded in the viral genome, often organized as a replication complex with viral and host proteins in cellular membrane structures. Group 2 (RNA/DNA/RNA) comprises the retroviruses [such as HIV-1, the acquired immune deficiency syndrome (AIDS) virus, and several tumor viruses] that retrotranscribe their RNA into DNA. Retrotranscription is catalyzed by reverse transcriptase (RT), an RNA-dependent DNA polymerase encoded in the retroviral genome. It reverses the first step in the normal flow of expression of genetic information from DNA to RNA to protein, once known as the dogma of molecular biology. This enzyme was instrumental in the understanding of cancer, and for genetic engineering and the origin of modern biotechnology. As a historical account of the impact of H. Temin's work (codiscoverer of RT with D. Baltimore), the reader is referred to Cooper et al. (1995) . 
Retroviruses include a provirus stage in which the viral DNA is integrated into host DNA. When silently installed in cellular DNA, the viral genome behaves mostly as a cellular gene. Group 3 (DNA/DNA) contains most DNA viruses, such as herpesviruses, poxviruses, iridoviruses, and papillomaviruses, and the extremely large viruses of amebae (e.g., Mimivirus, Megavirus, or Pandoravirus, generically termed giant viruses) (La Scola et al., 2008; Colson et al., 2017). Their replication is catalyzed by a DNA-dependent DNA polymerase either encoded in the viral genome or in the cellular DNA. Cellular DNA polymerases are involved in the replication of DNA viruses that do not encode their own DNA polymerase. Finally, Group 4 (DNA/RNA/DNA) includes viruses which, despite having DNA as genetic material, produce an RNA as a replicative intermediate, the most significant examples being the human and animal hepatitis B viruses (HBVs), termed hepadnaviruses, and the cauliflower mosaic virus of plants. Most viruses, from the more complex DNA viruses [i.e., 1200 Kbp (Kbp = 1000 base pairs) for the ameba Mimivirus, 752 Kbp for some tailed bacteriophages, and up to 370 Kbp for poxviruses, iridoviruses, and herpesviruses], the virophages that are parasites of the giant DNA viruses, the simplest DNA viruses (the circular single-stranded 1760 residue DNA of porcine circovirus), RNA bacteriophages (4220 nucleotides of ssRNA for bacteriophage Qβ), or subviral elements (viroids, virusoids, satellites, and helper-dependent defective replicons) show remarkable genetic diversity. However, RNA viruses that replicate entirely via RNA templates (Group 1 in Fig. 1.1), retroviruses (Group 2), and the hepadnaviruses (Group 4) display salient genetic plasticity, mainly in the way of a high rate of introduction of point mutations (Chapter 2).
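The four replicative schemes described above lend themselves to a small data-structure sketch. The Python below is illustrative only and is not taken from the book: the class and field names are hypothetical, and the example viruses are a subset of those named in the text. It encodes each group by the nucleic acids in its flow of genetic information and lists the groups whose scheme passes through RNA at any step.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ReplicativeGroup:
    """One of the four groups of Fig. 1.1 (names here are hypothetical)."""
    number: int
    scheme: Tuple[str, ...]    # nucleic acids in the flow of genetic information
    examples: Tuple[str, ...]  # a few viruses named in the text

    @property
    def has_rna_step(self) -> bool:
        # True when RNA appears anywhere in the replicative scheme.
        return "RNA" in self.scheme


GROUPS = (
    ReplicativeGroup(1, ("RNA", "RNA"), ("poliovirus", "influenza viruses")),
    ReplicativeGroup(2, ("RNA", "DNA", "RNA"), ("HIV-1",)),
    ReplicativeGroup(3, ("DNA", "DNA"), ("herpesviruses", "poxviruses")),
    ReplicativeGroup(4, ("DNA", "RNA", "DNA"), ("hepatitis B viruses",)),
)

# Groups with RNA somewhere in the scheme.
rna_groups = [g.number for g in GROUPS if g.has_rna_step]
print(rna_groups)
```

The selection returns Groups 1, 2, and 4, the same groups singled out in the text for their high rate of point mutations; the sketch captures only the classification and says nothing about actual mutation rates.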
The mutability of these viruses may be an inheritance of the universal flexibility that probably characterized primitive RNA or RNA-like molecules, thought to have populated an ancestral RNA world at an early stage of life on Earth (Section 1.4.2). Thus, the presence of RNA at any place in the replicative schemes (Groups 1, 2, and 4 in Fig. 1.1) implies error-prone replication and the potential for very rapid evolution. "Potential" must be underlined because high error rates do not necessarily result in rapid long-term evolution in nature (Chapter 7). The extent of genetic variation and its biological consequences have been less investigated for DNA viruses than for RNA viruses. The available data suggest that DNA viruses are closer to RNA viruses than suspected only a few years ago, regarding their capacity for variation and adaptation. This is particularly true of the single-stranded DNA viruses of animals and plants. Evolutionary theory predicts that high-fidelity polymerase machinery is necessary to maintain the informational stability of complex genomes (those that carry a large amount of genetic information). This necessity is met by proofreading-repair and postreplicative-repair activities that assist the replicative DNA polymerases and their cellular and viral DNA progeny. Our current capacity to sample many thousands of viral genomes in short times (a trend that is continuously expanding) is revealing an astonishing number of slightly different viral genomes within a single infected host, and even within an organ or within individual cells of an organ. Intrahost diversity of viruses can be the result not only of diversification within the host but also of coinfection with different viruses (or variants of one virus), or an infection that triggers reactivation of a related or unrelated virus from a latent reservoir, or combined effects of these mechanisms.
In turn, interhost (long-term) virus diversification can result from selection acting on variants generated by mutation, recombination or reassortment, and random sampling events (independent of selection) within hosts and in host-to-host transmission, or their combined effects (Chapters 2 and 3). A different picture of diversity is obtained by comparing the morphological characteristics of viral particles (termed virions). The hundreds of thousands of bacterial and archaeal viruses that have been recognized can be assigned to as few as 20 morphotypes. The capsids of nonenveloped (naked) viruses display helical or icosahedral symmetry that determines the architecture of the virion. Variation in size and surface protein distributions can be attained from limited protein folds and the same symmetry principles (Mateu, 2013; Fig. 1.2). Divergent primary amino acid sequences in proteins can fold in closely related structures. The "structural space" available to virus particles is much more restricted than the "sequence space" available to viral genomes (Abrescia et al., 2012). Sequence space and its mapping into a phenotypic space are key concepts for the understanding of evolutionary mechanisms (Chapter 3). Three-dimensional structures of entire virions or their constituent proteins can provide an overview of phylogenetic lineages and evolutionary steps in cases in which the information cannot be attained by viral genomics (Ravantti et al., ). Yet, minor genetic modifications that do not affect the phylogenetic position of a virus or the structure of the encoded viral proteins in any substantial manner can nevertheless have major consequences for traits as important as host range or pathogenicity. How such minor changes in viruses can have major biological consequences may relate to the historical role of viruses in an evolving, ancestral, pre-cellular biosphere. To further address this issue, we need to examine how viruses may have originated.
This, in turn, begs the question of the origin of life and the possible involvement of viruses in early life development. An understanding of the mechanisms involved in the origin of life may help in penetrating into the origin of previral entities, defined as the precursors of the viruses we isolate in modern times. Different notions on the origin of life have been held in human history, often linked to religious debate. Opinions have ranged from a conviction of the spontaneous and easy generation of life from inanimate materials, or its beginning from a unique and rare combination of small prebiotic molecules, or being the result of a lengthy prebiotic process, or its inevitability as the outcome of the evolution of matter in our universe (or "sets" of universes with the adequate physical parameters, according to some cosmological models). As little as 150 years ago (a time not distant from the discovery of the first viruses), there was a general belief in the spontaneous generation of life. This was somewhat paradoxical because chemists of the seventeenth century divided chemistry into mineral chemistry, vegetal chemistry, and animal chemistry. J.J. Berzelius put together animal and plant chemistry and named the resulting discipline "organic" chemistry, which he distinguished from "inorganic" chemistry (Berzelius, 1806) . He formulated what was known as the central dogma of chemistry: "The generation of organic compounds from inorganic compounds, outside a living organism, is impossible." The classical experiments of L. Pasteur provided definitive proof that, at least under the prevailing conditions on present-day Earth, "life comes from life" (Pasteur, 1861) . He established what was considered the "central dogma of biology": "The generation of a whole living organism from chemical compounds, outside a living organism is impossible." 
The requirement of life to generate life was, however, extended to the belief that "living" and "nonliving" were two separate categories in the organization of matter, and that organic compounds could be synthesized only by living cells. This doctrine, called vitalism, dominated biology for almost a century, and in a modified manner, it continues today regarding the interpretation of mental activity in humans (matter and spirit as "substance dualism"). Historical views on the origin of life have been addressed in several publications (Rohlfing and Oparin, 1972; Bengtson, 1994; de Duve, 2002; Eigen, 2002, 2013; Lazcano, 2010). Dogmas generally do not last. "Vitalism" was shattered by the chemical synthesis of organic compounds from inorganic precursors (urea by F. Wöhler in 1828, acetic acid by H. Kolbe in 1845, hydrocarbons by D. Mendeleev in 1877, and several other compounds by M. Berthelot in the second half of the 19th century). The evidence that no "vital force" was needed for such syntheses led F.A. Kekulé to write in his classic textbook on organic chemistry published during 1859-60: "We have come to the conviction that ... no difference exists between organic and inorganic compounds." From then on, organic chemistry became the chemistry of "carbon compounds." We know now that living entities are made of the same chemical elements found in the mineral world. Of the 118 elements in the periodic table (in 2019), 59 are found in the human body. A key experiment carried out in 1953 by S. Miller, working with H.C. Urey, showed that components of biological molecules could be obtained from inorganic precursors. He mimicked the conditions thought to be prevalent in the primitive Earth, and mixed hydrogen (H2), ammonia (NH3), and methane (CH4) in a sealed reactor with an influx of water vapor.
Synthesis of a number of organic compounds occurred under the influence of electrical discharges. The de novo synthesized chemicals included amino acids (glycine, alanine, aspartic acid, and glutamic acid), formic, acetic, propionic and fatty acids, cyanide, and formaldehyde (Miller, 1953, 1987). Several researchers followed Miller's approach using other starting chemical mixes and confirmed that key components of the macromolecules that are associated with living materials (notably purines, pyrimidines, and amino acids) could be made from precursors that were abundant in the primitive Earth or its atmosphere. Today, variant versions of Miller's protocol (including additional starting chemicals, aerosol spread of chemicals, freeze-thaw cycles, different sources of energy, electron beams, etc.) produce interesting information on the synthesis of organic molecules (Dobson et al., 2000; Miyakawa et al., 2002; Bada and Lazcano, 2003; Ruiz-Mirazo et al., 2014). Intense ultraviolet (UV) irradiation may have contributed to the synthesis of compounds relevant to life: ammonia, methane, ethane, carbon monoxide, formaldehyde, sugars, nitric acid, and cyanide. Complex organic compounds (notably aromatic hydrocarbons and alcohols) are found in interplanetary dust, comets, asteroids, and meteorites, and they can be generated under the effect of cosmic and stellar radiation. Thus, many organic compounds could have been produced within the Earth atmosphere or away from it, and be transported to the Earth surface by meteorites, comets, or rain, to become the building blocks for additional life-prone organic molecules.
Places at which peptide bond formation and prebiotic evolution could have been favored are hydrothermal systems and the interface between the ocean and the atmosphere (Chang, 1994; Horneck and Baumstark-Khan, 2002; Ehrenfreund et al., 2011; Parker et al., 2011; Danger et al., 2012; Griffith and Vaida, 2012; Ritson and Sutherland, 2012; Nakashima et al., 2018). A key issue for prebiotic syntheses is the degree of oxidation of the primitive Earth atmosphere. Records of an early surface environment dated 3.8 billion years ago were found in metasediments of Isua, Greenland. These materials suggest that the surface temperature of the Earth was below 100 °C, with the presence of liquid and vapor water, and gases supplied by intense volcanism (CO2, SO2, and N2). The composition of primitive rocks, together with theoretical considerations, suggests a neutral redox composition of the Earth atmosphere (with relative gas abundances: N2, CO2 > CO ≫ CH4, H2O ≫ H2, SO2 > H2S) around the time of the origin of primeval forms of life. The possible presence of a general or localized reducing atmosphere (N2, CO > CH4 > CO2, H2O, ~H2, H2S > SO2) is still debated, but increasingly viewed as unlikely. In an oxidative atmosphere, yields of amino acids, nucleotides, and sugars would be lower. Either these diminished yields were sufficient to attain critical levels of relevant building blocks, or an earlier reducing atmosphere may have accumulated them, among other possibilities (Trail et al., 2011). In spite of the validity of "life comes from life" in the current Earth environment, the experimental facts suggest that there is no barrier for the generation of life from nonlife, provided suitable environmental conditions are met. In this line, A.I.
Oparin proposed that a "primitive soup" could well have been the cradle of life on Earth, as described in his famous treatise on the origin of life (Oparin, 1938), a concept that had already been sketched by C. Darwin. The "protein first" versus "nucleic acid first" alternative for the origin of life is still a contended issue (Falk and Lazcano, 2012). Although for decades there was a preference for nucleic acids due to their superior capacity for self-organization to perpetuate inheritable messages through base pairings, recently both views have accumulated arguments in their favor. An integrative "metabolism first" view based on mutually catalytic networks of small molecules displaying the capacity to replicate and evolve is gaining adherents (Lancet et al., 2018). The building blocks of nucleic acids have been more difficult to obtain from primeval chemicals than the building blocks of proteins. Peptides of about 20 amino acids in length could have been easily formed under prebiotic conditions (Fox and Dose, 1992), and peptides or their derived multimers had the potential to display catalytic activities at a proto-metabolic stage. Peptide amyloids have been proposed as an alternative to nucleic acids for the origin of life in what has been termed the amyloid-world hypothesis (Greenwald et al., 2018). Interestingly, amyloid aggregates display features of quasispecies, not in the way of mutant distributions found in viruses (Chapter 3) but as alternative protein conformations and conformation heterogeneity that confers functional diversity (see Chapter 10 on the collective behavior of prions). The protein-first and nucleic acid-first alternatives may not be incompatible. Short nucleotide and amino acid polymers might have cooperated to cross a complexity threshold for self-sustained replication to arise and evolve (Kauffman, 1993).
As soon as peptide- or protein-based catalytic activities developed, they had to be coordinated with oligo- or polynucleotide replication. This integration of genotypic information with its phenotypic expression may be achieved through hypercyclic couplings, as proposed by M. Eigen and P. Schuster (Eigen and Schuster, 1979; Eigen, 2013), to render incipient life forms sustainable. Substantiating the abiotic synthesis of nucleotide- or amino acid-based polymers has been arduous relative to the synthesis of monomeric organic molecules. However, early work by L. Orgel and his colleagues documented that polynucleotides could be synthesized from activated nucleotides in the absence of an enzyme (Miller and Orgel, 1974), and recent work has established that there are multiple chemical pathways for abiotic and nonenzymatic nucleotide synthesis (Adamala and Szostak, 2013; Ruiz-Mirazo et al., 2014). Mineral surfaces (clay minerals, zeolites, manganates, hydroxides, etc.) have been proposed as key participants in the origin of life, by providing scaffolds for the synthesis of nucleotide and amino acid polymers (Bernal, 1951; Gedulin and Arrhenius, 1994; Ruiz-Mirazo et al., 2014; Nakashima et al., 2018; Pedreira-Segade et al., 2018). The relevant features of clays are their adsorption power, ordered structure, capacity to concentrate organic compounds, and ability to serve as polymerization templates. Adsorption onto mineral surfaces may lower the activation energy of intermolecular reactions. Minerals with an excess positive charge on their surfaces might have played a role in the evolution of RNA-like or RNA-precursor molecules. It has been suggested that mineral-organic complexes (chimeras between materials once thought to belong to unmixable categories) could have been the first living organisms endowed with a genetic program (Cairns-Smith, 1965).
Bond formation and chain extension, which are needed to synthesize nucleic acids and proteins prebiotically, are inhibited in liquid water, thus favoring mineral surfaces as potential biogenic sites. Polymer formation could have also been facilitated by heating and drying, with key involvement of the dried phases in catalysis (Towe, 1994; Horneck and Baumstark-Khan, 2002). On different clay types, nucleotide and amino acid polymers of dozens of residues have been synthesized, providing a realistic scenario for a transition toward primitive self-replicating entities (Ferris et al., 1996). One of the first episodes of Darwinian positive selection could have operated prebiologically through differential surface binding, followed by the selection of autocatalytic heteropolymers (Tkachenko and Maslov, 2018). We arrive at the paradox that while primitive polymerization reactions might have required dry surfaces, later forms of life evolved in a water-rich environment, with water being the major component of cells. "Origin" and "establishment and evolution" of life have been considered either as linked processes that obey similar rules, or as distinct events whose investigation requires dissimilar approaches. A distinction between "origin" and "establishment" will become pertinent also when addressing the origin versus the evolution of viruses later in this chapter, and the mechanisms of viral disease emergence in Chapter 7. A simplified overview of a course of events that led to the origin of biological systems is depicted in Fig. 1.3. The prebiotic synthesis of potential building blocks (which might have been initiated earlier than 5000 million years ago) renders plausible the existence of a pre-RNA era that was then replaced by an RNA world in the late Hadean and early Archean periods on Earth.
This stage should have been followed by one in which RNA was complemented by DNA as a repository of genetic information (Bywater, 2012). Polymers other than DNA and RNA are also capable of encoding evolvable inheritable information (Pinheiro et al., 2012; Robertson and Joyce, 2012). Heterogeneous nucleic acid molecules (including mixtures of ribo- and deoxyribo-polymers) can give rise to functional nucleic acids (Gilbert, 1986; Lazcano, 1994a; Gesteland et al., 2006; Derr et al., 2011; Szostak, 2011; Trevino et al., 2011). RNA enzymes (ribozymes), such as RNA ligases, can evolve from random-sequence RNAs (Joyce, 2004; Sczepanski and Joyce, 2014). The critical polymerization reaction involves the formation of a phosphodiester bond and release of pyrophosphate (analogous to the reactions catalyzed by the present-day RdRps) and represents an incipient, primitive anabolism (Eigen, 1992; Lazcano, 1994a; Orgel, 2002, 2004; Dworkin et al., 2003; Joshi et al., 2011). In support of a possible link between catalytic RNA activities and solid mineral surfaces in the origin of life (as summarized in Section 1.4.1), the catalytic activity of the hammerhead ribozyme of the Avocado Sun Blotch viroid was maintained when bound to the clay mineral montmorillonite (Biondi et al., 2007).
FIGURE 1.3 [...] transitions. Time (horizontal lines) is expressed as millions of years before present, counted from the big bang (estimated at ~13,800 million years ago). The time at which plants colonized land (green cone) and major mass extinctions (red triangles) are numbered: 1, end Cretaceous extinction; 2, late Triassic extinction; 3, end Permian extinction; 4, late Devonian extinction; 5, end Ordovician extinction. A proposed sixth extinction presently underway is not indicated. Illustration by C. Perales and E. Domingo from information retrieved from references given in the text.
Minimum requirements for an RNA world would be the presence of ribozymes and mechanisms for the intake of energy-rich molecules (Orgel, 2002, 2004). The inherently low copying fidelity of the putative ribozyme polymerases, estimated at 10⁻² to 10⁻⁴ errors per nucleotide copied (Wochner et al., 2011), should have ensured genetic variation for selection to act upon variant RNA molecules. Chirality (the existence of two mirror images or enantiomers of a molecule) poses a challenge for the chemical origin of biological molecules (Caglioti et al., 2011; Ruiz-Mirazo et al., 2014; Sczepanski and Joyce, 2014). Present-day biological systems use only D-ribose (D from "dextro" or rotation of the plane of polarized light to the right) while chemical condensation reactions produce equal amounts of the D- and L- (levo) forms. Nonenzymatic template-dependent reactions can be inhibited by the incorrect enantiomer. This led to the proposal that analogs devoid of enantiomeric forms, such as glycerol derivatives, could have been the basis of the most primitive genetic systems (Schwartz and Orgel, 1985). Theoretical studies support the notion that initial achiral conditions can evolve toward chirality, in what has been defined as an extension of punctuated equilibrium to prebiological evolution (Gleiser et al., 2008). Probably, very little, if anything, remains in our present-day biosphere of a primitive RNA world, let alone traces of the prolonged process that went from chemistry to the first replicating organizations, so profound have been the changes experienced by the Earth and its surroundings for over four billion years (Cantine and Fournier, 2018). Contemporary catalytic RNAs (found in the ribosome, as part of some protein complexes, and in some viroids), as well as nucleotide-like coenzymes, have been regarded as possible molecular remnants of a primitive RNA world (Lazcano, 1994a).
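To give a feeling for what the copying-fidelity estimates quoted above imply, the following minimal sketch converts an error rate per nucleotide into the expected number of errors per copy and the fraction of error-free copies. It is only an illustration under simplifying assumptions that are not taken from the text: errors are treated as independent and uniform along the template, and the template lengths (50, 200, and 1000 nucleotides) and function names are hypothetical choices for the example.

```python
# Illustrative arithmetic only. Assumes independent, uniformly distributed
# copying errors; template lengths below are hypothetical examples.

def expected_errors(mu: float, length: int) -> float:
    """Mean number of errors in one copy of a template of `length` nucleotides."""
    return mu * length


def error_free_fraction(mu: float, length: int) -> float:
    """Probability that a copy of `length` nucleotides contains no error."""
    return (1.0 - mu) ** length


if __name__ == "__main__":
    # Error rates spanning the range quoted for ribozyme polymerases.
    for mu in (1e-2, 1e-3, 1e-4):
        for length in (50, 200, 1000):
            print(
                f"mu={mu:.0e}  L={length:5d}  "
                f"errors/copy={expected_errors(mu, length):6.2f}  "
                f"error-free={error_free_fraction(mu, length):.3f}"
            )
```

Under these assumptions, a 200-nucleotide template copied at 10⁻² errors per nucleotide carries about two errors per copy on average, so most copies differ from their template; this is the sense in which low fidelity supplies variation for selection to act upon.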
Some authors consider the possibility that cells whose genetic material is made of RNA (Forterre, 2005) may still hide in some remote sites of our planet (Yarus, 2010). For a transition from a purely RNA (or RNA-like) world to an extended scenario with the participation of proteins, the presence of a transfer RNA (tRNA) quasispecies and the generation of a genetic code endowed with evolutionary potential must have been critical (Eigen, 1992; Koonin and Novozhilov, 2017). Initial theories of how the genetic code might have arisen were put forward by F. Crick, L. Orgel, and C. Woese in the middle of the twentieth century. The main proposals included a stereochemical fit between some amino acids and the corresponding bases (or codons to be), progressive evolution from a one-nucleotide to a three-nucleotide code, gradual incorporation of amino acids into the coding system, and the frozen accident model of codon universality (for a review of early concepts, see Crick, 1968). New insights into the origin of the code have come from integrating the knowledge of the mechanisms of protein synthesis with likely events in the RNA world. Primitive tRNA quasispecies (existing around 4000 million years ago, late Hadean, early Archean, Fig. 1.3) and tRNA aminoacylating ribozymes should have evolved to fit the genetic code, which was later expanded in coevolution with the translation machinery (Szathmary, 1999; Rodin et al., 2011; Caetano-Anolles et al., 2013). The age of the genetic code has been estimated at 3.8 ± 0.5 thousand million years (Eigen, 2013), and the present code structure is remarkably redundant. The effects of redundancy are to minimize the deleterious effects of mutations, and to contribute to flexibility of functional involvement of mRNA secondary structure (Shabalina et al., 2006; Koonin and Novozhilov, 2017; see also Chapter 2).
The advent of DNA as an informational macromolecule that was physically more stable than RNA opened the way to an increase of complexity (in the sense of the amount of genetic information) of the genetic material. DNA allowed integration of modules to form the first "chromosomes" and transcriptional regulation prior to protein expression. As in current evolutionary virology, perhaps the most challenging problem in understanding early life is to identify the selective constraints that influenced the course of events. In contrast to the present-day environmental changes confronted by viruses (multiple and complex, but amenable to experimentation; Chapters 4 and 6), the conditions that permitted primitive genetic entities to acquire expanded coding and signaling capacities defy our imagination. Some aspects are considered next. The most salient attributes of living matter are reproduction, evolvability, energy conversion (metabolism), and compartmentalization. What selective forces might have led to the integration of these features? Concerning reproduction, there must have been a critical transition from the absence of any inheritable instruction (despite the presence of primitive polymers) to the first molecules endowed with "inheritable" information, for example, a macromolecule capable of self-copying. Such a molecule should have had an immense selective advantage over surrounding macromolecular peers devoid of the capacity for reproduction. The transition from "no inheritable information" to "inheritable information" is essential for the origin of life. Current evidence suggests that the process that allowed such critical transformation was slow and inaccurate. Slow because the catalytic RNAs selected in the laboratory are about 10 million-fold slower than most protein enzymes (Jeffares et al., 1998; Yarus, 2010).
Inaccurate because preenzymatic nucleotide polymerization would rarely display error rates below 10⁻¹ to 10⁻² mutations per nucleotide copied (Inoue and Orgel, 1983). No "predators" such as degrading enzymes were present to impede a slow accumulation of variant replicating molecules. The slow sequence exploration might have taken place in microenvironments shielded from damaging radiation until points in sequence space compatible with self-organization, replication, and adaptation were encountered. The adaptive potential of mutant distributions of polynucleotides is a key concept in the quasispecies theory of the origin of life (Eigen and Schuster, 1979), and a signature of present-day viruses (Chapter 3). Replicative inaccuracy and heterogeneity appear as recurrent requirements for the major transitions and adaptability of the forms generated in the course of prebiological and biological evolution. Once a primitive molecular "memory" was implemented, in the words of M. Eigen: "...information generates itself in feedback loops via replication and selection, the objective being 'to be or not to be'" (Eigen, 1994, 2013). In those times, this was the only simple requirement: to be or not to be. This singular transition resulted in the first replicating entities that are also termed replicators or replicons. They were selected for replicability, stability, and evolvability, with trade-offs (acquisition of benefits for one of the three traits at some cost for another trait) likely playing a role at this stage (see Chapter 4 for trade-offs in virus evolution). Optimization of primitive replicons should have been facilitated by dual genotypic and phenotypic features in the same molecule (Eigen, 2013). That RNA itself is both genotype and phenotype is a feature of present-day RNA viruses.
That is, the genomic RNA determines phenotypic traits, independently of its protein-coding activity, for example, through its involvement in RNA-RNA and RNA-protein interactions. "Priming" of polynucleotide synthesis, in the sense we know it today, should not have been a limitation since circular RNA or RNA-like molecules could fold partially to prime their own copying. The term "replicon" currently refers to any genetic element that encodes sufficient information to be copied (i.e., viruses, plasmids, etc.), even if the copying is carried out by (or in conjunction with) elaborate cell-dependent machinery. Virtual replicons are used in computer simulations to learn about the dynamics of natural living systems (Adami, 1998; Eigen, 2013). The environment in which primitive replicons had to self-organize about 4000 million years ago was very different from the environment we have today on Earth. The sun was about 25%–30% less luminous than the present-day sun, yet it produced more ultraviolet (UV) light. Due to the absence of oxygen and of an ozone layer, the UV radiation that reached the Earth was 10- to 100-fold more intense than today, with the difference being accentuated for the radiation in the 200–280 nm range. Studies of the conversion of UV radiation into DNA-damage equivalents suggest a two- to three-logarithm larger biologically relevant UV radiation during the time of the putative RNA world as compared with today's radiation (Canuto et al., 1982; Chang, 1994; Horneck and Baumstark-Khan, 2002). In such an environment, radiation-related mutational input could have had drastic effects on replicating entities in ways that can be only roughly anticipated from the present-day chemistry. Even the simplest present-day RNA genetic systems, despite their small target size, would undergo severe radiation damage.
Reconstruction of protein enzyme-free nucleic acid synthesis under the radiation conditions prevalent on Earth during the RNA world development, during late Hadean and early Archean eras, offers a fascinating challenge and opportunity for experimental research in the rising field of Astrobiology. It has been considered that the time elapsed since the Earth attained a life-friendly environment until protocells arose (from about 4500 million to about 3500 million years ago, Fig. 1.3) was insufficient for life development. This led to the panspermia theory, which proposes that life has an extraterrestrial origin. Panspermia in different forms has been defended by noted scientists, such as S. Arrhenius in the early twentieth century and later by F. Crick and L. Orgel (discussed by de Duve, 2002). In addition to the time estimates for life generation being arbitrary, our present understanding of how error-prone replication can facilitate evolvability and exploration of novel biological functions (Chapters 2 and 3) converts a 1 million-year time period into a long time for life to originate and initiate multiple branches for its development. Energy conversion is an essential feature of life. The primitive precellular organizations might have obtained energy either from organic molecules captured from the external environment (heterotrophy) or from metabolites they synthesized endogenously using external energy (autotrophy). One line of thought considers that it is more likely that the first cells were heterotrophs, and that only later did they evolve toward autotrophy, in the form of photosynthesis, which represented a major transition in the repertoire of biosynthetic pathways (Nakashima et al., 2018). Fermentation reactions were likely the first ones exploited to break energy-rich bonds, as a source of energy for primordial biochemical reactions.
An alternative view is that the first cellular organism was an autotroph, in particular, a chemoautotroph (also termed lithotroph) that used inorganic compounds to obtain energy. One of these proposals is that the formation of pyrite from hydrogen sulfide was used by primitive cells as an energy source, resulting from reactions such as FeS + H₂S → FeS₂ + 2H⁺ + 2e⁻ (Wächtershäuser, 1994). Positive charges on pyrite crystals could accumulate negatively charged molecules (e.g., the products of CO₂ fixation) and undergo reductive reactions. The system might have selected surface-bound polymers rather than monomers, and given rise to biochirality (selection of one enantiomeric form over another) because of the chiral structure of pyrite (Wächtershäuser, 1988). Then, an evolution toward an Archean carbon-fixation cycle would occur, a precursor of the metabolic cycles found in present-day archaeal and bacterial organisms (Ruiz-Mirazo et al., 2014). The picture may have been more complex, as judged by the success of mixotrophic organisms which are capable of switching from phototrophy (light-mediated breakdown of CO₂ for their metabolism) to heterotrophy in some extreme environments of the present-day Earth (Laybourn-Parry and Pearce, 2007). The integration of early replication systems and metabolism was probably favored by some compartmentalization of replicative-metabolic units through lipid bilayers (Carrara et al., 2012; Stano et al., 2013; Ruiz-Mirazo et al., 2014; Hanczyc and Monnard, 2017). Here a second decisive positive selection event might have entered the scene at the stage of formation of proto-cells and the first individual cells (3600–3200 million years ago, Fig. 1.3; Eigen, 1992). Lipid bilayers endowed with a splitting capacity should have been strongly selected at this stage of life development because of their power to spread.
This underlines the concept that Darwinian selection need not be associated exclusively with template-copying processes. Membrane traffic and reorganizations are essential for the life cycle of many present-day viruses (Huotari and Helenius, 2011). Virus particle stability and capacity to spread, associated with membrane capture and interactions, might have derived from the early genomes that exploited membrane-based organelles to achieve functional diversification. Compartmentalization is considered one of the key developments to initiate a cell-based life (Morowitz, 1992; Hanczyc and Monnard, 2017). Despite obvious difficulties in reproducing physical and chemical processes that were likely involved in the origin of life, there is sufficient evidence to render probable that the most primitive organizations that we would now consider as "living" resulted from the assembly of simple organic compounds that attained a required level of complexity (Kauffman, 1993). The various facets that distinguish living from inanimate matter are recapitulated in the definitions of life that scientists from different backgrounds in physics, chemistry, or biology have proposed (Box 1.2). Some definitions underline entropy requirements, while others emphasize self-organization, evolvability, metabolism, or regulated complexity. Their disparity reflects the intricacies of a rather enigmatic unfolding of matter. A. Lazcano summarized the current situation in the arduous search of the origin of life: "There are still unsolved problems but they are not completely shrouded in mystery, and this is no minor scientific achievement. Why should we feel disappointed by our inability to even foresee the possible answers to these luring questions? As the Greek poet Konstantinos Kavafis once wrote, Odysseus should be grateful not because he was able to return home, but on what he learned on his way back to Ithaka.
It is the journey that matters" (Lazcano, 1994b; for a general overview of the history of life on Earth, centered in paleontology, see Cowen, 2005). After this brief survey of the origin of life, we can now examine theories on how, when, and why viruses arose and became active actors in our biosphere. Although not in a linear fashion, the number of nucleotides or base pairs in the genetic material (which presumably reflects the amount of genetic information relevant to confer phenotypic traits) increased as evolution led to differentiated organisms. The major theories of the origin of viruses are divided into two opposite categories: those that attribute virus origins to the early development of life, and those that propose that viruses arose when a cellular life was already in place (Fig. 1.4). These two broad views that we may term "viruses without cells" and "viruses from cells" are not irreconcilable, although reconstruction of ancestral developments is challenging. They can be divided into five main theories (not all independent or mutually exclusive), which are summarized next.

BOX 1.2
• Life is the property of a system that continuously draws negative entropy (maintains orderliness), and delays decay into thermodynamic equilibrium (Schrödinger, 1944).
• Life is an expected, collectively self-organized property of catalytic polymers (Kauffman, 1993).
• Life is a property of any population of entities possessing those properties that are needed if the population is to evolve by natural selection (Maynard Smith and Szathmáry, 1999). This answer is not a tautology, as it allows many attributes to be excluded from the definition of life (de Duve, 2002).
• Life is descent with modification. Replication is life's ultimate chemical and physical survival strategy (Yarus, 2010).
• Life is a self-sustained chemical system capable of Darwinian evolution (NASA Astrobiology Institute).
• Living organisms are metabolic-replicating systems composed of molecules and cells which are subject to spontaneous changes in structure and function due to mutations (Demetrius, 2013).
• Life in three statements: (1) Life is not represented by any fundamental physical structure. (2) Life is an overall organization that is governed by functional rather than by structural principles. (3) In order for life to come about, there must exist some physical principle that controls complexity (Eigen, 2013).
• Life, whatever else it may be, is certainly a regularity among material processes (Eigen, 2013).

• Viruses were involved in the origin of life. Some viruses are the descendants of primitive RNA or RNA-like replicons that preceded cellular forms. Because of their limited genetic complexity, RNA viruses and subviral RNA elements have been considered possible descendants of the primitive replicating entities that predated cell-based life forms (upper diagram in Fig. 1.4). As early as the beginning of the twentieth century, H.J. Muller, L.T. Troland, and J.B.S. Haldane suggested that viruses represented primordial life forms. Influenced by the discovery of bacteriophages by F. d'Herelle, J.B.S. Haldane proposed viruses as intermediates between the prebiotic soup and primitive cells (reviewed in Lazcano, 2010). In those times, knowledge of viruses was superficial from today's perspective, and lent itself to daring proposals coherent with viruses being perceived as simple. Although simplicity and replication are features that could also be attributed to primitive life forms, we have to distinguish the role that virus-like entities might have played in the establishment of early life from the possibility that present-day viruses reflect how early life might have been.
An origin independent of cells is suggested by the presence of a number of "virus hallmark proteins" in viruses, which do not have counterparts in present-day cells, at least in the sequences represented in data banks. A number of protein folds (such as the double jelly-roll fold) found in the nucleocytoplasmic large DNA viruses (and related structures) are also present in viruses that infect the archaeon Sulfolobus, and in some single-stranded and double-stranded RNA viruses that infect eukaryotic cells. These unique structures and other features of viruses, such as genome size and its composition, constitute a "virus innate self" (Krupovic and Bamford, 2007) that has been taken as an indication that viruses were constructed from modules that were different from those captured to build cells. The fact that some viral proteins do not have cellular homologs (or are only distantly related to cellular homologs) has favored the view that a virus world preceded the origin of cells (Forterre, 2005; Villarreal, 2005; Koonin et al., 2006; Koonin, 2009). In this line of thought, the present-day RNA viruses are considered remnants of an RNA world; no other biological group that has been identified uses RNA as genetic material. Likewise, transposable elements, repeat sequences of mobile elements (LTR-retrotransposons, LINEs, SINEs, ALUs) and of telomeres and centromeres are considered to be likely of a viral origin (reviewed in Witzany, 2012).
FIGURE 1.4 Two possible courses of events regarding when viruses first appeared and participated in the evolution of the biosphere. The scheme of time frames and major biological events (RNA world, first cells, and organisms) are those displayed in Fig. 1.3. According to the upper diagram, viruses (or previrus-like entities) arose together with the first (precellular) replicating entities. According to the second diagram, viruses (or previrus-like entities) arose when a cellular life had already been established.
Presence of virus is generically represented by the external, thick, black curves. The internal red, wavy lines represent generation, dominance, and extinction of multiple viral lineages whose numbers and true dynamics will remain unknown. Illustration by C. Perales and E. Domingo from information retrieved from the different models of virus origin and references included in the text.
Other present-day RNA genetic elements that include ribozymes as constituents [i.e., plant viroids, self-replicating RNAs of about 165–430 nucleotides in length (Hadidi et al., 2017) or the defective delta agent, also termed hepatitis delta virus (HDV)] might be a vestige of primitive genetic elements. HDV is dependent on hepatitis B virus (HBV) for the completion of its infectious cycle (Taylor and Pelchat, 2010). The HDV genome is a mosaic RNA consisting of a viroid-like RNA and an RNA region whose complementary RNA (antigenomic strand) encodes two forms of a structural protein termed the delta antigen. Both the genomic and antigenomic RNAs possess a strong secondary structure with about 70% paired nucleotides. The delta antigen is encapsidated by the HBV surface antigen as a component of HDV particles. Thus, HDV appears to be the result of an RNA conjunction between a viroid-like RNA and an mRNA-coding region. Such conjoined RNAs might have been the precursors of the modern eukaryotic genome partition into coding sequences (exons) and intervening sequences (introns) (Sharp, 1985; Robertson, 1992, 1996; Chao, 2007; Taylor and Pelchat, 2010). The structure of the HDV genome seems to echo processes that originated with primitive RNAs selected for their ability to replicate that incorporated a protein-coding moiety through recombination with other RNAs. Replication-competent and protein-coding chimeras might have signaled a relevant intermediate step toward more complex RNA and DNA genomes.
The advent of an enzyme that could copy RNA into DNA to carry out reverse transcription should have been instrumental in generating primitive DNA viruses and other DNA-based genomes of increased complexity (Lazcano et al., 1992). The discovery of the chimeric structure of the HDV genome illustrates how insights into the origin and early evolution of life can be gained from current genomics, despite lacking experimental approaches to recreate episodes that led to virus origins. Even if experiments could be designed, the time frame involved would occupy several generations of scientists, which is not feasible given the current research grant system.
• Viruses originated from regressive evolution of microbes with a cellular organization and became parasites of cells. This theory is quite the opposite of the previous one because it presupposes that a cellular world was the source of viruses (lower diagram in Fig. 1.4). One of its conceptual pillars is that the intricacies of virus-host relationships unveiled by molecular virology render unlikely that present-day RNA viruses are remnants of an ancestral RNA viral world. Also, the conditions prevailing in the RNA world did not necessitate that a primitive replicon display rapid replication (a trait of most present-day viruses) for it to become established, because of the scarcity of predators. The theory of a cellular origin of viruses was already put forward in the twentieth century when it became evident that complex DNA viruses encoded enzymes and immunomodulatory proteins that had cellular counterparts. The virus-generating cells could be autonomously functional from the onset (as autonomous as individual cells can be) or belong to a class of simple cells that parasitized functionally more advanced cells. One of the mechanisms by which viruses may have originated from cells is by regression or reduction.
It implies that the number of genes of the virus-precursor cells diminished gradually or in a step-wise fashion to a minimum number compatible with replication at the expense of other cells that supplied basic components for gene expression and energy capture. The giant DNA viruses (Nasir et al., 2012; Colson et al., 2017) have been regarded as a possible missing link between a primitive DNA cell and the DNA viruses, and some of them include part of the translation apparatus (Abrahao et al., 2018). The capacity to spread, so inherent to the concept of the virus, might have been first selected as a positive trait for cells, which then regressed toward a subcellular transmissible form. Prokaryotic cells can spread effectively among differentiated eukaryotic hosts, and tumor cells have been regarded as transmissible parasites (Banfield et al., 1965; Murgia et al., 2006; Pearse and Swift, 2006). The observation that some "infectious cells" can be disseminated by insects provides a model for an early origin of arthropod-borne viruses. It has been recognized that transmission of the virus from an infected cell into a recipient cell need not involve a prolonged stay of the virus in the extracellular environment. Infection "synapses" allow intimate cell-to-cell contacts through which virus transmission takes place. It is estimated that synapse-mediated transmission may be 100-fold more efficient than the transmission of viral particles released into the extracellular environment. This is one of several mechanisms of bloc (multiple particles) transmission, which has several implications for viral quasispecies dynamics (Chapter 3). Acquisition of a capacity for long-range transmission in space and time should have provided a sufficient selective advantage for a "cell-to-external environment-to-cell" transmission to coexist with "cell-to-cell" transmission in our biosphere.
The possibility that RNA viruses derive from some "organism" that used RNA as genetic material was suggested initially by D. Baltimore (1980). The evidence that RNA-dependent RNA synthesis is rare in cells suggests that either RNA viruses derived their replicases from a now probably extinct "RNA organism" or that the viral replicases evolved from cellular DNA polymerases (Baltimore, 1980; Forterre, 2005, 2006a; Yarus, 2010).
• Viruses originated from cellular DNA or RNA that evolved to embody autonomous replication, and an extracellular step in their replication cycle. An alternative to the regressive or reduction theory of virus origin from cells is the escape theory. It proposes that part of the genome complement of cells found a way toward autonomous replication in the sense that they could replicate on their own as parasites of cells. The capacity to replicate could come from the parental cell or be acquired externally. The kind of autonomy attained implied an intracellular phase of their multiplication cycle and an extracellular phase in the form of transmissible particles (virions). This new way of life should have been positively selected if an increased capacity of cell-to-cell transfer conferred an advantage to the cells regarding the acquisition of new traits for functional diversification while maintaining a capacity to sustain virus multiplication. Decades ago, the view that viruses originated from subcellular organelles was a favored one (see e.g., Joklik, 1974).
Despite discernible sequence identity between some viral and mitochondrial DNA sequences, no evidence of viruses having functions encoded in cellular organelles and not in chromosomal DNA has been obtained, perhaps reflecting an earlier relationship between viruses and primitive free-living cells rather than between viruses and modern (eukaryotic) cells (see also Section 1.7).
• Viruses are as ancient as cells, and coevolved with cells or even with precellular genomic organizations, with which they shared functional modules. Current genomics of viruses and their host organisms (Bushman, 2002; Mount, 2004; Hacker and Dobrindt, 2006) tends to favor a long history of coevolution between viruses and cells. The structure and functions of many viral proteins do not deviate in any salient way from cellular counterparts. This applies to proteins involved in genome replication, and in proteolytic processing of proteins and polyprotein precursors. In contrast, the "virus innate self" that groups viral proteins and protein folds not shared with cellular counterparts, and that provides a powerful argument for a virus origin independent from cells (Section 1.5.1), does not exclude a coevolutionary mechanism even if not all modules were common to cellular and viral entities over a period of several million years (see Section 1.7). Comparative genomics suggests that the exchange of functional and structural modules through lateral gene transfers, together with fine adjustments mediated by mutation, has contributed to the coadaptation of cells and autonomous replicons over ancient evolutionary periods (Gorbalenya, 1995; Holland and Domingo, 1998; Jalasvuori and Bamford, 2008; Villarreal, 2008).
Even mechanisms that prompt viral variation in cell tropism (Chapter 4) have parallels in differentiated organisms. As an example, a two-amino-acid insertion into ectodysplasin, a member of the tumor necrosis-binding family, alters its receptor specificity, and the differential expression of the two protein versions plays a role in epidermal morphogenesis (Yan et al., 2000). Furthermore, as the number of three-dimensional structures for viral and cellular enzymes has increased to reach thousands, structural similarities between key cellular and viral enzymes (polymerases, proteases) have become apparent. Baltimore (1980) proposed that a limited number of "archetypal" proteins could be responsible for RNA virus function. He named as "archetypal" a "positive virus polymerase," a "negative virus polymerase," a set of "surface" proteins, and several "proteases," among other proteins and regulatory elements. The argument that "archetypal" modules could spread among positive- and negative-strand RNA viruses was based on features and mechanisms now recognized as much more profuse than in 1980: the multifunctionality of viral proteins, their capacity to diversify by mutation, and the existence of RNA recombination (Chapter 2). Regulatory strategies were also likely shared by cells and viruses. Small microRNAs that now populate the cellular world can act as molecular switches for RNA viral genomes to modulate their replication and gene expression (van Rij and Andino, 2008; Diaz-Toledano et al., 2009). A likely course of events is that the long coevolution of protocells and primitive virus-like elements gradually shaped both cells, at the individual and organismal level, and the precursors of present-day viruses.

• "Protoviruses" might have originated in primitive vesicles. Early protocellular communities probably lacked a cell wall or other compartmentalization barriers, an absence that allowed fluid transfers of metabolites and genetic material (Woese, 2002).
The most primitive vesicles based on lipid structures inaugurated the distinction between "internal" and "external" milieu and might have evolved to contain self-replicating macromolecules (Jalasvuori and Bamford, 2008; Adamala and Szostak, 2013; Hanczyc and Monnard, 2017). Vesicles located in favorable microenvironments of a primitive Earth had the potential to exchange small molecules between the "inside" and "outside," and export materials toward other vesicles. Exchanges could assist in the coupling between genome replication and an incipient metabolism. Depending on their composition, lipid vesicles could form and remain stable at temperatures of about 100 °C. Their transfer to lower temperatures might have modulated their permeability prior to the stage at which peptide or protein transporters were inserted as membrane components. Here again, "heterogeneities" became important: membranes made of mixtures of amphiphiles display increased thermostability, permeability, and tolerance to divalent cations. Budding vesicles endowed with traits such as growth, division, and permeability should have been positively selected for their ability to spread favorable replicating molecules and mediators of protocellular functions. Selected protoviral vesicles became gradually dispensable for the spreading of beneficial genes, and viruses evolved from being solely beneficial entities into also displaying a parasitic behavior that exploited cell resources (compare with Sections 1.7 and 1.8). According to this model, when cells became independent units surrounded by lipid-based membranes, viruses promoted the selection of cells that expressed peptidoglycan molecules on their surface to decrease or prevent virus infection. This transition could mark the onset of an arms race behavior that has been associated with a survival strategy of many present-day viruses. A cell wall allowed vertical transmission of genetic information and rendered the cell metabolically independent.
In addition, it provided an osmotic environment suitable for energy production. Genetic information and molecular devices for energy capture, storage, and use were equally important for a sustainable cellular organization. Thus, according to this theory, viruses originated from protoviral elements whose main function was to spread useful genes horizontally. The evolved viruses acted as selective agents to promote microbial evolution and then became established in an increasingly differentiated cellular world (Villarreal, 2005, 2008; Hendrix, 2008; Jalasvuori and Bamford, 2008).

Geological studies indicate that mass extinctions of a brief duration (less than 100,000 years, a short geological time!) occurred at several points when multicellular organisms populated the Earth (some of them indicated in Fig. 1.3). The end-Permian, end-Triassic, and end-Cretaceous extinctions rank among the most drastic, resulting in profound environmental perturbation (Burgess et al., 2014 and references therein). As depicted in the form of red upward and downward curved lines in Fig. 1.4, perturbations associated with viral extinctions (due to massive host extinctions) and severe bottleneck events have periodically blurred traces of many viruses unknown to us. Mechanisms implied by each of the five theories summarized here might have participated in the origin of viruses as we know them today, and any model will remain tentative for several obvious reasons. Viruses have not left a fossil record amenable to analysis with current technology. Moreover, viral genomes can evolve at very high rates in response to environmental necessities (Chapter 7), and reconstructions of ancient Earth environments are necessarily imprecise. For these reasons, viruses, independently of when they became active actors in the biosphere (Fig. 1.4), are unlikely to have maintained molecular signatures that could shed light on their remote past.
The viruses that infect fungi, the mycoviruses, recapitulate key issues on virus origins and diversity (Son et al., 2015; Roossinck, 2019). Two main theories of their origin are that they evolved from plant viruses that invaded fungi or that they have coexisted and coevolved with their host fungi for a long time, in line with the long coevolution model of virus origins (Section 1.5.4). They are diverse regarding their genetic material, which can be single-stranded or double-stranded RNA or DNA. They can be asymptomatic, produce disease, or modify critical host traits, such as virulence, again reflecting the broad range of host interactions observed with other viral groups. The presence of a dsRNA mycovirus in a fungus is essential for the latter to confer heat tolerance to some plants; these findings have defined a three-way symbiosis required for an important phenotypic trait (Marquez et al., 2007; Roossinck, 2013) (see also Section 1.7.3).

An often asked question is whether viruses are alive or not. The answer is debated, as reflected in the various definitions of virus (Box 1.3). Two opposite views coexist. One of them considers viruses as macromolecular aggregates, that is, viruses are regarded as "chemicals." A second proposal is championed by the "ribovirocell" concept of P. Forterre, which implies that a viable, virus-infected cell contains two different organisms that coexist symbiotically: the cell that produces virus and the replicating virus itself. Another facet of the same duality is manifested when viruses are considered merely as perturbing chemicals with no participation in the tree of life versus viruses being actually important players in the tree of life. The dual character of viruses as "alive" during intracellular replication and "not alive" outside the cell was stressed in some early literature (e.g., Davis et al., 1968).
As discussed in Section 1.4, if life is best defined as a conglomerate of complementary features, viruses display two of these features: the capacity to replicate and to evolve. Thus, although it is debatable whether viruses qualify as "alive," they are (and most likely have historically been) an integral part of life and the construction of life. Viruses have influenced the tree of life as we know it, with lateral gene transfers (some mediated by viruses) being a key element in its architecture and in the differentiation of the domains of life (Ciccarelli et al., 2006; Weiss et al., 2018). Because of the difficulties of defining "life" unambiguously, the reader might have noticed that in previous sections, the question addressed has been "how is life" rather than "what is life." The question of "why" viruses have persisted in the biosphere is addressed next.

(1) possessing one type of nucleic acid, (2) multiplying in the form of their genetic material, (3) unable to grow and to undergo binary fission, and (4) devoid of a Lipmann system (Lwoff, 1957).
• Viruses are entities whose genomes are elements of nucleic acid that replicate inside living cells using the cellular synthetic machinery and causing the synthesis of specialized elements that can transfer the viral genome to other cells (Luria et al., 1978).
• Viruses are replicating microorganisms that are among the smallest of all life forms (first two editions of Fields Virology).
• Viruses are transmissible deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) genetic elements that require a cell for multiplication (Domingo and Perales, 2014).

Two general models have been proposed to explain the maintenance of viruses in the biological world:

• Viruses have persisted because they opportunistically parasitized any cellular niche that was compatible with their replication apparatus and that provided adequate resources.
In this view, viruses are "selfish" replicating elements that became successful when increasingly efficient polymerization activities became part of their life cycles. However, several observations indicate that what was once considered purely "selfish" (also referred to as "junk") cellular DNA may not be that useless after all. Given the intimate connections between cells and viruses, the view that viruses primarily exist because they are mere "selfish" entities seems unlikely. A different issue is that under some particular environments, a virus displays a "selfish" element-like behavior, as one of the outcomes of its having been positively selected. An example is the blind replication at the expense of elimination of hosts in disease processes, although this is necessarily a transient behavior.

• The presence of viruses was positively selected because they promoted cellular variation and functional diversification. This proposal relates to some of the theories of virus origins (Section 1.5) and of the early evolution of life that implied a role for virus-like entities. According to this view, viruses, together with other subcellular genetic elements (plasmids, retroelements, etc.), penetrated the genetic material of ancestral cell forms, acted as agents of lateral gene transfers, and modified the expression profiles of the recipient cells. Probably, there has been (and there still is) a constant flow of genes between cells and viruses and other mobile genetic elements. The abundance of endogenous retroviruses in mammalian genomes (García-Montojo et al., 2018) is a clear symptom of such a genetic flow. A nonfunctional viral infectivity factor (Vif) [the HIV-1 protein that can counteract the mutagenic activity of the apolipoprotein B mRNA editing complex (APOBEC) proteins; see Section 2.7 in Chapter 2] was found in the remnant of a rabbit endogenous retrovirus termed rabbit endogenous lentivirus type K (Katzourakis et al., 2007).
About 8% of the human genome is made of retroviral-like elements. Present-day human endogenous retroviruses probably contribute to the pluripotency of human cells (Santoni et al., 2012) and to genome regulation (Chuong, 2018). The fusion of cells from the placenta is mediated by syncytin, a protein encoded by HERV-W endogenous retroviruses (Chuong, 2018; Villarreal, 2005). In addition to the promotion of gene transfers to construct key cellular components, viruses probably acted as selective agents for cells to evolve defense mechanisms against viruses, and this may have originated new cellular functions. Also, viruses could favor the survival of some cell types over others, based on differential cell susceptibility to virus infection, thus contributing to cellular diversification. The need to escape viral infection may have furnished novel cell surface receptor proteins through a selection of cellular escape mutants (Buckling and Rainey, 2002; Saren et al., 2005). Some experimental systems consisting of persistently infected cells in which the cells and the resident virus coevolve (Chapter 6) illustrate how viruses could act as selective agents to promote cellular variation. Such variation would not necessarily involve exchanges of genetic material between the virus and the cells, provided sufficient genetic variation of cells took place. Multicellular organisms devoid of viral entities should have endured a long-term disadvantage relative to an alternative scenario with the coexistence of cells and viruses. Selection by viruses need not be restricted to cells, and it can be extended to entire host organisms and their populations. An abundance of hosts may promote viral epidemiological fitness, which is a factor in viral disease emergence (Chapters 5 and 7).
Host subpopulations may be selected by their resistance to epidemic outbreaks by highly pathogenic viruses, such as in the 1918 influenza pandemic or currently with the AIDS or Zika pandemics, and Ebola outbreaks in some parts of Africa. Traditionally, plagues decimated the human population and acted as selective agents for differential survival of individuals. Selfish-opportunistic and selected-functional are not incompatible models of virus maintenance. Once the instruction to replicate had been positively selected, selfish elements could ensue. As we learn about viral and cellular genomics, the current promiscuity and diversity of viruses (Section 1.3) appear as complementary agencies to promote general biological evolution following Darwinian mechanisms. Viruses might have contributed to the DNA replication machinery of cells, to the formation of the eukaryotic cell nucleus, and to a number of developmental processes (Baranowski et al., 2001; Bushman, 2002; Bacarese-Hamilton et al., 2004; Mallet et al., 2004; Villarreal, 2005, 2008; Forterre, 2006a). Cells are a necessity for viruses, and viruses are promoters of cell diversity and, as a consequence, of cellular differentiation (compartmentalization and functional specialization). Present-day viruses reveal several mechanisms of exchange of genetic material that might have roots in early cellular evolution. Temperate bacteriophages (the prototypic example being E. coli phage λ) integrate their genomic DNA in the DNA of their host bacteria. The uptake of cellular genes by viruses has been amply documented in transducing bacteriophages (those that can transfer DNA from one bacterium to another), as well as in RNA and DNA tumor viruses. Even RNA viruses that are not known to include a reverse transcription step in their replication cycle can incorporate host RNA sequences.
Replication-competent, cytopathic variants of bovine viral diarrhea virus (a type species of the genus Pestivirus of the important family of pathogens Flaviviridae) can acquire cellular mRNA sequences in their genome via nonhomologous recombination (Meyers et al., 1989). Insertion of 28S ribosomal RNA sequences into the hemagglutinin gene of influenza virus increased its pathogenicity (Khatchikian et al., 1989). Some defective-interfering particles of Sindbis virus included cellular tRNA sequences at their 5′-ends (Monroe and Schlesinger, 1983). Sequences related to some flaviviruses can persist in an integrated form in the DNA of the insect vectors Aedes albopictus and Aedes aegypti (Crochu et al., 2004). Endogenous hepatitis B viruses (eHBVs) have been identified in the genomes of birds and land vertebrates (amniotes), crocodilians, snakes, and turtles. The evidence is that eHBVs are more than 207 million years old and that ancient HBV-like viruses infected animals during the Mesozoic Era (Suh et al., 2014; Fig. 1.4). The existence of alternative mechanisms for the integration of viral genetic material into cellular DNA suggests an ancient origin and a selective advantage of exchanges of genetic information in shaping a diverse and adaptable cellular world (Eigen, 1992, 2013; Gibbs et al., 1995; Villarreal, 2005, 2008).

Symbiosis is an important determinant of coevolutionary interactions between hosts and their parasites (Vorburger and Perlman, 2018). Frequent symbiotic and mutualistic interactions have been established with viruses. Human endogenous retroviruses can protect human tissues and the developing fetus against infection by some exogenous retroviruses (Ryan, 2004). Some bacteria require bacteriophage to express virulence determinants (Tinsley et al., 2006). Symbiosis can be established between bacteriophages and animals (Barr et al., 2013).
Several plant RNA viruses delay the symptoms of abiotic stress, such as those produced by drought and frost (dehydration, osmotic stress, and oxidative stress). Protection is mediated by increased levels of osmoprotectants and antioxidants in the infected plants (Xu et al., 2008). Symbiotic relationships represent a state of local equilibrium between viruses and hosts, triggered by compatibility and occasionally by mutual benefits. An arms race implied by the virus-host interactions described in previous sections might have been the modus vivendi for viruses, only interrupted by occasional armistices. Alternatively, symbiotic and mutualistic interactions might have been the norm, only interrupted by occasional defections by killer personalities that have become the key actors of hospital wards and virology textbooks (Roossinck, 2011; Li and Delwart, 2011). As discussed in connection with natural counterparts of the transition toward error catastrophe in viruses (Chapter 9), cellular editing activities such as those displayed by some of the adenosine deaminases acting on double-stranded RNA and APOBEC proteins are also part of the innate immune response against some viruses. In turn, viruses have evolved multiple functions to counteract the host immune response (Chapter 4). The recruiting of cellular functions to confront viruses attests to transient losses of an equilibrated coexistence. Some middle- and long-term equilibrium between virus and host population numbers must be continuously restored by selective events; otherwise, this book would not have been written. The evolutionary origin of defense mechanisms against viruses can be regarded as a response to an excessive number of virus-cell interactions. Superinfection exclusion is one of the mechanisms used by present-day cells to limit replication of a virus when another one is actively replicating in (or is incorporated into) the same cell.
Exclusion has a biochemical interpretation in the competition of two viral entities for cellular resources, but it might have been boosted by early cellular adaptation to limit viral invasions (Chapter 4). Likewise, components of the intrinsic and innate immune response that prevent infection and disease might have also been endowed with activities that promoted cell variation for adaptability.

The two opposite views of the activity of viruses in the biosphere (i.e., opportunistic occupation of any suitable cellular niche or intimate cooperative coevolution with host cells) would be expected to produce a different proportion of pathogenic viruses. Opportunistic invasions should lead mainly to disease-prone viruses, while long coevolutionary periods should lead to a dominance of nonpathogenic viruses. Not all viruses that have been characterized are pathogenic, and in fact, only a minority of those that exist might be. However, since only a limited number of the viral genomes predicted by metagenomic surveys have been characterized, it is not possible to venture an estimate of the proportion of beneficial or neutral versus harmful viruses. In the course of investigations on poliomyelitis, a search for related viruses was undertaken, and a number of new viruses, later to be known as echoviruses, were discovered. They were isolated because they caused cytopathology in cells in culture. The virus-containing samples were from individuals who did not show symptoms of a viral infection. The new isolates were designated as "orphans," meaning viruses without a disease. The term echovirus derives from "enteric cytopathogenic human orphan" virus. Some of them, or their close relatives, were later associated with disease syndromes, but others were not.
Viruses as diverse as circoviruses, polyomaviruses, or herpesviruses colonize a considerable proportion of animals, and only some of the virus types are the direct cause of disease [e.g., postweaning multisystemic wasting syndrome (PMWS) by porcine circovirus type 2, or cancer by some polyomaviruses, among many other examples]. Disease potential is unrelated to viral genome size. PMWS is associated with the smallest mammalian DNA virus genome, of only 1.7 kb, while the almost 100-fold larger herpesvirus genomes can coexist with immunocompetent humans without noticeable disease. The pathogenic character of a virus depends on the intricacies of virus-host interactions that are poorly understood. The more we learn about virus-host interactions, the fuzzier the border between a virus being pathogenic or nonpathogenic becomes. Viruses may not damage essential cell functions but may affect dispensable cellular functions. Studies by M.B.A. Oldstone and his colleagues on persistent infections of lymphocytic choriomeningitis virus in neuroblastoma cells demonstrated that the resident virus altered the expression of differentiated cell traits while preserving vital functions (Oldstone et al., 1977). This is one of many examples of virus-induced modifications of the so-called luxury functions of cells (Oldstone, 1984). Disease manifestations depend on multiple host and viral factors (allelic forms of several genes, microbiome composition, immune status, coinfections with viruses or other agents, and so on). The dual infections with HIV-1 and hepatitis C virus (HCV) at the end of the 20th century accelerated the evolution of HCV-associated liver disease. In the following chapters, molecular mechanisms of virus variation are reviewed in connection with virus survival and host interactions.

Fig. 1.3) . In each box, the arrow indicates a succession of major developments, probably with overlapping phases.
An evolving virosphere is portrayed as having influenced the establishment of the major cellular domains of life. Two of several proposed cell differentiation routes into life domains are drawn. Lateral gene transfers (LGT) are depicted as horizontal discontinuous lines, assumed to be more abundant during precellular and early cellular evolution. Illustration taken from Domingo, E., Perales, C., 2019. Virus evolution. In: Microbial Ecology and Evolution. Encyclopedia of Microbiology, fourth ed. Elsevier, Amsterdam, with permission.

A reasonable hypothesis is that mechanisms similar to those observed in the present-day biosphere and virosphere were also in operation during the multiple transitions from a primitive replicon-based biosphere to a cellular world differentiated into different domains of life (Fig. 1.5). Although this point will again be examined as part of an overview in the closing chapter of the book (Chapter 10), some of the observations on current biosphere dynamics that bear on the difficulties encountered in reconstructing the sequential transitions toward cellular differentiation are outlined here. The following circumstances could have blurred the relationships among evolving previral and precellular entities and among viral genomes under construction, from the perspective of current genomics: (i) Episodes of evolution at rates higher than average might historically have accelerated evolution and extinction of viral and cellular genetic elements. (ii) Experimental evolution has documented that viruses often have available multiple mutational pathways to gain fitness in changing environments. This could well apply to other genome assemblies at different stages of complexity. The genome assemblies and linkages we observe are one set among many other possible ones that could have existed for prolonged time periods but that are now extinct.
A related aspect is that the acquisition of genomic modules from other cells or viral elements could render prior acquisitions obsolete or even detrimental. In consequence, an observed absence of shared genomic regions among different viral or cellular lineages from different life domains does not exclude that shared domains had been present transiently (transiently meaning millions of years!), leaving no trace in present-day viruses or cells. (iii) Bottleneck events could have erased biological organizations that had a degree of adaptation similar to that of others that survived. (iv) Fitness variation associated with genomic changes (due to mutation, recombination, or lateral gene transfers) is environment-dependent, and the changes in the environment over millions of years might have been dramatic. Such changes might have affected a limited number of local environments or most environments globally. Furthermore, most of these potentially perturbing events might have taken place many times in unpredictable successions, affecting local and few or general and many environments, thus rendering reconstruction of past events from current genomics problematic (Domingo and Perales, 2019). Again, these contingencies will be more apparent when different facets of viral dynamics are dissected in the following chapters.

Contrary to the usual practice in experimental virology, the contents of this chapter have forced a considerable degree of uncertainty and, at times, speculative argumentation. It was, however, a necessary exercise, since at least part of what we see today in viruses must have roots in the origin of life and the role of viruses in life development during epochs that humans have not witnessed. We are left with trying to reconstruct. This chapter is excitingly as close to physics and chemistry as it is to biology. Many of the questions addressed are open to debate, and they will probably remain open for a long time.
Interestingly, the terms "heterogeneity" and "complexity" have appeared several times in this chapter, anticipating features of present-day viruses discussed in the upcoming chapters. Complex populations, be they of viruses, cells, protocells, lipid vesicles, primitive replicons, or peptide soups, are the raw materials on which natural selection can act. Primitiveness does not imply simplicity. Complexity was likely a constant trait for early and modern life, both for the extinct unknown viruses and for those we strive to understand and control (see Summary Box).

• Life is characterized by four integrated features: replication, evolvability, metabolism, and compartmentalization. It is debated whether replication or metabolism was the dominant triggering factor.
• At least two ancestral positive selection events might have contributed to the development of life: the selection of replicating over nonreplicating polymers, and the selection of splitting-prone versus nonsplitting-prone membrane vesicles.
• Present-day viruses may be descendants from the primitive replicons that participated in early life prior to cellular organizations, or from structured cells by escape or reduction.
• Key features of many present-day viruses may be an inheritance of primitive replicons, notably error-prone replication, spread through membrane structures, and a tendency to engage in genome integration and transfers.
• The geological record indicates multiple, brief, and drastic mass extinction events in Earth's history. Such events anticipate mass extinction of the resident viruses. A dynamics of emergences, reemergences, and extinctions might have operated historically at a grand scale, with mechanisms similar to those we observe with present-day viruses. Present virosphere dynamics suggests difficulties in reconstructing events that spanned millions of years.
Tailed giant Tupanvirus possesses the most complete translational apparatus of the known virosphere Structure unifies the viral universe Nonenzymatic template-directed RNA synthesis inside model protocells Introduction to Artificial Life A strategy to estimate unknown viral diversity in mammals. mBio 4, e00598 Perceptions of science. Prebiotic soup: revisiting the Miller experiment Evolution of RNA viruses Mosquito transmission of a reticulum cell sarcoma of hamsters Evolution of cell recognition by viruses Bacteriophage adhering to mucus provide a non-host-derived immunity The Physical Basis of Life. Routledge and Kegan Paul Biological diversity and public health Tryckte hos Carl Del en Catalytic activity of hammerhead ribozymes in a clay mineral environment: implications for the RNA world Global distribution of nearly identical phage-encoded DNA sequences Antagonistic coevolution between a bacterium and a bacteriophage High-precision timeline for Earth's most severe extinction Lateral DNA Transfer. Mechanisms and Consequences On dating stages in prebiotic chemical evolution Structural phylogenomics retrodicts the origin of the genetic code and uncovers the evolutionary impact of protein flexibility First molecules, biological chirality, origin(s) of life Environmental adaptation from the origin of life to the last universal common ancestor UV radiation from the young Sun and oxygen and ozone levels in the prebiological palaeoatmosphere Giant vesicles "colonies": a model for primitive cell communities. Chembiochem 13, 1497-1502.
Cast on Chemical synthesis of poliovirus cDNA: generation of infectious virus in the absence of natural template The planetary setting of prebiotic evolution RNA recombination in hepatitis delta virus: implications regarding the abilities of mammalian RNA polymerases The placenta goes viral: retroviruses control gene expression in pregnancy Toward automatic reconstruction of a highly resolved tree of life Mimivirus: leading the way in the Discovery of giant viruses of amoebae The DNA Provirus. Howard Temin's Scientific Legacy History of Life The origin of the genetic code Sequences of flavivirus-related RNA viruses persist in DNA form integrated in the genome of Aedes spp. mosquitoes Pathways for the formation and evolution of peptides in prebiotic environments Microbiology. Hoeber Medical Division Life Evolving. Molecules, Mind and Meaning Boltzmann, Darwin and directionality theory Prebiotically plausible mechanisms increase compositional diversity of nucleic acid sequences In vitro characterization of a miR-122-sensitive double-helical switch element in the 5' region of hepatitis C virus RNA Atmospheric aerosols as prebiotic chemical reactors Encyclopedia of Life Sciences Virus evolution The roads to and from the RNA world The evolution of organic matter in space Steps Towards Life On the origin of biological information Error catastrophe and antiviral strategy The Hypercycle. 
A Principle of Natural Self-Organization Viral discovery as a tool for pandemic preparedness The forgotten dispute: AI Oparin and HJ Muller on the origin of life Synthesis of long prebiotic oligomers on mineral surfaces The two ages of the RNA world, and the transition to the DNA world: a story of viruses and cells The origin of viruses and their possible roles in major evolutionary transitions Three RNA cells for ribosomal lineages and three DNA viruses to replicate their genomes: a hypothesis for the origin of cellular domain Molecular Evolution and the Origins of Life Human endogenous retroviruses-K (HML-2): a comprehensive review Sources of geochemical evolution of RNA precursor molecules: the role of phosphate Molecular Basis of Virus Evolution The RNA world Origin of RNA viral genomes; approaching the problem by comparative sequence analysis In situ observation of peptide bond formation at the water-air interface Pathogenomics: Genome Analysis of Pathogenic Microbes 2017. Viroids and Satellites Primordial membranes: more than simple container boundaries Evolution of dsDNA tailed phages Transitions in understanding of RNA viruses: an historical perspective Origin and evolution of viruses Astrobiology, the Quest for the Conditions of Life Endosome maturation A nonenzymatic RNA polymerase model Structural co-evolution of viruses and cells in the primordial world Relics from the RNA world Evolution in viruses Progress in studies on the RNA world Directed evolution of nucleic acid enzymes Discovery and analysis of the first endogenous lentivirus The Origins of Order.
Self-Organization and Selection in Evolution Increased viral pathogenicity after insertion of a 28S ribosomal RNA sequence into the haemagglutinin gene of an influenza virus The ancient Virus World and evolution of cells On the origin of cells and viruses: primordial virus world scenario A virocentric perspective on the evolution of life Origin and evolution of the universal genetic code Putative prophages related to lytic tailless marine dsDNA phage PM2 are widespread in the genomes of aquatic bacteria Introduction to virus origins and their role in biological evolution The virophage as a unique parasite of the giant mimivirus Systems protobiology: origin of life in lipid catalytic networks The biodiversity and ecology of Antarctic lakes: models for evolution The RNA world, its predecessors, and its descendants The transition form nonliving to living On the early emergence of reverse transcription: theoretical basis and experimental evidence High diversity of the viral community from an Antarctic lake General Virology The concept of virus The endogenous retroviral locus ERVWE1 is a bona fide gene involved in hominoid placental physiology The Origins of Life. From the Birth of Life to the Origins of Language Ubiquitin in a togavirus A production of amino acids under possible primitive Earth conditions The Origins of Life on Earth Prebiotic synthesis from CO atmospheres: implications for the origins of life RNAs from two independently isolated defective interfering particles of Sindbis virus contain a cellular tRNA sequence at their 5 0 ends Beginnings of Cellular Life: Metabolism Recapitulates Biogenesis Bioinformatics Sequence and Genome Analysis Geochemistry and the origin of life: from extraterrestrial processes, chemical evolution on Earth Giant viruses coexisted with the cellular ancestors and represent a distinct supergroup along with superkingdoms Archaea. 
Bacteria and Eukarya Virus can alter cell function without causing cell pathology: disordered function leads to imbalance of homeostasis and disease Alterations of acetylcholine enzymes in neuroblastoma cells persistently infected with lymphocytic choriomeningitis virus Origin of Life Is cyanoacetylene prebiotic? Prebiotic chemistry and the origin of the RNA world Prebiotic synthesis of methionine and other sulfur-containing organic compounds on the primitive Earth: a contemporary reassessment based on an unpublished 1958 Stanley Miller experiment Memoire sur les corpuscules organiz es qui existent dans l'atmosphere. Examen de la doctrine des g enerations spontan es Allograft theory: transmission of devil facial-tumour disease Effects of salinity on the adsorption of nucleotides onto phyllosilicates Automatic comparison and classification of protein structures Prebiotic synthesis of simple sugars by photoredox systems chemistry Replication and evolution of viroid-like pathogens How did replicating and coding RNAs first get together? 
The origins of the RNA world Genetic relatedness of hepatitis A virus strains recovered from different geographical regions On origin of genetic code and tRNA before translation Molecular Evolution: Prebiological and Biological The good viruses: viral mutualistic symbioses Plant virus ecology Evolutionary and ecological links between plant and fungal viruses Prebiotic systems chemistry: new perspectives for the origins of life Human endogenous retroviruses in health and disease: a symbiotic perspective HERV-H RNA is abundant in human embryonic stem cells and a precise marker for pluripotency A snapshot of viral evolution from genome analysis of the tectiviridae family Template-directed synthesis of novel, nucleic acid-like structures A cross-chiral RNA polymerase ribozyme Genetic variation in natural populations A periodic pattern of mRNA secondary structure created by the genetic code On the origin of RNA splicing and introns Five questions about mycoviruses A remarkable self-organization process as the origin of primitive functional cells Early mesozoic coexistence of amniotes and hepadnaviridae Marine virusesdmajor players in the global ecosystem The origin of the genetic code: amino acids as cofactors in an RNA world An optimal degree of physical and chemical heterogeneity for the origin of life? Origin of hepatitis delta virus Bacteriophages and pathogenicity: more than just providing a toxin? 
Microb Onset of natural selection in populations of autocatalytic heteropolymers Earth's early atmosphere: constraints and opportunities for early evolution The oxidation state of Hadean magmas and implications for early Earth's atmosphere The complex interactions of viruses and the RNAi machinery: a driving force in viral evolution Introduction to virus origins and their role in biological evolution Viruses and the Evolution of Life The widespread evolutionary significance of viruses The role of defensive symbionts in host-parasite coevolution Vitalists and virulists: a theory of self-expanding reproduction The last universal common ancestor between ancient Earth chemistry and the onset of genetics Metagenomic characterization of airborne viral DNA diversity in the near-surface atmosphere From molecular entities to competent agents: viral-infection-derived consortia act as natural genetic engineers Ribozyme-catalyzed transcription of an active ribozyme On the evolution of cells Virus infection improves drought tolerance Two-amino acid molecular switch in an epithelial morphogen that regulates binding to two distinct receptors Life from an RNA world. The Ancestor Within