key: cord-310051-bl8l4bgo authors: Leitner, Thomas; Kumar, Sudhir title: Where did SARS-CoV-2 come from? date: 2020-07-06 journal: Mol Biol Evol DOI: 10.1093/molbev/msaa162 sha: doc_id: 310051 cord_uid: bl8l4bgo

Identifying the origin of SARS-CoV-2, the etiological agent of the current COVID-19 pandemic, may help us to avoid future epidemics of coronavirus and other zoonoses. Several theories about the zoonotic origin of SARS-CoV-2 have recently been proposed. While betacoronaviruses found in Rhinolophus bats from China have been broadly implicated, their genetic dissimilarity to SARS-CoV-2 is so high that they are highly unlikely to be its direct ancestors. Thus, an intermediary host is suspected to link bat to human coronaviruses. Based on genomic CpG dinucleotide patterns in different coronaviruses from different hosts, it was suggested that SARS-CoV-2 might have evolved in a canid gastro-intestinal tract prior to transmission to humans. However, similar CpG patterns are now reported in coronaviruses from other hosts, including bats themselves and pangolins. Therefore, reduced genomic CpG alone is not a highly predictive biomarker, suggesting a need for additional biomarkers to reveal intermediate hosts or tissues. The hunt for the zoonotic origin of SARS-CoV-2 continues.

About seven months after a new coronavirus started to spread among humans in Wuhan, China, almost 10 million confirmed cases and half a million deaths have occurred worldwide. The new coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is the second zoonosis since a similar Chinese bat coronavirus, SARS-CoV-1, caused an epidemic of severe human respiratory disease 17 years ago (Drosten et al. 2003). Several other coronaviruses infect humans, including the Middle East respiratory syndrome (MERS)-CoV that came from camels and also caused a severe human disease (Dudas et al. 2018), and four coronaviruses that typically cause only mild disease (Corman et al. 2018).
Also, many SARS-related coronaviruses have been identified in bats (Li et al. 2005; Yang et al. 2013; Hu et al. 2017), with the potential to infect humans (Ge et al. 2013). Understanding the zoonotic origins of the coronavirus and other viruses is critical because such knowledge can be used to prevent future zoonotic outbreaks.

While it is possible that a bat coronavirus jumped directly to a human, the closest known bat virus, RaTG13 found in a Rhinolophus affinis bat, shows 96% genomic similarity to SARS-CoV-2. Moreover, RaTG13 differs substantially in the receptor-binding domain (RBD) of the spike protein, suggesting that it may not bind efficiently to the human receptor ACE2 (Andersen et al. 2020). Other bat coronaviruses found in Rhinolophus pusillus are less similar, at about 88% genomic similarity (Li et al. 2005). Hence, it seems likely that either there are other, closer, coronaviruses in bats as yet unsampled, or another host species has acted as an intermediary between bats and humans. In either case, because SARS-CoV-2 spreads so readily among humans, it likely acquired the necessary mutations in the RBD to make it transmissible between humans before its zoonotic transfer.

Several candidates for the intermediary host have been proposed. Early circumstantial evidence pointed to snakes sold at the Wuhan market where the SARS-CoV-2 outbreak started, as the codon usage of SARS-CoV-2 was similar to that observed in snakes. However, no coronavirus has been found in snakes. Turtles were subsequently proposed based on predicted spike RBD and ACE2 interactions. Both snakes and turtles were later rejected as candidate intermediate hosts as stronger spike RBD-ACE2 interactions were predicted in ruminants and rodents (Luan et al. 2020). Pangolins were implicated based on the identification of several SARS-CoV-2-related viruses, including one with a similar RBD to SARS-CoV-2 (Lam et al. 2020), and 90-100% amino acid identity to different SARS-CoV-2 proteins (Xiao et al. 2020).
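To make the similarity figures above concrete (96% for RaTG13, about 88% for other bat coronaviruses), the following is a minimal sketch of how percent identity can be computed from two pre-aligned sequences. The function name `percent_identity`, the choice to skip gap columns, and the toy sequences are assumptions for illustration only; the cited studies may have used different alignment tools and definitions of similarity.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a nucleotide.
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Hypothetical toy alignment, NOT real viral sequence data.
seq_1 = "ACGTACGTAC"
seq_2 = "ACGTACGTAT"
print(percent_identity(seq_1, seq_2))
```

On a genome of roughly 30,000 nucleotides, 96% identity still leaves on the order of 1,200 differing sites, which is why a virus this similar is not necessarily a direct ancestor.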
Recently, feral dogs were proposed to be the intermediary host of SARS-CoV-2 (Xia 2020). Following the hypothesis that genomic methylation may be influenced by defenses specific to a host environment, Xia compared CpG deficiencies of many alphacoronavirus and betacoronavirus genomes from multiple host species to that of SARS-CoV-2. The closest match occurred in canine coronaviruses, which are known to infect canine digestive tracts (Licitra et al. 2014). The observation that many SARS-CoV-2 infected humans report digestive symptoms and have high viral loads in stool samples (Wölfel et al. 2020) provided tangential support to the idea that this virus may also infect digestive tissues. Based on CpG patterns, Xia (2020) suggested that a bat virus had entered the canine digestive tract where it experienced high RNA methylation activity (giving it the observed CpG signature) before transmission to humans.

Pollock et al. (2020) have now tested the hypothesis of Xia (2020) using an extended dataset, and report multiple bat and pangolin betacoronaviruses with CpG patterns similar to those found in dogs. Canine coronaviruses were not the only viruses with CpG patterns similar to those observed in SARS-CoV-2. Therefore, reduced genomic CpG content alone cannot predict the zoonotic origin of SARS-CoV-2, even though Xia (2020) (Pollock et al. 2020; Xia 2020) will also be vital to discovering intermediate hosts and tissues in which SARS-CoV-2 gained the ability to cause a pandemic.

Identifying the zoonotic source of an emerging pathogen may facilitate efficient containment in the early stages of an outbreak. Furthermore, virus sequence data from the actual source host provides an essential outgroup for accurate assessment of early spread, as using a virus too distantly related or divergent may mislead molecular epidemiological reconstructions. This effect was recently shown for SARS-CoV-2 (Mavian et al. 2020).
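The CpG comparison discussed above can be illustrated with a short sketch. It assumes the commonly used observed/expected ratio, f(CG) / (f(C) f(G)), as the measure of CpG deficiency; the exact statistic used by Xia (2020) and Pollock et al. (2020) may differ. The function names `cpg_index` and `closest_by_cpg` and the toy sequences are hypothetical.

```python
def cpg_index(seq: str) -> float:
    """Observed/expected CpG ratio: f(CG) / (f(C) * f(G)).

    Values below 1 indicate fewer CpG dinucleotides than expected
    from the C and G content alone (CpG deficiency).
    """
    s = seq.upper()
    n = len(s)
    if n < 2:
        raise ValueError("sequence must contain at least two nucleotides")
    f_c = s.count("C") / n
    f_g = s.count("G") / n
    if f_c == 0 or f_g == 0:
        return 0.0  # without C or without G, CpG cannot occur
    f_cg = s.count("CG") / (n - 1)  # "CG" cannot overlap itself
    return f_cg / (f_c * f_g)


def closest_by_cpg(query: str, candidates: dict[str, str]) -> str:
    """Return the candidate name whose CpG index is nearest the query's."""
    target = cpg_index(query)
    return min(candidates, key=lambda name: abs(cpg_index(candidates[name]) - target))


# Hypothetical toy sequences, NOT real viral genomes.
toy = {
    "host_A": "ATGCGTACGCGATCGA",
    "host_B": "ATGCATACTGCATCAA",
}
print({name: round(cpg_index(s), 2) for name, s in toy.items()})
print(closest_by_cpg("TTCATGCAAT", toy))
```

A nearest match of this kind only ranks candidates. As the extended dataset showed, viruses from several unrelated hosts can share similar CpG values, so the closest value does not by itself identify a host.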
Crucially, identification of the source host may prevent future viral introductions through higher awareness, systematic screening, and development of testing protocols. Identification of the zoonotic origin of SARS-CoV-2 may be particularly challenging, as coronaviruses frequently recombine and are found in many different host species in the wild (Graham and Baric 2010). The hunt for the source is far from over, and the origin of the pandemic will likely only be revealed through more extensive sampling and careful phylogenetic analyses.

Andersen et al. 2020. The proximal origin of SARS-CoV-2.
Corman et al. 2018. Chapter Eight - Hosts and Sources of Endemic Human Coronaviruses.
Drosten et al. 2003. Identification of a Novel Coronavirus in Patients with Severe Acute Respiratory Syndrome.
Dudas et al. 2018. MERS-CoV spillover at the camel-human interface.
Ge et al. 2013. Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor.
Graham and Baric 2010. Recombination, Reservoirs, and the Modular Spike: Mechanisms of Coronavirus Cross-Species Transmission.
Hu et al. 2017. Discovery of a rich gene pool of bat SARS-related coronaviruses provides new insights into the origin of SARS coronavirus.
Lam et al. 2020. Identifying SARS-CoV-2-related coronaviruses in Malayan pangolins.
Li et al. 2005. Bats Are Natural Reservoirs of SARS-Like Coronaviruses.
Licitra et al. 2014. Canine Enteric Coronaviruses: Emerging Viral Pathogens with Distinct Recombinant Spike Proteins.
Luan et al. 2020. SARS-CoV-2 spike protein favors ACE2 from Bovidae and Cricetidae.
Mavian et al. 2020. Sampling bias and incorrect rooting make phylogenetic network tracing of SARS-CoV-2 infections unreliable.
Pollock et al. 2020. Viral CpG deficiency provides no evidence that dogs were intermediate hosts for SARS-CoV-2.
Wölfel et al. 2020. Virological assessment of hospitalized patients with COVID-2019.
Xia 2020. Extreme Genomic CpG Deficiency in SARS-CoV-2 and Evasion of Host Antiviral Defense.
Xiao et al. 2020. Isolation of SARS-CoV-2-related coronavirus from Malayan pangolins.
Yang et al. 2013. Novel SARS-like betacoronaviruses in bats.
Zhou et al. 2020. A pneumonia outbreak associated with a new coronavirus of probable bat origin.