key: cord-333712-sdtxi8xw authors: Yu, Ping; Hu, Ben; Shi, Zheng-Li; Cui, Jie title: Geographical structure of bat SARS-related coronaviruses date: 2019-02-06 journal: Infect Genet Evol DOI: 10.1016/j.meegid.2019.02.001 sha: doc_id: 333712 cord_uid: sdtxi8xw

Bats are the natural reservoirs of severe acute respiratory syndrome coronavirus (SARS-CoV), which caused the outbreak of human SARS in 2002–2003. We introduce the genetic diversity of SARS-related coronaviruses (SARSr-CoVs) discovered in bats and provide insights into the bat origin of human SARS. We also analyze the viral geographical structure, which may improve our understanding of the evolution of bat SARSr-CoVs.

Coronaviruses (CoVs) are enveloped, positive-sense, single-stranded RNA viruses belonging to the subfamily Coronavirinae, family Coronaviridae, in the order Nidovirales, and are further divided into four genera, Alpha-, Beta-, Gamma- and Deltacoronavirus (de Groot et al., 2011; Payne, 2017). CoVs are pathogens of both birds and mammals, have a worldwide distribution, and usually cause respiratory disease when infecting humans. In 2002–2003, a novel coronavirus termed severe acute respiratory syndrome (SARS) coronavirus caused > 8000 cases of infection with a mortality of approximately 10%, drawing attention to CoVs of zoonotic origin (Ksiazek et al., 2003; Peiris et al., 2004).
Subsequently, more CoVs were identified in humans and various animals, including human coronavirus NL63 (HCoV-NL63), HCoV-HKU1, Middle East respiratory syndrome coronavirus (MERS-CoV), swine acute diarrhoea syndrome coronavirus (SADS-CoV), bat-CoV HKU4, bat-CoV HKU5, white-eye coronavirus HKU16 (WECoV HKU16), sparrow coronavirus HKU17 (SpCoV HKU17) and magpie robin coronavirus (MRCoV HKU18) (Raj et al., 2014; Su et al., 2016; Woo et al., 2012; Woo et al., 2007; Zhou et al., 2018), indicating that CoVs have greater diversity and a broader host range than previously estimated and remain a potential risk to public health. Frequent contact between humans and animals carrying coronaviruses increases the chance of cross-species viral transmission and the emergence of new viral variants.

In late 2002, SARS first emerged in Guangdong Province in southern China and rapidly spread to other provinces and countries, resulting in a global pandemic of severe respiratory disease (Zhong et al., 2003). Initial investigations indicated that masked palm civets (Paguma larvata) sold in markets were the likely animal origin of SARS coronavirus (SARS-CoV) (Kan et al., 2005; Song et al., 2005), but no SARS-CoV was detected in farmed or wild-caught civets in subsequent epidemiological studies, suggesting that civets probably served only as intermediate hosts for SARS-CoV transmission (Chan and Chan, 2013; Shi and Hu, 2008; Tu et al., 2004). In 2005, the discovery of novel CoVs related to human SARS-CoV in Chinese horseshoe bats (genus Rhinolophus), named SARS-related coronaviruses (SARSr-CoVs), provided a new clue that bats may be the natural host of SARS-CoV (Lau et al., 2005; Li et al., 2005).
Since then, genetically diverse SARSr-CoVs have been discovered in Asia, Europe and Africa, including in China, South Korea, Thailand, Bulgaria, Slovenia, Italy, Luxembourg, Nigeria and Kenya (Balboni et al., 2012b; Drexler et al., 2010; He et al., 2014; Lau et al., 2010; Lau et al., 2005; Li et al., 2005; Pauly et al., 2017; Ren et al., 2006; Rihtaric et al., 2010; Yang et al., 2013; Yuan et al., 2010). Importantly, some bat SARSr-CoVs were reported to use angiotensin converting enzyme II (ACE2) from humans, civets and Chinese horseshoe bats as a receptor for cell entry (Ge et al., 2013), further supporting the view that human SARS-CoV originated from Chinese horseshoe bats and suggesting that these SARSr-CoVs could infect humans directly, without an intermediate host. Furthermore, serological evidence (by ELISA) of bat SARSr-CoV infection in humans living close to a bat cave in Yunnan, China, where diverse SARSr-CoVs were detected in bats, suggested potential spillover of SARSr-CoVs from bats to humans.

SARS-CoV and SARSr-CoVs belong to lineage B of the genus Betacoronavirus in the family Coronaviridae and share the same genomic organization as other coronaviruses, including genes coding for 16 nonstructural proteins (nsp, encoded in ORF1ab), the structural proteins spike (S), envelope (E), membrane (M) and nucleocapsid (N), and several other genes (Perlman and Netland, 2009; Woo et al., 2009). The major differences between SARS-CoV and SARSr-CoV genomes lie in non-structural protein 3 (nsp3), ORF3, S and ORF8, of which the S gene and ORF8 are the most variable (Shi and Hu, 2008; Wu et al., 2016). The spike protein encoded by the S gene can be divided into two subunits, S1 and S2, responsible for receptor binding and cellular membrane fusion, respectively (Belouzard et al., 2009).
The S1 subunit is composed of the N-terminal domain (NTD) and the receptor-binding domain (RBD), the latter of which is critical for host-receptor binding and plays an important role in determining host range (Becker et al., 2008; de Haan et al., 2006; Li, 2013; Schickli et al., 2004; Tusell et al., 2007). Compared with human/civet SARS-CoV, most known SARSr-CoVs, such as Rp3 (DQ071615), have two deletions in the RBD, while the Bulgarian strain BM48-31 (GU190215) from Rhinolophus blasii has only one deletion in that region (Drexler et al., 2010; He et al., 2014). Several strains, such as WIV1 (KF367457), have the same sequence length as SARS-CoV in the RBD and have been shown to use human ACE2 as a cellular entry receptor (Ge et al., 2013; Hu et al., 2017; Yang et al., 2015). However, SARSr-CoVs without any deletion have so far been discovered only in Yunnan, suggesting that the origin of the S genes of the immediate ancestors of SARS-CoV was restricted to Yunnan.

ORF8 was highly variable during the course of the SARS epidemic in China (CSME, 2004). Most bat SARSr-CoVs (except strains HKU3-8 and Rs4084 and the African and European bat SARSr-CoVs) and the early human SARS-CoV contain a single ORF8 (Balboni et al., 2012a). HKU3-8 (GQ153543) has a 26-nt deletion in the ORF8 gene that subdivides its ORF8 into ORF8a, b and c. The ORF8 of Rs4084 is split into 8a and 8b by a 5-nt deletion, similar to the ORF8a/8b of the middle/late human SARS-CoVs, which carry a 29-nt deletion in ORF8. In the European strain BM48-31, ORF8 is entirely absent (Drexler et al., 2010; Hu et al., 2017; Lau et al., 2010). Moreover, unlike other bat SARSr-CoVs, some viruses such as WIV1 and WIV16 have an additional ORF (named ORFx) in their genome, which is involved in modulation of the host immune response (Hu et al., 2017; Yang et al., 2015; Zeng et al., 2016).
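The ORF8 deletions described above (26 nt, 5 nt and 29 nt) are all lengths that are not multiples of three, which is one way a deletion can shift the reading frame and split an ORF through a premature stop codon. The following Python sketch illustrates that general mechanism only. The 24-nt sequence is a made-up toy, not a viral sequence, and the helper names `deletion_shifts_frame` and `first_orf_length` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def deletion_shifts_frame(deletion_length: int) -> bool:
    """Return True if deleting this many nucleotides breaks the codon frame."""
    return deletion_length % 3 != 0


def first_orf_length(seq: str) -> int:
    """Count codons from the first ATG up to (not including) the first in-frame stop.

    If no stop codon is found, counts every complete codon to the end of the sequence.
    """
    start = seq.find("ATG")
    if start == -1:
        return 0
    codons = 0
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return codons
        codons += 1
    return codons


# Deletion sizes quoted in the text for ORF8.
for strain, size in [("HKU3-8", 26), ("Rs4084", 5), ("middle/late human SARS-CoV", 29)]:
    print(f"{strain}: {size}-nt deletion, frameshift = {deletion_shifts_frame(size)}")

# Toy illustration (hypothetical sequence): a 5-nt deletion creates an early stop.
toy_orf = "ATGGCACCTGACTTGAAGGCATAA"      # ATG GCA CCT GAC TTG AAG GCA TAA
toy_deleted = toy_orf[:3] + toy_orf[8:]   # remove 5 nt right after the start codon
print(first_orf_length(toy_orf), first_orf_length(toy_deleted))
```

In the toy case the intact sequence reads through seven codons, while the shortened one hits a stop immediately after the start codon. Real ORF annotation would also need to consider downstream start codons, which is how a split into 8a/8b or 8a/b/c can arise.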
SARSr-CoVs have been detected in bats from a wide range of provinces in China, including Guangdong, Guangxi, Guizhou, Hebei, Henan, Hong Kong, Hubei, Jilin, Shaanxi, Shanxi, Taiwan and Zhejiang (Table 1). Except for several from Hipposideridae, these viruses were mainly detected in bats of the family Rhinolophidae, indicating that these bats are likely to be natural hosts of SARSr-CoVs.

We collected the full-length RNA-dependent RNA polymerase (RdRp) sequences of previously reported SARSr-CoVs and SARS-CoVs from GenBank (Table S1). Before phylogenetic reconstruction, we used Xia's test, the Phi test/RDP and likelihood mapping analysis to check our data for substitution saturation, recombination and phylogenetic signal, respectively (Huson and Bryant, 2006; Martin et al., 2015; Strimmer and von Haeseler, 1997; Xia, 2013). We then constructed a phylogenetic tree from the nucleotide sequences of the full-length RdRp gene with the maximum likelihood (ML) method under the GTR + I + Γ model of nucleotide substitution as implemented in PhyML (version 3.1) (Guindon et al., 2010). The optimal model of nucleotide substitution was determined using the Akaike Information Criterion (AIC) in jModelTest (version 2.1.10) (Darriba et al., 2012). Three main lineages were found in the phylogenetic tree when HKU4-1 (EF065505) was set as an outgroup (Fig. 1A). Lineage 1 is composed of bat SARSr-CoVs from the southwestern provinces, including Yunnan, Guizhou and Guangxi, together with human/civet SARS-CoV. Viruses from other southern regions, including Guangdong, Hong Kong, Hubei and Zhejiang, made up the second lineage (lineage 2). The third lineage (lineage 3) consisted of strains from central and northern areas such as Hubei, Henan, Shanxi, Shaanxi, Hebei and Jilin.
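The model-selection step mentioned above (choosing a substitution model by AIC) can be sketched in a few lines. AIC is computed as 2k − 2 ln L, where k is the number of free parameters and ln L the maximized log-likelihood, and the model with the lowest AIC is preferred. The log-likelihood values below are hypothetical placeholders, not results from this study, and the parameter counts cover only the substitution model (branch lengths, which add the same constant to every model on a fixed tree, are left out as a simplifying assumption).

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_model(fits: dict) -> str:
    """Return the name of the model with the lowest AIC.

    `fits` maps a model name to a (log_likelihood, n_params) tuple.
    """
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical log-likelihoods for illustration only.
# Free substitution-model parameters: JC = 0; HKY + G = 4 + 1; GTR + I + G = 8 + 1 + 1.
candidate_fits = {
    "JC": (-10500.0, 0),
    "HKY+G": (-10120.0, 5),
    "GTR+I+G": (-10050.0, 10),
}

for name, (lnl, k) in candidate_fits.items():
    print(f"{name}: lnL = {lnl}, k = {k}, AIC = {aic(lnl, k)}")
print("Preferred model:", best_model(candidate_fits))
```

With these placeholder numbers the richest model wins because its likelihood gain outweighs the penalty for extra parameters; with real alignments the outcome depends entirely on the fitted likelihoods.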
Although SARS first emerged in Guangdong province, the lineage 1 SARSr-CoVs from southwestern China were closer to human SARS-CoV than those from other provinces in China, including Guangdong, indicating that Guangdong is unlikely to be the geographical origin of SARS-CoV and that the direct progenitor of human SARS-CoV may have originated from lineage 1 (Hu et al., 2017). Additionally, SARSr-CoVs from adjacent provinces grouped together (Fig. 1B), revealing that similar viruses have circulated in neighboring provinces. The bat hosts of SARSr-CoVs from southern China also appeared more diverse than those from other locations.

Coronaviruses are single-stranded RNA viruses that mutate readily, which increases their diversity and gives them the ability to adapt rapidly to new hosts (Longdon et al., 2014). Nevertheless, the evolution of CoVs is shaped not only by coronavirus phylogeny and biology, but also by the interaction between CoVs and their hosts (Cui et al., 2007; Graham and Baric, 2010; Longdon et al., 2014; Parrish et al., 2008). Bats are the only mammals naturally capable of true and sustained flight. Bat tagging has shown that the longest migration distance of Chinese horseshoe bats is 17 km, and other Rhinolophus species may migrate up to 30 km for hibernation (Lau et al., 2010). Such migration distances would help the transmission of SARSr-CoVs carried by bats within a certain geographical range. To identify the relationships between bat CoVs and their hosts, a tanglegram was made connecting the RdRp phylogeny of the SARSr-CoVs and the cytochrome b (CytB) phylogeny of their hosts (Fig. 2; Table S2). Different bat species in the same location, such as Yunnan, Guizhou and Zhejiang, harbor closely related SARSr-CoVs, suggesting the lack of a strict host restriction and the existence of host shifts in bat SARSr-CoVs (Cui et al., 2007).
In addition, host shifts mostly happened between different species of the same genus, Rhinolophus, indicating that genetic distance between hosts is a key factor determining both host shifts and cross-species transmission. Besides, SARSr-CoVs clustered by adjacent provinces even when sampled from the same bat species, further supporting the view that the evolution of SARSr-CoVs was restricted by geography rather than by bat species.

Recombination plays a significant role in the evolution of viruses; it may create emerging viruses and expand their host range (Graham and Baric, 2010; Vennema et al., 1998). Recombination events have been discovered in SARS-CoV and bat SARSr-CoVs (Graham and Baric, 2010; Hon et al., 2008). The two major recombination hotspots between bat SARSr-CoVs and SARS-CoV are the S gene and ORF8, which probably contributes to the variability of the two genes (Hon et al., 2008; Lau et al., 2015; Wu et al., 2016). All the genomic components of SARS-CoV, including the hypervariable S and ORF8 regions, were found in different bat SARSr-CoVs from the same cave in Yunnan, with evidence of recombination events detected between these bat SARSr-CoVs (Hu et al., 2017), suggesting that human SARS-CoV may have originated from recombination among bat SARSr-CoVs in this region. SARSr-CoVs without any deletion in the RBD have been identified only in Yunnan, so the S genes of human SARS-CoVs likely derived from recombination among these viruses in Yunnan. As recombination occurs frequently among bat SARSr-CoVs, further genomic characterization of bat SARSr-CoVs from a broader range of host species and geographical origins is needed to understand the role that recombination plays in the evolution of SARSr-CoVs.
As bats have been identified as the natural reservoirs of various emerging viruses, the concept of a zoonotic origin of important viral pathogens has become widely accepted (Parrish et al., 2008). Deciphering the evolution of a viral pathogen is vital for understanding the context of its emergence. Although SARS was controlled and vanished in 2004, the recently identified SARSr-CoVs that are able to use the human ACE2 receptor pose a potential risk of future emergence (Ge et al., 2013; Graham and Baric, 2010; Parrish et al., 2008). In particular, serological evidence of bat SARSr-CoV infection in humans was discovered in Yunnan, suggesting that these viruses may have spilled over to humans from bats directly or via other intermediate hosts in Yunnan. To date, bat SARSr-CoVs have been discovered in Asia, Europe and Africa (Balboni et al., 2012b; Drexler et al., 2010; He et al., 2014; Lau et al., 2010, 2005; Li et al., 2005; Ren et al., 2006; Rihtaric et al., 2010; Yang et al., 2013; Yuan et al., 2010). However, for most of the strains from countries other than China, only partial RdRp fragments were obtained, and full-length genome sequences have been determined for only a few of them; the available genetic information is thus insufficient to explore the evolution and spread of these SARSr-CoVs. Phylogeny using these short sequences of currently known SARSr-CoVs indicated that the bat SARSr-CoVs from China are closer to human SARS-CoV than those from other countries (Ar Gouilh et al., 2018; Drexler et al., 2010; Quan et al., 2010; Rihtaric et al., 2010), suggesting that human SARS-CoV may have originated from China.

Fig. 1. Phylogenetic analysis of SARS-CoVs and bat SARSr-CoVs. (A) The phylogenetic tree was constructed using the complete RdRp coding sequences and viewed in iTOL (http://itol.embl.de/). All strains were named using abbreviations of virus ID and sampling province. The strain HKU4-1 (NC_009019) was used as an outgroup. The taxa of lineage 1, lineage 2 and lineage 3 are highlighted in light red, light green and light blue, respectively. Lineage 1, lineage 2 and lineage 3 are marked with colored pentagrams. The taxon for the only European strain, BM48-31 (GU190215), is shown in light purple. The branch of SARS-CoV is marked in red. The strains from Zhejiang were collapsed into a triangle named ZJ-SL/Zhejiang. The viruses from Hong Kong were also collapsed into a triangle named HKU3/Hong Kong. The numbers adjacent to the nodes represent bootstrap values from 1000 replicates; only bootstrap values ≥70% are shown.

Our analysis revealed that human SARS-CoV may have originated from southwestern China, including Yunnan, Guangxi and Guizhou, and that similar viruses likely circulated in these provinces for an extended period before eventually emerging in humans. In addition, SARSr-CoVs clustered according to the geographical location of sampling, indicating that geographical range overlap between hosts is likely to play an important role in shaping the evolution of these viruses (Faria et al., 2013). Co-phylogeny analysis indicated the lack of a host restriction and the existence of frequent host shifts in bat SARSr-CoVs, mainly among horseshoe bats (genus Rhinolophus), possibly because closely related hosts offer a similar environment for the virus to adapt to (Longdon et al., 2014). However, space presents a greater barrier than host species to the diversification of bat SARSr-CoVs. Most importantly, cross-species transmission and frequent recombination of SARSr-CoVs within horseshoe bat populations in Yunnan could eventually have led to the generation of human SARS-CoV (Graham and Baric, 2010; Hon et al., 2008; Hu et al., 2017).
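The tagging distances cited in this article (17 km for Chinese horseshoe bats and up to 30 km for other Rhinolophus species; Lau et al., 2010) can be put in scale against the separation between Yunnan and Guangdong. The sketch below is only a rough illustration: it uses the provincial capitals Kunming and Guangzhou as stand-ins for the two provinces, with approximate coordinates that are assumptions of this example rather than sampling sites from the study, and a standard haversine great-circle formula on a spherical Earth.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate city coordinates (assumed proxies for the provinces).
KUNMING = (25.04, 102.71)    # Yunnan
GUANGZHOU = (23.13, 113.26)  # Guangdong
MAX_MIGRATION_KM = 30.0      # upper migration distance quoted from Lau et al., 2010

distance_km = haversine_km(*KUNMING, *GUANGZHOU)
ratio = distance_km / MAX_MIGRATION_KM
print(f"Kunming to Guangzhou: about {distance_km:.0f} km, "
      f"roughly {ratio:.0f} times the 30 km migration range")
```

Under these assumptions the two cities are on the order of a thousand kilometres apart, well over thirty times the largest migration distance quoted, which is consistent with the argument that direct bat migration is an unlikely route between the two provinces.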
Although Rhinolophus species may migrate up to 30 km (Lau et al., 2010), it is very unlikely for them to migrate over a long distance such as from Yunnan to Guangdong. There are still gaps to be filled regarding the origin of human SARS-CoV. Given that human SARS-CoV originated from bats in southwestern China, including Yunnan, Guangxi and Guizhou, how these viruses were transmitted and moved to Guangdong, where human SARS first appeared, remains unclear and needs to be clarified in the future. Although serological evidence of bat SARSr-CoV infection was found in humans living in proximity to a cave where diverse SARSr-CoVs are circulating, it cannot be determined whether the SARSr-CoVs infecting those human populations came from bats or from other animals living alongside bats. In short, it is necessary to carry out continuous surveillance of SARSr-CoVs in different geographical locations, targeting different bat species and surrounding animals.

SARS-CoV related Betacoronavirus and diverse Alphacoronavirus members found in western old-world
The SARS-like coronaviruses: the role of bats and evolutionary relationships with SARS coronavirus
A real-time PCR assay for bat SARS-like coronavirus detection and its application to Italian greater horseshoe bat faecal sample surveys
Synthetic recombinant bat SARS-like coronavirus is infectious in cultured cells and in mice
Activation of the SARS coronavirus spike protein via sequential proteolytic cleavage at two distinct sites
Tracing the SARS-coronavirus
Molecular evolution of the SARS coronavirus during the course of the SARS epidemic in China
Evolutionary relationships between bat coronaviruses and their hosts
jModelTest 2: more models, new heuristics and parallel computing
Coronaviridae. Virus Taxonomy: Ninth Report of the International Committee on Taxonomy of Viruses
Cooperative involvement of the S1 and S2 subunits of the murine coronavirus spike protein in receptor binding and extended host range
Genomic characterization of severe acute respiratory syndrome-related coronavirus in European bats and classification of coronaviruses based on partial RNA-dependent RNA polymerase gene sequences
Simultaneously reconstructing viral cross-species transmission history and identifying the underlying constraints
Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor
Recombination, reservoirs, and the modular spike: mechanisms of coronavirus cross-species transmission
Isolation and characterization of viruses related to the SARS coronavirus from animals in southern China
New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0
Identification of diverse alphacoronaviruses and genomic characterization of a novel severe acute respiratory syndrome-like coronavirus from bats in China
Evidence of the recombinant origin of a bat severe acute respiratory syndrome (SARS)-like coronavirus and its implications on the direct ancestor of SARS coronavirus
Discovery of a rich gene pool of bat SARS-related coronaviruses provides new insights into the origin of SARS coronavirus
Application of Phylogenetic Networks in Evolutionary Studies
Molecular evolution analysis and geographic investigation of severe acute respiratory syndrome coronavirus-like virus in palm civets at an animal market and on farms
A novel coronavirus associated with severe acute respiratory syndrome
Severe acute respiratory syndrome coronavirus-like virus in Chinese horseshoe bats
Ecoepidemiology and complete genome comparison of different strains of severe acute respiratory syndrome-related Rhinolophus bat coronavirus in China reveal bats as a reservoir for acute, self-limiting infection that allows recombination events
Severe Acute Respiratory Syndrome (SARS) coronavirus ORF8 protein is acquired from SARS-related coronavirus from greater horseshoe bats through recombination
Receptor recognition and cross-species infections of SARS coronavirus
Bats are natural reservoirs of SARS-like coronaviruses
The evolution and genetics of virus host shifts
RDP4: detection and analysis of recombination patterns in virus genomes
Cross-species virus transmission and the emergence of new epidemic diseases. Microbiol
Novel Alphacoronaviruses and Paramyxoviruses cocirculate with Type 1 and Severe Acute Respiratory System (SARS)-related betacoronaviruses in synanthropic bats of luxembourg
Family Coronaviridae. In: Viruses
Severe acute respiratory syndrome
Coronaviruses post-SARS: update on replication and pathogenesis
Identification of a severe acute respiratory syndrome coronavirus-like virus in a leaf-nosed bat in Nigeria
MERS: emergence of a novel human coronavirus
Full-length genome sequences of two SARS-like coronaviruses in horseshoe bats and genetic variation analysis
Identification of SARS-like coronaviruses in horseshoe bats (Rhinolophus hipposideros) in Slovenia
The N-terminal region of the murine coronavirus spike glycoprotein is associated with the extended host range of viruses from persistently infected murine cells
A review of studies on animal reservoirs of the SARS coronavirus
Cross-host evolution of severe acute respiratory syndrome coronavirus in palm civet and human
Likelihood-mapping: a simple method to visualize phylogenetic content of a sequence alignment
Epidemiology, genetic recombination, and pathogenesis of coronaviruses
Mutational analysis of aminopeptidase N, a receptor for several group 1 coronaviruses, identifies key determinants of viral host range
Feline infectious peritonitis viruses arise by mutation from endemic feline enteric coronaviruses
Serological evidence of bat SARS-related coronavirus infection in humans
Comparative analysis of twelve genomes of three novel group 2c and group 2d coronaviruses reveals unique group and subgroup features
Coronavirus diversity, phylogeny and interspecies jumping
Discovery of seven novel Mammalian and avian coronaviruses in the genus deltacoronavirus supports bat coronaviruses as the gene source of alphacoronavirus and betacoronavirus and avian coronaviruses as the gene source of gammacoronavirus and deltacoronavirus
ORF8-related genetic evidence for Chinese horseshoe bats as the source of human severe acute respiratory syndrome coronavirus
DAMBE5: a comprehensive software package for data analysis in molecular biology and evolution
Novel SARS-like betacoronaviruses in bats
Isolation and characterization of a novel bat coronavirus closely related to the direct progenitor of severe acute respiratory syndrome coronavirus
Intraspecies diversity of SARS-like coronaviruses in Rhinolophus sinicus and its implications for the origin of SARS coronaviruses in humans
Bat severe acute respiratory syndrome-like coronavirus WIV1 encodes an extra accessory protein, ORFX, involved in modulation of the host immune response
Epidemiology and cause of severe acute respiratory syndrome (SARS) in Guangdong, People's Republic of China

This work was funded by the CAS Pioneer Hundred Talents Program to JC, and the WIV "One-Three-Five" Strategic Program (WIV-135-TP1) to JC and ZLS. Supplementary data to this article can be found online at https://doi.org/10.1016/j.meegid.2019.02.001.