key: cord-0857132-a2e7oozu
authors: Wang, Wen; Lin, Xian-Dan; Zhang, Hai-Lin; Wang, Miao-Ruo; Guan, Xiao-Qing; Holmes, Edward C; Zhang, Yong-Zhen
title: Extensive Genetic Diversity And Host Range Of Rodent-Borne Coronaviruses
date: 2020-10-21
journal: Virus Evol
DOI: 10.1093/ve/veaa078
sha: 472b75ba5e032b015ee66c0b50fc554bb7ce5aeb
doc_id: 857132
cord_uid: a2e7oozu

To better understand the genetic diversity, host associations and evolution of coronaviruses (CoVs) in China we analyzed a total of 696 rodents encompassing 16 different species sampled from Zhejiang and Yunnan provinces. Based on reverse transcriptase PCR-based CoV screening of fecal samples and subsequent sequence analysis of the RdRp gene, we identified CoVs in diverse rodent species, comprising Apodemus agrarius, Apodemus chevrieri, Apodemus latronum, Bandicota indica, Eothenomys cachinus, E. miletus, Rattus andamanesis, Rattus norvegicus, and R. tanezumi. CoVs were particularly commonplace in Apodemus chevrieri, with a detection rate of 12.44% (24/193). Genetic and phylogenetic analysis revealed the presence of three groups of CoVs carried by a range of rodents that were closely related to the Lucheng Rn rat coronavirus (LRNV), China Rattus coronavirus HKU24 (ChRCoV_HKU24) and Longquan Rl rat coronavirus (LRLV) identified previously. One newly identified A. chevrieri-associated virus closely related to LRNV lacked an NS2 gene. This virus had a similar genetic organization to AcCoV-JC34, recently discovered in the same rodent species in Yunnan, suggesting that it represents a new viral subtype. Notably, additional variants of LRNV were identified that contained putative nonstructural NS2b genes located downstream of the NS2 gene that were likely derived from the host genome. Recombination events were also identified in the ORF1a gene of Lijiang-71.
In sum, these data reveal the substantial genetic diversity and genomic complexity of rodent-borne CoVs, and extend our knowledge of these major wildlife virus reservoirs.

[...] reservoirs for members of the subgenus Embecovirus (Betacoronavirus) and have likely played a key role in coronavirus evolution and emergence. Zhejiang province is located in the southern part of the Yangtze River Delta on the southeast coast of China, from where rodent CoVs have previously been reported (Wang et al. 2015; Lin et al. 2017). Yunnan province is located in southern China, bordering Myanmar, Laos, and Vietnam, and is often called "the Kingdom of Wildlife". A previous study from Yunnan identified a novel SARS-like CoV, Rs-betacoronavirus/Yunnan2013, whose ORF8 was nearly identical to ORF8 of SARS-CoVs (98% nt sequence identity) (Wu et al. 2016). Recently, two CoVs closely related to SARS-CoV-2 were identified in Rhinolophus sp. (i.e. horseshoe) bats sampled from Yunnan province: RaTG13 (Zhou et al. 2020b) and RmYN02 (Zhou et al. 2020a). However, few rodent CoVs have been documented in Yunnan to date. To explore the diversity and characteristics of CoVs in rodents, we performed a molecular evolutionary investigation of CoVs in Zhejiang and Yunnan provinces, China. Our results revealed a remarkable diversity of CoVs in rodents.

This study was reviewed and approved by the ethics committee of the National Institute for Communicable Disease Control and Prevention of the Chinese CDC. All animals were kept alive after capture and treated strictly according to the guidelines for Laboratory Animal Use and Care from the Chinese CDC and the Rules for the Implementation of Laboratory Animal Medicine (1998) from the Ministry of Health, China, under the protocols approved by the National Institute for Communicable Disease Control and Prevention. All dissection was performed under ether anesthesia, and every effort was made to minimize suffering.
All rodents were collected in 2014 and 2015 from Lijiang and Ruili cities in Yunnan province, and Longquan and Wenzhou cities in Zhejiang province, China. Rodents were trapped in cages baited with fried food, which were set in the evening and checked the following morning. Animals were initially identified by trained field biologists, and further confirmed by sequence analysis of the mitochondrial (mt)-cyt b gene (Guo et al. 2013). Lung samples were collected for the classification of rodent species, and alimentary tract samples were collected for the detection of CoVs. Total DNA was extracted from rodent lung samples using the Cell & Tissue Genomic DNA Extraction Kit (Bioteke Corporation, Beijing, China) according to the manufacturer's protocol. Total RNA was extracted from fecal samples using the fecal total RNA extraction kit (Bioteke Corporation, Beijing, China) according to the manufacturer's protocol. The RNA was eluted in 50 μL RNase-free water and used as a template for further detection. The mt-cyt b gene (1140 bp) was amplified by PCR using universal primers for rodents as described previously (Guo et al. 2013). CoV screening was performed by a pan-coronavirus nested PCR, using a previously published primer set targeting a conserved region of the RNA-dependent RNA polymerase (RdRp) gene (Wang et al. 2015). First-round reverse transcription PCR (RT-PCR) was conducted using the PrimeScript One Step RT-PCR Kit Ver.2 (TaKaRa, Dalian, China). A 10 μL reaction mixture contained 5 μL of 2× 1 Step Buffer, 0.4 μL PrimeScript 1 Step Enzyme Mix, 0.3 μL (10 μmol/L) forward primer, 0.3 μL (10 μmol/L) reverse primer, 3.5 μL RNase-free dH2O, and 0.5 μL of sample RNA.
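As a quick arithmetic check on the first-round reaction mixture described above, the following Python sketch sums the listed component volumes and derives the final primer concentration by simple dilution. The volumes and the 10 μmol/L primer stock are taken from the text; the dictionary keys and function names are illustrative only and are not part of the kit protocol.

```python
# Component volumes (microlitres) of the first-round RT-PCR mix, as listed in the text.
REACTION_UL = {
    "2x 1 Step Buffer": 5.0,
    "PrimeScript 1 Step Enzyme Mix": 0.4,
    "forward primer (10 umol/L stock)": 0.3,
    "reverse primer (10 umol/L stock)": 0.3,
    "RNase-free dH2O": 3.5,
    "sample RNA": 0.5,
}


def total_volume_ul(components):
    """Sum the component volumes, rounded to avoid floating-point noise."""
    return round(sum(components.values()), 3)


def final_concentration(stock_conc, stock_volume, total_volume):
    """Dilution rule C1 * V1 = C2 * V2, solved for C2."""
    return stock_conc * stock_volume / total_volume


total = total_volume_ul(REACTION_UL)
primer_final = final_concentration(10.0, 0.3, total)
print(f"total volume: {total} uL")
print(f"final concentration of each primer: {primer_final:.2f} umol/L")
```

The components add up to the stated 10 μL, and each primer ends up at roughly 0.3 μmol/L in the final mix.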
The PCR cycler conditions for the amplification were 50°C for 30 min (reverse transcription), then 95°C for 3 min, 35 cycles of 94°C for 45 s (denaturation), 44°C for 45 s (annealing) and 72°C for 45 s (extension), then 72°C for 10 min (final extension). The PCR product was then put through a second-round PCR, which amplified a final product of approximately 450 bp.

To recover complete viral genomes, RNA was amplified using several sets of degenerate primers designed from multiple sequence alignments of published CoV genomes. Additional primers were designed according to the results of the first and subsequent rounds of sequencing. The 5' and 3' ends of the viral genome were amplified by rapid amplification of cDNA ends (RACE) using the 5' and 3' Smarter RACE kit (TaKaRa, Dalian, China). RT-PCR products of the expected size were subjected to Sanger sequencing performed by the Sangon corporation (Beijing, China). Amplicons of more than 700 bp were sequenced in both directions. Sequences were assembled by SeqMan and manually edited to produce the final sequences of the viral genomes.

Nucleotide (nt) sequence similarities and deduced amino acid (aa) similarities to NCBI/GenBank database sequences were determined using BLASTn and BLASTp. CoV reference sequence sets representing the RdRp, S and N genes were downloaded from GenBank. Both partial RdRp gene sequences and complete amino acid sequences of the RdRp, S and N genes were used to infer phylogenetic trees. All viral sequences were aligned using the L-INS-i algorithm within the MAFFT program (Katoh and Standley 2013). After alignment, gaps and ambiguously aligned regions were removed using Gblocks (v0.91b) with a minimum block length of 10 and no gap positions (Talavera and Castresana 2007). The best-fit model of nucleotide substitution was determined using jModelTest version 0.1 (Posada 2008).
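The gap-removal step described above can be illustrated with a minimal Python sketch that drops every alignment column containing a gap, which mirrors only the "no gap positions" setting. This is not Gblocks itself: the minimum block length and the conservation criteria that Gblocks also applies are omitted, and the function name and toy sequences are hypothetical.

```python
def drop_gap_columns(alignment, gap_chars="-"):
    """Return a copy of the alignment without any column that contains a gap.

    `alignment` maps sequence names to aligned sequences of equal length.
    """
    seqs = list(alignment.values())
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    # Keep a column only if no sequence has a gap character at that position.
    keep = [i for i in range(length)
            if all(s[i] not in gap_chars for s in seqs)]
    return {name: "".join(seq[i] for i in keep)
            for name, seq in alignment.items()}


# Toy alignment (hypothetical sequences, not data from the study).
toy = {"seqA": "ATG-CGT", "seqB": "ATGACGT", "seqC": "A-GACGT"}
trimmed = drop_gap_columns(toy)
for name, seq in trimmed.items():
    print(name, seq)
```

In the toy alignment the second and fourth columns each contain a gap, so only five of the seven columns survive.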
Phylogenetic trees were generated using the maximum likelihood (ML) method implemented in PhyML v3.0 (Guindon et al. 2010). Potential recombination events in the history of LRNV, LRLV and ChRCoV_HKU24 were assessed using both the RDP4 (Martin et al. 2010) and Simplot (v.3.5.1) programs. The RDP4 analysis was based on complete genome sequences, using the RDP, GENECONV, BootScan, maximum chi square, Chimera, SISCAN and 3SEQ methods. Putative recombination events were identified with a Bonferroni-corrected P-value cut-off of 0.01. Similarity plots were inferred using Simplot to further characterize potential recombination events, including the location of possible breakpoints.

During 2014 and 2015 a total of 696 rodents from 16 different species were captured in residential areas, farmland and woodland regions from Lijiang and Ruili cities, Yunnan province, and Longquan and Wenzhou cities, Zhejiang province (Figure 1 and [...]

To better understand the relationship between viruses, their hosts and their geographic distributions, we performed a phylogenetic analysis of partial RdRp (381 bp) (Figure 2A [...] Zhejiang province, China (Wang et al., 2015). The data generated here, along with those published previously, indicate that a total of 37 different species of rodents from nine different countries are currently known to harbor CoVs (Figure 2). Notably, every virus clade contained different rodent species, sometimes even from different subfamilies, such that there was no rigid host restriction in rodent CoVs.

To better characterize the CoVs found in this study, complete or nearly complete genome [...] RtClan-CoV/GZ2015 and RtMruf-CoV-1/JL2014 and contains two putative non-structural proteins, NS2 and NS2b, located between the ORF1ab and S gene. The putative NS2 gene of the third variant is 828 nt in length, with 82.9-88.9% sequence identity.
Similarly, the putative NS2b has a gene length of 462 nt and exhibits 77.5-96.3% sequence identity among these three viruses. Strikingly, a BLASTp search reveals that NS2b encodes a putative nonstructural protein of 153 amino acid residues that has no amino acid sequence similarity to other coronaviruses; rather, this sequence exhibits ~43% amino acid identity to the C-type lectin-like protein within the genome of the rodent Microtus ochrogaster. Hence, this pattern suggests that the NS2b gene may have originally been acquired from a rodent host genome during evolutionary history. Moreover, the amino acid sequence identity between Lijiang-170, Lijiang-71, Ruian-83 and LRNV was greater than 90% in the RdRp, E, and M genes (as expected from members of the same species), but only 70%-88% in ADRP, 3CLpro, ORF1ab and S (Table 2). Further analysis of Lijiang-170, for which a complete genome sequence is available, shows that it has a transcription regulatory sequence (TRS) similar to that of AcCoV-JC34 (Table 3). Hence, Lijiang-170 and AcCoV-JC34 may represent a novel subtype of LRNV that exhibits marked differences to the prototype strain Lucheng-19. In contrast, Lijiang-53, Lijiang-41, Ruili-874 and Longquan-723 were most closely related to ChRCoV_HKU24, exhibiting 94.0%-96.1% nt sequence similarity. Strikingly, the length of nsp3 in Lijiang-41 differed from that of ChRCoV_HKU24 as a result of a 75 nt deletion. Ruili-66 was most closely related to LRLV and shared 92.7% nt sequence similarity with Longquan-370 and Longquan-189 of LRLV found in Longquan.

To better understand the evolutionary relationships among the CoVs described here and those identified previously, we estimated phylogenetic trees based on the amino acid sequences of the RdRp, S and N proteins. The analysis of all three proteins from the LRNV clade again suggests that LRNV can be divided into two phylogenetic subtypes (I and II; indicated on Figure 5).
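The NS2b gene and protein lengths reported above agree with simple codon arithmetic: a 462 nt open reading frame that includes its stop codon holds 154 codons and therefore encodes 153 residues. Likewise, a 75 nt deletion such as the one in nsp3 of Lijiang-41 removes 25 codons without shifting the reading frame. A minimal Python sketch of that check follows. The function names are illustrative, and the assumption that the reported gene length includes the stop codon is made here rather than stated in the text.

```python
def protein_length(orf_nt, includes_stop=True):
    """Number of encoded residues for an ORF of `orf_nt` nucleotides.

    Assumes (not stated in the source) that the reported gene length
    counts the terminal stop codon unless `includes_stop` is False.
    """
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


def is_in_frame(indel_nt):
    """True if an insertion/deletion of this length preserves the reading frame."""
    return indel_nt % 3 == 0


print("NS2b residues:", protein_length(462))
print("75 nt deletion in frame:", is_in_frame(75), "codons removed:", 75 // 3)
```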
There was a clear phylogenetic division between the subtype I and II LRNV sequences in the RdRp, S and N amino acid trees, and while intra-subtype (I or II) sequences shared high nucleotide sequence identities (92.4%-97.7%), inter-subtype sequence identity was only ~77.5%. Notably, this phylogenetic analysis also suggested that clade LRNV had a recombinant evolutionary history: while the LRNV clade formed a distinct lineage in the RdRp and N gene trees (although with little phylogenetic resolution in the latter), it clustered with Rhinolophus sinicus coronavirus HKU2, BtRf-AlphaCoV/YN2012 and Sm-CoV X74 in the S gene tree. In contrast, the clade 2 and clade 3 rodent CoVs were consistently closely related to LRLV and ChRCoV_HKU24 in the RdRp, S and N amino acid trees.

Multiple methods within the RDP program (Martin et al. 2010) identified statistically significant recombination events in Lijiang-71 (p < 3.05×10^-23 to p < 7.11×10^-13) (Figure 6). When Lijiang-71 was used as the query for sliding window analysis with RtClan-CoV/GZ2015 and Lucheng-19 as potential parental sequences, four recombination breakpoints at nucleotide positions 8,188, 8,636, 9,030 and 11,251 in the sequence alignment were identified. This pattern of recombination events is further supported by phylogenetic and similarity plot analyses (Figure 6). Specifically, in the major parental region (1-8,187, 8,637-9,029 and 11,252-29,349), Lijiang-71 was most closely related to RtClan-CoV/GZ2015, while in the minor parental region (8,188-8,636 and 9,030-11,251) it was more closely related to Lucheng-19.

We screened CoVs in 696 rodents from 16 different species sampled at four sites in Zhejiang and Yunnan provinces, China. Overall positivity rates were approximately 6%, although they ranged from 9.3% in Lijiang city to only 1.2% in Zhejiang province.
The latter is lower than the CoV detection rates described in a previous study undertaken in Zhejiang province despite the use of similar methodologies (Wang et al., 2015). We found that A. chevrieri had a relatively high CoV detection rate (24/193, 12.44%) in Lijiang city, Yunnan province, consistent with a previous study showing that A. chevrieri had a high detection rate of CoV (21/98, 21.4%) in Jianchuan county, also in Yunnan province (Ge et al. 2017). As A. chevrieri is a major species in Lijiang city, such a high coronavirus infection rate highlights the need for ongoing surveillance.

In conclusion, our study revealed a high diversity of CoVs circulating in rodents from Yunnan and Zhejiang provinces, China, including the discovery of a putative novel viral subtype and new rodent host species. Undoubtedly, larger-scale surveillance and analysis of CoV infections in rodents are required to better understand their genetic diversity, cellular receptors, inter-host transmission and evolutionary history.
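The detection rates cited above are simple proportions of PCR-positive animals among those tested. The Python sketch below recomputes the two A. chevrieri rates and, as an addition that was not part of the study's analysis, attaches a Wilson score 95% interval to give a rough sense of sampling uncertainty. The function names are illustrative.

```python
import math


def detection_rate(positives, tested):
    """Percentage of tested animals that were positive."""
    if tested <= 0 or not 0 <= positives <= tested:
        raise ValueError("need 0 <= positives <= tested and tested > 0")
    return 100.0 * positives / tested


def wilson_interval(positives, tested, z=1.96):
    """Wilson score interval for a binomial proportion, returned in percent."""
    if tested <= 0 or not 0 <= positives <= tested:
        raise ValueError("need 0 <= positives <= tested and tested > 0")
    p = positives / tested
    denom = 1.0 + z * z / tested
    centre = (p + z * z / (2 * tested)) / denom
    half = z * math.sqrt(p * (1 - p) / tested
                         + z * z / (4 * tested * tested)) / denom
    return 100.0 * (centre - half), 100.0 * (centre + half)


# Counts as reported in the text for A. chevrieri.
for label, pos, n in [("Lijiang (this study)", 24, 193),
                      ("Jianchuan (Ge et al. 2017)", 21, 98)]:
    low, high = wilson_interval(pos, n)
    print(f"{label}: {detection_rate(pos, n):.2f}% "
          f"(Wilson 95% interval {low:.1f}-{high:.1f}%)")
```

The interval is only a rough guide: it treats each animal as an independent sample and ignores differences in season, site and trapping effort between the two studies.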
The eight complete or near complete CoVs genome sequences, the short RdRp sequences, [...]

[Figure residue: phylogenetic tree tip labels Sc-BatCoV_512, Lijiang-211, RtRf-CoV/GX2016, Lijiang-64, Longquan-615]

Coronaviruses in bats from Mexico
A murine virus (JHM) causing disseminated encephalomyelitis with extensive destruction of myelin
Family Coronaviridae
Genomic characterization of severe acute respiratory syndrome-related coronavirus in European bats and classification of coronaviruses based on partial RNA-dependent RNA polymerase gene sequences
Insectivorous bats carry host specific astroviruses and coronaviruses across different regions in Germany
Detection of alpha- and betacoronaviruses in rodents from Yunnan, China
Viruses as vectors of horizontal transfer of genetic material in eukaryotes
New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0
Phylogeny and origins of hantaviruses harbored by bats, insectivores, and rodents
Infection of the Cloaca with the Virus of Infectious Bronchitis
MAFFT multiple sequence alignment software version 7: improvements in performance and usability
Discovery of a novel coronavirus, China Rattus coronavirus HKU24, from Norway rats supports the murine origin of Betacoronavirus

[...] Apodemus latronum Al MW011460/MW023788 China
ChRCoV_HKU24 Lijiang-41 Apodemus latronum Al MT820628/MW023787 China
ChRCoV_HKU24 RtAp-CoV/Tibet2014 Apodemus peninsulae Ap KY370047 China
ChRCoV_HKU24 RtAp-CoV/SAX2015 Apodemus peninsulae Ap KY370064 China
ChRCoV_HKU24 Ruili-888 Eothenomys cachinus Ec MW011470/MW023791 China
ChRCoV_HKU24 Lijiang-14 Eothenomys miletus Emi MW011456/MW023793 China
ChRCoV_HKU24 RtNe-CoV/Tibet2014 Niviventer eha Ne KY370071 China
ChRCoV_HKU24 RtNn-CoV/Tibet2014 [...]
[...] Rhabdomys pumilio Rp KM888151 South Africa
LRLV Ruili-66 Bandicota indica Bi MT820632/MW023789 [...]
[...] 67 Bandicota indica Bi MW011472/MW023790 China