key: cord-270698-9w3ap3gz
authors: Guo, Hua; Hu, Bing-Jie; Yang, Xing-Lou; Zeng, Lei-Ping; Li, Bei; Ouyang, Song-Ying; Shi, Zheng-Li
title: Evolutionary arms race between virus and host drives genetic diversity in bat SARS related coronavirus spike genes
date: 2020-05-13
journal: bioRxiv
DOI: 10.1101/2020.05.13.093658
sha:
doc_id: 270698
cord_uid: 9w3ap3gz

The Chinese horseshoe bat (Rhinolophus sinicus), reservoir host of severe acute respiratory syndrome coronavirus (SARS-CoV), carries many bat SARS-related CoVs (SARSr-CoVs) with high genetic diversity, particularly in the spike gene. Despite these variations, some bat SARSr-CoVs can utilize the orthologs of human SARS-CoV receptor, angiotensin-converting enzyme 2 (ACE2), for entry. It is speculated that the interaction between bat ACE2 and SARSr-CoV spike proteins drives diversity. Here, we have identified a series of R. sinicus ACE2 variants with some polymorphic sites involved in the interaction with the SARS-CoV spike protein. Pseudoviruses or SARSr-CoVs carrying different spike proteins showed different infection efficiency in cells transiently expressing bat ACE2 variants. Consistent results were observed by binding affinity assays between SARS- and SARSr-CoV spike proteins and receptor molecules from bats and humans. All tested bat SARSr-CoV spike proteins had a higher binding affinity to human ACE2 than to bat ACE2, although they showed a 10-fold lower binding affinity to human ACE2 compared with their SARS-CoV counterpart. Structure modeling revealed that the difference in binding affinity between spike and ACE2 might be caused by the alteration of some key residues in the interface of these two molecules. Molecular evolution analysis indicates that these residues were under strong positive selection. These results suggest that the SARSr-CoV spike protein and R. sinicus ACE2 may have coevolved over time and experienced selection pressure from each other, triggering the evolutionary arms race dynamics.
It further proves that R. sinicus is the natural host of SARSr-CoVs.

Importance

Evolutionary arms race dynamics shape the diversity of viruses and their receptors. Identification of the key residues involved in interspecies transmission is important for predicting potential pathogen spillover from wildlife to humans. Previously, we identified genetically diverse SARSr-CoVs in Chinese horseshoe bats. Here, we show that ACE2 is highly polymorphic in Chinese horseshoe bat populations. These ACE2 variants support SARS- and SARSr-CoV infection but with different binding affinities to different spike proteins. The higher binding affinity of SARSr-CoV spike to human ACE2 than to bat ACE2 suggests that these viruses have the capacity for spillover to humans. The positive selection of residues at the interface between ACE2 and the SARSr-CoV spike protein suggests long-term and ongoing coevolutionary dynamics between them. Continued surveillance of this group of viruses in bats is necessary for the prevention of the next SARS-like disease.

The first and essential step of virus infection is cell receptor recognition. The entry of a coronavirus is mediated by specific interactions between the viral spike (S) protein and a cell surface receptor, followed by fusion between the viral and host membranes. The coronavirus S protein is functionally divided into two subunits, a cell attachment subunit (S1) and a membrane-fusion subunit (S2). The S1 region contains an N-terminal domain (NTD) and a C-terminal domain (CTD); either can serve as the receptor-binding domain (RBD) (20). For SARS-CoV, the S1-CTD serves as the RBD for binding to the cellular receptor, angiotensin-converting enzyme 2 (ACE2) (21). Biochemical and crystal structure analyses have identified a few key residues in the interface between the SARS-CoV S-RBD and human ACE2 (21-23).

[...] have a smaller S protein, due to 5, 12, or 13 amino acid deletions (17, 19).
Despite the variations in the RBD, all clade 1 strains can use ACE2 for cell entry, whereas clade 2 strains, which carry deletions, cannot (14, 16, 17). These results suggest that members of clade 1 are likely to be the direct source of SARS-CoV in terms of genome similarity and ACE2 usage.

Samples from three provinces (Hubei, Guangdong, and Yunnan) were used for ACE2 amplification, based on the prevalence of bat SARSr-CoVs and on tissue sample availability and quality. In addition to the bat ACE2 sequences previously obtained by our group (sample IDs 832, 411, and 3357, collected from Hubei, Guangxi, and Yunnan, respectively) and by others (GenBank accession no. ACT66275; sample collected from Hong Kong), we obtained ACE2 gene sequences from 21 R. sinicus individuals: five from Hubei, nine from Guangdong, and seven from Yunnan. The ACE2 sequences exhibited 98-100% amino acid (aa) identity within the species and 80-81% aa identity with human ACE2 (Table S1). Major variations were observed in the N-terminal region, including in some residues previously identified to be in contact with the SARS-CoV S-RBD (Fig. 1A and Fig. S1). Analysis of nonsynonymous SNPs identified eight polymorphic residues: 24, 27, 31, 34, 35, 38, 41, and 42. Combinations of these eight residues produced eight alleles: RIESEDYK, LIEFENYQ, RTESENYQ, RIKSEDYQ, QIKSEDYQ, RMTSEDYQ, EMKTKDHQ, and EIKTKDHQ, named alleles 1-8, respectively (Fig. 1A). In addition to the ACE2 genotypes known from previous studies (alleles 4, 7, and 8), five novel alleles were identified in the R. sinicus populations in this study. Alleles 2 and 4 were found in two and three provinces, respectively, whereas the other alleles seemed to be geographically restricted.
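The allele notation above can be read position by position. The following is a minimal illustrative sketch in Python, not part of the authors' analysis: it assumes that the letters of each allele string follow the order of the eight listed positions (24, 27, 31, 34, 35, 38, 41, and 42), and the helper names `residues` and `differing_positions` are hypothetical.

```python
# Polymorphic ACE2 positions and allele strings as listed in the text (Fig. 1A).
# Assumption: the letters in each allele string follow the order of POSITIONS.
POSITIONS = (24, 27, 31, 34, 35, 38, 41, 42)

ALLELES = {
    1: "RIESEDYK",
    2: "LIEFENYQ",
    3: "RTESENYQ",
    4: "RIKSEDYQ",
    5: "QIKSEDYQ",
    6: "RMTSEDYQ",
    7: "EMKTKDHQ",
    8: "EIKTKDHQ",
}


def residues(allele_id):
    """Map each polymorphic position to its one-letter residue (hypothetical helper)."""
    return dict(zip(POSITIONS, ALLELES[allele_id]))


def differing_positions(a, b):
    """Return the positions at which two alleles carry different residues."""
    ra, rb = residues(a), residues(b)
    return [pos for pos in POSITIONS if ra[pos] != rb[pos]]


if __name__ == "__main__":
    # Print the pairwise differences between all alleles.
    ids = sorted(ALLELES)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            print(f"allele {a} vs allele {b}: {differing_positions(a, b)}")
```

Under this reading, alleles 7 and 8 differ only at position 27, and alleles 4 and 5 differ only at position 24, so single-residue differences between otherwise identical receptors can be picked out directly from the allele strings.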
In summary, three alleles (4, 6, and 8) were found in Guangdong, four (1, 2, 4, and 7) in Yunnan, three (2, 4, and 5) in Hubei, and one each in Guangxi and Hong Kong. Four alleles coexisted in the same bat cave in Yunnan where the direct progenitor of SARS-CoV was found (Fig. 1B). Taken together, these data suggest that ACE2 variants have been circulating within the R. sinicus [...]

Similar to our previous report, all four bat SARSr-CoV strains with the same genomic background but different S proteins could use human ACE2 and replicate at similar levels (17). However, they differed in how they utilized R. sinicus ACE2s (Fig. 2 and Fig. S2). All tested viruses could efficiently use alleles 1, 2, 4, and 5 for entry. RsWIV1 and RsWIV16, which share an identical RBD, could not use allele 6 (sample ID 1434) from Guangdong. Rs4231 and RsSHC014, which share an identical RBD, could not use allele 7 (sample ID 3357) or allele 8 (sample ID 1438), from Yunnan and Guangdong, respectively. SARS-CoV-BJ01, which shares high similarity with the WIV1 and WIV16 RBD, was able to use the same bat ACE2 alleles as Rs4231 and RsSHC014 in the pseudotyped infection assay (Fig. 2 and Fig. S3). These results indicate that cell entry was affected by both the spike RBD and R. sinicus ACE2 variants.

1434ACE2 (allele 6) was found to bind RsSHC014 and BJ01 but not RsWIV1 RBD; 3357ACE2 (allele 7) was found to bind RsWIV1 but not RsSHC014 and BJ01 RBD; 5720ACE2 (allele 2) was found to bind all tested RBDs. All tested RBDs had a high binding affinity to human or bat ACE2. BJ01 RBD had a higher binding affinity for human ACE2 than did RsWIV1 and RsSHC014 RBDs (Fig.
3A, E, and I); however, it had a lower binding affinity to bat ACE2 than the two bat SARSr-CoV RBDs ([...]

The four tested spike proteins of bat SARSr-CoV are identical in size and share over 90% aa identity with SARS-CoV, which suggests that these proteins have a similar structure. In this study, we built structural complex models of bat SARSr-CoV-RsWIV1 RBD with R. sinicus ACE2 3357 (allele 7) and RsSHC014 RBD with R. sinicus ACE2 1434 (allele 6), in concordance with the results of the binding affinity assay between SARS-CoV RBD and human ACE2 (Fig. 4). Compared with the contact residues in the interface between SARS-CoV RBD and [...]

In R. sinicus ACE2-1434, we found a threonine at position 31, unlike human ACE2, which has a lysine at this position (Fig. 4).

Therefore, both RsWIV1 and RsSHC014 RBDs had a lower binding affinity to human ACE2 than did BJ01, but they both showed a higher binding affinity with [...]

[...] Table 1), 17 of those were found to be located in the RBD region, which faces its receptor ACE2, according to the crystal structure (Fig. S5). Moreover, five of those (442, 472, 479, 480, and 487), present in the SARS-CoV spike, have been previously identified to have a significant impact on binding affinity to human ACE2 (Fig. 4 [...]

[...] (24, 27, 31, 34, 35, 38, 41, and 42) correspond to the residues in human ACE2 that were previously identified to be involved in direct contact with the human SARS-CoV spike protein (Fig. 4, Fig. S5) (32). We also analyzed the ACE2 gene of Rhinolophus affinis (R. affinis), which has been reported to carry SARSr-CoV occasionally (15). Using an alignment of 23 ACE2 gene sequences from R. affinis obtained in this study, we found that R. affinis ACE2 was more conserved among individuals across the entire coding region than R. sinicus ACE2 (Fig.
S6) and no obvious positive selection sites were observed (data not shown).

Genes with important functions usually display a dN/dS ratio of less than 1 (negative selection) because most amino acid alterations in a protein are deleterious. In a host-virus arms race situation, the genes involved tend to display dN/dS ratios [...]

Codon-based analysis of molecular evolution

Bat ACE2 and SARSr-CoV spike sequences were analyzed for positive selection. In this study, bat ACE2 sequences were either amplified or downloaded from NCBI, and SARSr-CoV spike sequences were downloaded from NCBI; the database accession numbers are listed in Table S2. Sequences were aligned in Clustal X. Phylogenetic [...] Table S2.

Discovery of seven novel Mammalian and avian coronaviruses in the genus deltacoronavirus supports bat coronaviruses as the gene source of alphacoronavirus and betacoronavirus and avian coronaviruses as the gene source of gammacoronavirus and deltacoronavirus
Epidemiology and cause of severe acute respiratory syndrome (SARS) in Guangdong, People's Republic of China
Identification of a novel coronavirus in patients with severe acute respiratory syndrome
A novel coronavirus associated with severe acute respiratory syndrome
Isolation of a Novel Coronavirus from a Man with Pneumonia in Saudi Arabia
A pneumonia outbreak associated with a new coronavirus of probable bat origin
Isolation and characterization of viruses related to the SARS coronavirus from animals in southern China
Severe acute respiratory syndrome coronavirus-like virus in Chinese horseshoe bats
Bats are natural reservoirs of SARS-like coronaviruses
Intraspecies diversity of SARS-like coronaviruses in Rhinolophus sinicus and its implications for the origin of SARS coronaviruses in humans
SARS-Coronavirus ancestor's foot-prints in South-East Asian bat colonies and the refuge theory
Genomic characterization of severe acute respiratory syndrome-related coronavirus in European bats and classification of coronaviruses based on partial RNA-dependent RNA polymerase gene sequences
Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor
Identification of diverse alphacoronaviruses and genomic characterization of a novel severe acute respiratory syndrome-like coronavirus from bats in China
Isolation and Characterization of a Novel Bat Coronavirus Closely Related to the Direct Progenitor of Severe Acute Respiratory Syndrome Coronavirus
Discovery of a rich gene pool of bat SARS-related coronaviruses provides new insights into the origin of SARS coronavirus
Diversity of coronavirus in bats from Eastern Thailand
Origin and evolution of pathogenic coronaviruses
Rules of engagement: molecular insights from host-virus arms races
Cross-host evolution of severe acute respiratory syndrome coronavirus in palm civet and human
Receptor and viral determinants of SARS-coronavirus adaptation to human ACE2
Conformational states of the severe acute respiratory syndrome coronavirus spike protein ectodomain
Angiotensin-converting enzyme 2 is an essential regulator of heart function
Evidence for ACE2-utilizing coronaviruses (CoVs) related to severe acute respiratory syndrome CoV in bats
Angiotensin-converting enzyme 2 (ACE2) proteins of different bat species confer variable susceptibility to SARS-CoV entry
Identification of key amino acid residues required for horseshoe bat angiotensin-I converting enzyme 2 to function as a receptor for severe acute respiratory syndrome coronavirus
MEGA6: Molecular Evolutionary Genetics Analysis version 6.0
Receptor Adaptation by Severe Acute Respiratory Syndrome Coronavirus
Receptor recognition and cross-species infections of SARS coronavirus
PAML 4: phylogenetic analysis by maximum likelihood
Structure of SARS coronavirus spike receptor-binding domain complexed with receptor
Two-stepping through time: mammals and viruses
Molecular evolution of the SARS coronavirus during the course of the SARS epidemic in China
Mutation of a single residue renders human tetherin resistant to HIV-1 Vpu-mediated depletion
Two loci controlling genetic cellular resistance to avian leukosis-sarcoma viruses
A glycan shield on chimpanzee CD4 protects against infection by primate lentiviruses
CD4 receptor diversity in chimpanzees protects against SIV infection
Dual host-virus arms races shape an essential housekeeping protein
Parasite Transmission Modes and the Evolution of Virulence
Coevolution of Host and Pathogen
Identification of a cellular receptor for subgroup E avian leukosis virus
Human genes that limit AIDS
Persistent infection promotes cross-species transmissibility of mouse hepatitis virus
A mouse-adapted SARS-coronavirus causes disease and mortality in BALB/c mice
Amino acid substitutions in the S2 subunit of mouse hepatitis virus variant V51 encode determinants of host range expansion
Mechanisms of zoonotic severe acute respiratory syndrome coronavirus host range expansion in human airway epithelium
Mechanisms of host receptor adaptation by severe acute respiratory syndrome coronavirus
Adaptive Evolution of MERS-CoV to Species Variation in DPP4
Filovirus receptor NPC1 contributes to species-specific patterns of ebolavirus susceptibility in bats
Human Adaptation of Ebola Virus during the West African Outbreak
Ebola Virus Glycoprotein with Increased Infectivity Dominated the
Characterization of a filovirus (Mengla virus) from Rousettus bats in China
Trilogy of ACE2: A peptidase in the renin-angiotensin system, a SARS receptor
Angiotensin-converting enzyme 2 is an essential regulator of heart function
Retroviruses pseudotyped with the severe acute respiratory syndrome coronavirus spike protein efficiently infect cells expressing angiotensin-converting enzyme 2
Bat Severe Acute Respiratory Syndrome-Like Coronavirus WIV1 Encodes an Extra Accessory Protein, ORFX, Involved in Modulation of the Host Immune Response
Bat origins of MERS-CoV supported by bat coronavirus HKU4 usage of human receptor CD26
Exploring host-pathogen interactions through genome wide protein microarray analysis
Identification of Host