key: cord-0689482-z2kgcjtp
authors: Lau, Susanna K.P.; Woo, Patrick C.Y.; Li, Kenneth S.M.; Huang, Yi; Wang, Ming; Lam, Carol S.F.; Xu, Huifang; Guo, Rongtong; Chan, Kwok-hung; Zheng, Bo-jian; Yuen, Kwok-yung
title: Complete genome sequence of bat coronavirus HKU2 from Chinese horseshoe bats revealed a much smaller spike gene with a different evolutionary lineage from the rest of the genome
date: 2007-10-25
journal: Virology
DOI: 10.1016/j.virol.2007.06.009
sha: f7d3a0a77b4cc62249d53edf56e4b5a43a0b7699
doc_id: 689482
cord_uid: z2kgcjtp

Apart from bat-SARS-CoV, we have identified a novel group 1 coronavirus, bat-CoV HKU2, in Rhinolophus sinicus (Chinese horseshoe bats). Since it has been suggested that the receptor-binding motif (RBM) of SARS-CoV may have been acquired from a group 1 coronavirus, we conducted a surveillance study and identified bat-CoV HKU2 and bat-SARS-CoV in 8.7% and 7.5%, respectively, of R. sinicus in Hong Kong and Guangdong. Complete genome sequencing of four strains of bat-CoV HKU2 revealed the smallest coronavirus genome (27164 nucleotides) and a unique spike protein evolutionarily distinct from the rest of the genome. This spike protein, sharing similar deletions with group 2 coronaviruses in its C-terminus, also contained a 15-amino acid peptide homologous to a corresponding peptide within the RBM of the spike protein of SARS-CoV, which was absent in other coronaviruses except bat-SARS-CoV. These findings suggest a common evolutionary origin in the spike protein of bat-CoV HKU2, bat-SARS-CoV, and SARS-CoV.

Coronaviruses can infect a wide variety of animals in which they can cause respiratory, enteric, hepatic and neurological diseases of varying severity. Based on genotypic and serological characteristics, coronaviruses were classified into three distinct groups (Brian and Baric, 2005; Lai and Cavanagh, 1997; Ziebuhr, 2004).
As a result of the unique mechanism of viral replication, coronaviruses have a high frequency of recombination (Lai and Cavanagh, 1997). Such a high recombination rate, coupled with the infidelity of the polymerases of RNA viruses, may allow them to adapt to new hosts and ecological niches (Herrewegh et al., 1998; Woo et al., 2006c). The severe acute respiratory syndrome (SARS) epidemic in 2003, caused by a novel coronavirus, SARS coronavirus (SARS-CoV), has aroused interest in the discovery of novel coronaviruses in both humans and animals (Marra et al., 2003; Peiris et al., 2003; Rota et al., 2003; Woo et al., 2004). Before that, only 19 (two human, 13 mammalian and four avian) coronaviruses were known. After the epidemic, two novel human coronaviruses, human coronavirus NL63 (HCoV-NL63), a group 1 coronavirus, and coronavirus HKU1 (CoV-HKU1), a group 2 coronavirus, were discovered (Fouchier et al., 2004; Lau et al., 2006; van der Hoek et al., 2004; Woo et al., 2005a, 2005b). In the past two years, at least 10 previously unrecognized coronaviruses from bats were also described in Hong Kong and mainland China (Li et al., 2005b; Poon et al., 2005; Tang et al., 2006; Woo et al., 2006a, 2006d), suggesting that bats play an important role in the ecology and evolution of coronaviruses. Although the identification of SARS-CoV-like viruses in Himalayan palm civets and raccoon dogs in live-animal markets in southern China suggested that wild animals could be the origin of SARS, the absence of related viruses in wild civets in extensive surveillance studies and the rapid evolution of SARS-CoV genomes in market civets suggested that these caged animals were likely only intermediate hosts and that there is an as yet unidentified natural reservoir for SARS-CoV (Li et al., 2005a; Song et al., 2005; Tu et al., 2004; Yang et al., 2005).
Recently, we described the discovery of a SARS-CoV-like virus, bat SARS coronavirus (bat-SARS-CoV), in Chinese horseshoe bats in Hong Kong. Similar viruses have also been found in other species of horseshoe bats in mainland China (Li et al., 2005b), suggesting that horseshoe bats are a reservoir of SARS-CoV-like viruses. However, genome sequence comparison of SARS-CoV-like coronaviruses from horseshoe bats and human/civet SARS-CoV showed that they shared only 88-92% nucleotide identities. More importantly, the amino acid sequence identities between the spike (S) proteins of bat and human/civet viruses were only 78-80% (Li et al., 2005b; Ren et al., 2006). Therefore, events such as mutation and/or recombination would have occurred during the evolution of these SARS-CoV-like viruses before the possible emergence of direct progenitors of SARS-CoV capable of infecting palm civets and subsequently humans. In a recent report on angiotensin-converting enzyme 2 (ACE2)-S protein interactions of SARS-CoV, it was suggested that the receptor-binding motif (RBM) of SARS-CoV may have been acquired from a group 1 virus related to HCoV-NL63. Interestingly, a novel group 1 coronavirus, bat coronavirus HKU2 (bat-CoV HKU2), was identified in Chinese horseshoe bats in addition to bat-SARS-CoV in our previous surveillance studies (Woo et al., 2006d). To better understand the epidemiology and evolution of bat-CoV HKU2 and explore possible recombination events between this group 1 coronavirus and bat-SARS-CoV that could have led to the emergence of SARS-CoV, we conducted an extensive surveillance for coronaviruses in Chinese horseshoe bats in Hong Kong and Guangdong, the province in southern China where the SARS epidemic originated, over a 2-year period. Four complete genomes of bat-CoV HKU2, three from Hong Kong and one from Guangdong, were also sequenced and analyzed.
Comparison of bat-CoV HKU2 genomes with other coronavirus genomes revealed a spike protein distinct from the spike proteins of other group 1 coronaviruses, with a peptide homologous to a segment of the RBM of the S protein of SARS-CoV.

A total of 770 respiratory and alimentary specimens from 348 and 64 Chinese horseshoe bats were obtained in Hong Kong and in the Guangdong province in Southern China, respectively. RT-PCR for a 440-bp fragment in the RdRp genes of coronaviruses was positive in alimentary specimens from 58 (16.7%) of the 348 bats from Hong Kong, and from 8 (12.5%) of the 64 bats from Guangdong. None of the respiratory specimens was positive. Sequencing results suggested the presence of two different coronaviruses among the 66 positive bats. Of the 58 positive bats from Hong Kong, the sequences of 29 samples possessed ≥ 99% nucleotide identities to bat-CoV HKU2 (GenBank accession no. DQ249235), while those of the other 29 samples possessed ≥ 99% nucleotide identities to bat-SARS-CoV (GenBank accession no. DQ022305) (Woo et al., 2006d). The bats positive for bat-CoV HKU2 and bat-SARS-CoV were from nine of the 18 sampling locations in Hong Kong, with bats from three locations harboring both viruses (Fig. 1). Of the eight positive bats from Guangdong, the sequences of six alimentary samples possessed 97-98% nucleotide identities to bat-CoV HKU2, while that of one possessed 98% nucleotide identities to bat-SARS-CoV. The remaining positive sample contained both bat-CoV HKU2 and bat-SARS-CoV with 98% nucleotide identities. Attempts to stably passage bat-CoV HKU2 in cell lines were unsuccessful. Complete genome sequence data of four strains of bat-CoV HKU2 were obtained by assembly of the sequences of the RT-PCR products obtained directly from four individual specimens collected at different times and places. Three strains were obtained from Hong Kong (bat-CoV HKU2/HK/33/2004, bat-CoV HKU2/HK/298/2004 and bat-CoV HKU2/HK/46/2006) (Fig.
1), while one was obtained from Guangdong (bat-CoV HKU2/GD/430/2006). Their genomes were 27,164-nucleotide, polyadenylated RNA, the smallest genome size among all coronaviruses with genome sequences available (Table 1 and Fig. 2). The G + C content was 39% (Table 1). The four strains shared the same genome structure and were highly similar in nucleotide sequence. The three Hong Kong strains were more closely related to each other, with 99.9% overall nucleotide identities, while that from Guangdong had 98.5% nucleotide identities with the three Hong Kong strains. Their genome organization was similar to that of other coronaviruses (Table 2 and Fig. 2). Bat-CoV HKU2 possessed the putative transcription regulatory sequence (TRS) motif, 5′-AACUAAA-3′, at the 3′ end of the leader sequence and preceding each ORF (Table 2). This TRS has also been shown to be the TRS for HCoV-NL63, whereas a shorter sequence, 5′-CUAAAC-3′, was found to be the TRS for other group 1 coronaviruses such as TGEV and FIPV (Dye and Siddell, 2005; Hiscox et al., 1995). Similar to other coronaviruses, the replicase ORF1ab encodes a number of putative proteins, including nsp3 [which contains the putative papain-like protease (PLpro)], nsp5 [putative chymotrypsin-like protease (3CLpro)], nsp12 (putative RdRp), and nsp13 [putative helicase (Hel)], which are produced by proteolytic cleavage by PLpro and 3CLpro at specific sites (Woo et al., 2005c). Similar to other group 1 coronaviruses, the genome of bat-CoV HKU2 has two putative PLpro domains, which are homologous to PL1pro and PL2pro of other group 1 coronaviruses (Fig. 2). One ORF, which encodes a putative 229-amino acid nonstructural protein, NS3, was observed between the S and E genes. This NS3 possessed 42% amino acid identities to the NS3 of HCoV-NL63, 37% identities to that of BtCoV/512/05, 36% identities to that of PEDV, and 29% identities to the NS3b of TGEV. No functional domains were identified by PFAM and InterProScan.
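As an illustration of two sequence-level features reported above (the G + C content and the putative TRS motif 5′-AACUAAA-3′), the following Python sketch computes a G + C fraction and lists motif positions. The function names and the toy RNA string are hypothetical and are not taken from the bat-CoV HKU2 genome; this is a sketch of the idea, not the authors' analysis pipeline.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def find_motif(seq: str, motif: str) -> list:
    """Return 0-based start positions of every (possibly overlapping) motif hit.

    T and U are treated as equivalent so DNA or RNA input both work.
    """
    seq = seq.upper().replace("T", "U")
    motif = motif.upper().replace("T", "U")
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


TRS = "AACUAAA"  # putative TRS motif reported for bat-CoV HKU2

# Hypothetical toy RNA for demonstration only, NOT a real viral sequence.
toy_rna = "GGAUCAACUAAACGAUGGCUUCAACUAAAUGCC"

print("TRS hits at:", find_motif(toy_rna, TRS))
print("G + C fraction:", round(gc_content(toy_rna), 2))
```

Applied to a full genome sequence, the same two functions would give the overall G + C fraction and candidate TRS positions to compare against annotated ORF starts.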
TMHMM analysis showed three putative transmembrane domains in NS3 of bat-CoV HKU2 (residues 38-60, 81-103, and 118-140). The most striking feature of the bat-CoV HKU2 genome was observed in its S protein, which possessed the shortest amino acid sequence (1128 amino acid residues) among the S proteins of all coronaviruses, as a result of deletions in the N-terminal region (Supplementary Fig. 1). It had ≤ 27% amino acid identities to the S proteins of all known coronaviruses, as opposed to other genes, which showed higher amino acid identities to the corresponding genes in other group 1 coronaviruses (especially group 1b) than to group 2 and group 3 coronaviruses (Table 1). When the S protein of bat-CoV HKU2 was aligned with the S proteins of other group 1 coronaviruses, many of the amino acid residues conserved among and specific to group 1b coronaviruses were not found, whereas residues conserved among all coronaviruses, especially those in the C-terminal region, were identified (Supplementary Fig. 1). In fact, the N-terminal region of the S protein of bat-CoV HKU2 possessed very low amino acid identities to the corresponding regions in any group of coronaviruses, which was due to both amino acid substitutions and deletions. Despite this, a short peptide consisting of 15 amino acids (residues 314 to 328) was found to be homologous to a corresponding peptide within the RBM in the S1 domain of SARS-CoV (residues 437 to 451) (Fig. 3). A similar peptide was also observed in bat-SARS-CoV, but not in any other known coronaviruses, suggesting that it is specific to SARS-CoV, bat-SARS-CoV and bat-CoV HKU2, with a common origin. Of the 15 amino acids within this homologous peptide, six (tyrosine 438, leucine 442, glycine 445, lysine 446, proline 449, and phenylalanine 450) were conserved between SARS-CoV and bat-CoV HKU2, with four using identical codons.
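The residue and codon comparison described above can be illustrated with a minimal Python sketch that, for two gap-free, in-frame coding sequences of equal length, counts the codon positions encoding the same amino acid and how many of those use the identical codon. The two short sequences are hypothetical toy data, not the SARS-CoV or bat-CoV HKU2 peptide-coding sequences, and the function name is an assumption made for illustration.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def compare_coding(seq_a: str, seq_b: str) -> tuple:
    """Return (conserved_residues, identical_codons) for two aligned CDS.

    Both inputs must be in frame, gap-free and of equal length.
    """
    seq_a = seq_a.upper().replace("U", "T")
    seq_b = seq_b.upper().replace("U", "T")
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")
    same_residue = same_codon = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            same_residue += 1
            if codon_a == codon_b:
                same_codon += 1
    return same_residue, same_codon


# Hypothetical five-codon toy sequences (Y L G K P versus Y L A K P).
SEQ_A = "TATCTTGGAAAACCT"
SEQ_B = "TACCTTGCAAAACCA"

print(compare_coding(SEQ_A, SEQ_B))  # (4, 2)
```

In the toy pair, four of five residues are conserved and two of those share the same codon, which mirrors the kind of statement made in the text ("six conserved, with four using identical codons") without reproducing the actual data.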
Of these six amino acid residues, only four (tyrosine 438, lysine 446, proline 449, and phenylalanine 450) were found in bat-SARS-CoV, with two using identical codons. On the other hand, four additional amino acid residues (tyrosine 439, arginine 440, arginine 443, and leucine 447), not found in bat-CoV HKU2, were conserved between SARS-CoV and bat-SARS-CoV, though with different codon usage. In contrast to a previous study which suggested that the extended receptor-binding domain of HCoV-NL63 includes a stretch of residues with weak homology to the RBM of SARS-CoV (unpublished observations, Li et al., 2006), we and another group of researchers did not identify any significant homology between the spike proteins of the two coronaviruses (Hofmann et al., 2006). When compared to the S proteins of other group 1 coronaviruses and SARS-CoV, large deletions were observed in the S protein of bat-CoV HKU2 in the region corresponding to the RBM of SARS-CoV. Since the amino acid sequence of the S protein of bat-SARS-CoV also differed significantly from that of SARS-CoV in this region, it is likely that this is a site of frequent mutation and/or recombination among coronaviruses in Chinese horseshoe bats. This highly variable region within the S protein of bat-CoV HKU2 and bat-SARS-CoV may have been important for host receptor adaptation. Although the overall amino acid identities of the S protein of bat-CoV HKU2 were equally low when compared to the S proteins of all three groups of coronaviruses, the S protein of bat-CoV HKU2 shares, in its C-terminus, the two regions of deletion (each of 14 amino acids) that are conserved among group 2 coronaviruses (Supplementary Fig. 1). This suggests that this segment of the S protein of bat-CoV HKU2 may have co-evolved with the corresponding regions in group 2 coronaviruses. Nevertheless, the receptor for bat-CoV HKU2 remains to be determined.
Aminopeptidase N (CD13) has been shown to be the receptor for many group 1 coronaviruses including HCoV-229E, canine coronavirus, FIPV, PEDV, and TGEV (Delmas et al., 1992; Yeager et al., 1992). As for group 2 coronaviruses, carcinoembryonic antigen-cell adhesion molecule 1 (CEACAM1) was identified as the receptor for murine hepatitis virus (MHV), while sialic acids were found to be the receptor for bovine coronavirus (BCoV) and human coronavirus OC43 (HCoV-OC43) (Krempl et al., 1995; Williams et al., 1991). However, human ACE2 (hACE2) has been shown to be the receptor for both SARS-CoV, a group 2 coronavirus, and HCoV-NL63, a group 1 coronavirus, although the two viruses utilize different binding sites for receptor recognition (Hofmann et al., 2005; Li et al., 2003). The S protein of bat-CoV HKU2 does not exhibit significant homology to the known receptor binding domains of HCoV-229E, HCoV-NL63, or MHV (Bonavia et al., 2003; Hofmann et al., 2006; Kubo et al., 1994). Further experiments are required to delineate the receptor for bat-CoV HKU2. At the 3′ end of the genome after the N gene, there is one ORF that encodes a 99-amino acid nonstructural protein, NS7a. BLAST search revealed no amino acid similarities between this putative nonstructural protein and other known proteins, and no functional domain was identified by PFAM and InterProScan. TMHMM analysis showed two putative transmembrane domains in NS7a (residues 4-26 and 59-81). Previously, FIPV and TGEV, both group 1a coronaviruses, were the only coronaviruses known to possess genes downstream of N (Fig. 1). It has been suggested that the two genes downstream of N in FIPV may be important for virulence (Haijema et al., 2004; Olsen, 1993). In TGEV, the gene downstream of N has been suggested to play a role in membrane association of replication complexes or assembly of the virus (Tung et al., 1992).
In our recent report on the discovery of bat coronavirus HKU9, a novel bat coronavirus belonging to group 2d coronaviruses, two ORFs downstream of N were also found (Woo et al., 2006a). In another group 1b coronavirus recently identified from bats in China, BtCoV/512/05, an ORF downstream of N was also identified. These findings suggest that ORFs downstream of N can be present in coronaviruses other than group 1a and may be more prevalent among bat coronaviruses. Further experiments will delineate the function of such ORFs in bat coronaviruses. The phylogenetic trees constructed using the amino acid sequences of the 3CLpro, RdRp, Hel, S, M, and N of bat-CoV HKU2 and other coronaviruses are shown in Fig. 4 and the corresponding pairwise amino acid identities are shown in Table 1. As shown in all six trees, the four strains of bat-CoV HKU2 were clustered together, reflecting their high sequence similarities. For all the genes except S, bat-CoV HKU2 formed a distinct branch that clustered with other group 1 coronaviruses. This is supported by the higher amino acid identities to the corresponding genes in other group 1 coronaviruses (especially group 1b) than to those of group 2 and group 3 coronaviruses (Table 1). However, for the S gene, bat-CoV HKU2 formed a branch distinct from the three groups of known coronaviruses. The same tree topology was obtained when using the maximum likelihood method and Bayesian approach (data not shown). This finding is in line with results obtained from pairwise amino acid comparisons, which showed that the S of bat-CoV HKU2 possessed equally low amino acid identities (≤ 27%) to the S of all three groups of coronaviruses (Table 1). To evaluate if segments of the SARS-CoV genome have arisen as a result of recombination between bat-SARS-CoV and bat-CoV HKU2, a sliding window analysis was conducted.
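The idea behind a sliding window scan can be sketched in Python as follows. This is a simplified per-window percent identity between a query and one reference taken from a pre-computed alignment; it is not the bootscan procedure implemented in Simplot, which evaluates phylogenetic support within each window. The default window and step follow the values stated in the methods (1000 bp and 200 bp); the function name and the toy sequences are hypothetical.

```python
def window_identity(query: str, reference: str, window: int = 1000, step: int = 200) -> list:
    """Return [(window_start, percent_identity), ...] along two aligned sequences.

    Columns where either sequence has a gap ('-') are ignored; windows made
    up entirely of gapped columns are skipped.
    """
    if len(query) != len(reference):
        raise ValueError("aligned sequences must have equal length")
    results = []
    for start in range(0, len(query) - window + 1, step):
        columns = [
            (q, r)
            for q, r in zip(query[start:start + window], reference[start:start + window])
            if q != "-" and r != "-"
        ]
        if not columns:
            continue
        matches = sum(q == r for q, r in columns)
        results.append((start, 100.0 * matches / len(columns)))
    return results


# Hypothetical 10-column toy alignment, scanned with a tiny window for display.
QUERY = "ACGTACGTAC"
REFERENCE = "ACGTTCGTAA"

for start, identity in window_identity(QUERY, REFERENCE, window=4, step=2):
    print(start, identity)
```

A recombination signal would appear as a region where the query's closest reference changes from one window to the next; with highly divergent references, as in this study, every window may score uniformly low and no such switch is detectable.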
No statistical support for recombination was obtained, which may be due to the high sequence divergence between the bat-SARS-CoV and bat-CoV HKU2 genomes. The Ka/Ks ratios for the various coding regions in bat-CoV HKU2 are shown in Table 3. Higher Ka/Ks ratios were observed within ORF1ab, especially nsp3 (which encodes the putative PLpro domains), nsp5 (which encodes the putative 3CLpro), and nsp14 (which encodes the helicase), whereas the ratios appeared to be lower among the structural genes. Notably, the Ka/Ks ratio for the S of bat-CoV HKU2 is only 0.03, suggesting that this gene is unlikely to be undergoing rapid evolution under positive selection.

In this study, bat-CoV HKU2 was found among 29 (8.3%) of 348 Chinese horseshoe bats from Hong Kong and 7 (10.9%) of 64 bats from Guangdong. All bats infected with bat-CoV HKU2 appeared healthy. The finding that bat-CoV HKU2 could only be detected in alimentary specimens suggests that it possesses enteric tropism. The genomes of the four strains of bat-CoV HKU2 that were sequenced were highly similar, with conserved nucleotide and amino acid sequences in most of their genes (Fig. 4). Traditionally, coronaviruses have been classified into groups 1, 2, and 3. Based on a comprehensive comparative analysis of the genomes of the various groups of coronaviruses, coronaviruses can be classified into group 1 (subgroups 1a and 1b), group 2 (subgroups 2a, 2b, 2c, and 2d) and group 3 (Woo et al., 2006a), with SARS-CoV being classified as a group 2b coronavirus (Eickmann et al., 2003; Snijder et al., 2003). Comparative amino acid sequence analysis showed that the predicted proteins in bat-CoV HKU2, except the S protein, were more similar to those of subgroup 1b of group 1 coronaviruses than to those of other groups of coronaviruses (Table 1). Based on phylogenetic analysis of the 3CLpro, RdRp, Hel, M, and N genes, the four strains of bat-CoV HKU2 formed a distinct branch within subgroup 1b of group 1 coronaviruses.
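The Ka/Ks comparison reported for the four strains (Table 3) rests on two simple steps that can be sketched in Python: enumerating the pairwise strain comparisons, and converting observed proportions of synonymous and non-synonymous differences into Jukes-Cantor corrected distances before taking their ratio. The sketch below assumes the proportions have already been counted (the codon-level counting of the Nei-Gojobori method is omitted), and the example proportions are hypothetical, not values from Table 3.

```python
import math
from itertools import combinations

# Strain labels as named in the text; four strains give six pairwise comparisons.
STRAINS = ["HK/33/2004", "HK/298/2004", "HK/46/2006", "GD/430/2006"]
PAIRS = list(combinations(STRAINS, 2))


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance d = -(3/4) ln(1 - 4p/3) for a proportion p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ka_ks(p_nonsyn: float, p_syn: float) -> float:
    """Ratio of corrected non-synonymous (Ka) to synonymous (Ks) distances."""
    ks = jukes_cantor(p_syn)
    if ks == 0:
        raise ValueError("Ks is zero, so the ratio is undefined")
    return jukes_cantor(p_nonsyn) / ks


print(len(PAIRS), "pairwise comparisons")
# Hypothetical proportions chosen only to illustrate a ratio well below 1.
print(round(ka_ks(p_nonsyn=0.001, p_syn=0.03), 3))
```

A ratio well below 1, such as the 0.03 reported for the S gene, is conventionally read as evidence of purifying rather than positive selection, which is the interpretation applied in the text.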
They also possessed genomic features most similar to other members within this subgroup (Fig. 2). The genomes of group 1a coronaviruses encode two to three nonstructural proteins between S and E, whereas most group 1b coronaviruses encode only one such protein, except HCoV-229E which encodes two (Thiel et al.). These results support the view that bat-CoV HKU2 represents a novel member within subgroup 1b of group 1 coronaviruses. The S protein of bat-CoV HKU2 possesses several unique features. First, it represents the shortest S protein among the S proteins of known coronaviruses, as a result of substantial deletions especially in the N-terminal region corresponding to the RBM of SARS-CoV. These deletions within the S protein were also largely responsible for the genome being the smallest observed among all coronaviruses. Second, although comparative genome analysis strongly suggests that bat-CoV HKU2 belongs to group 1b coronaviruses, its S protein is not closely related to the S proteins of any known coronaviruses. The S proteins of coronaviruses, being responsible for receptor binding and host species adaptation, are known to be one of the most variable regions within coronavirus genomes. Nevertheless, S proteins of coronaviruses within the same group or subgroup are more closely related among themselves than to members from a different group or subgroup, as shown by their clustering together upon phylogenetic analysis (Fig. 4). As demonstrated in a previous study, the within-group amino acid similarities of the S proteins of coronaviruses ranged from 59 to 91% while between-group similarities were from 22 to 36%. In particular, the within-group similarity of the S proteins of group 1 coronaviruses was found to be 59%.
In contrast, the S protein of bat-CoV HKU2 possessed ≤27% amino acid identities to the S proteins of any known coronaviruses and formed a distinct branch away from the three groups of coronaviruses on phylogenetic analysis, suggesting that this gene had a very different phylogenetic position and hence evolutionary history as compared to other regions within the genome of bat-CoV HKU2. This virus would have either acquired this unique S protein from a yet unidentified coronavirus through recombination, or undergone rapid evolution in its S protein because of strong selective pressure. Since the Ka/Ks ratio for the S gene of bat-CoV HKU2 was found to be low when using the four strains collected from different sites and dates (Table 3) , the latter hypothesis would be less supported. Moreover, further analysis revealed a unique short peptide with significant homology to a corresponding peptide within the RBM of SARS-CoV, which was not seen in any other coronaviruses except bat-SARS-CoV. The C-terminus of the S protein of bat-CoV HKU2 also contained regions of deletions conserved among group 2 coronaviruses. Therefore, the S protein of bat-CoV HKU2 is likely to share a common origin with other group 2 coronaviruses, especially group 2b coronaviruses, although bat-CoV HKU2 belongs to group 1 coronaviruses. This suggests that the S of bat-CoV HKU2 could have been acquired from a group 2 or related coronavirus by recombination. Although recombination between different groups of coronaviruses has not been reported previously, targeted recombination between MHV and it has been proposed that recombination may have occurred between influenza C virus and coronavirus (Luytjes et al., 1988) . 
Since the hemagglutinin esterase (HE), a unique protein only found in group 2 but not in group 1 or 3 coronaviruses, shared 30% amino acid homology to the hemagglutinin protein of influenza C virus, it was suggested that the HE of group 2 coronaviruses could have been acquired from influenza C virus by their ancestor through recombination. The present data suggest that the S proteins of bat-CoV HKU2, bat-SARS-CoV, and SARS-CoV could have originated from an unknown ancestral coronavirus and thereafter evolved separately, with the 15-amino acid homologous region left as a molecular signature. Further studies are required to elucidate the possible common ancestor virus and its host species. Although it remains to be determined if bats are a reservoir for the direct precursor of SARS-CoV, Chinese horseshoe bats are a potential mixing vessel for the generation of new coronavirus variants. Apart from bat-CoV HKU2, bat-SARS-CoV was also found among 29 (8.3%) Chinese horseshoe bats from Hong Kong and 2 (3.1%) bats from Guangdong in the present study. Coinfection by both bat-CoV HKU2 and bat-SARS-CoV was also found in one bat from China. In our previous study, bat-CoV HKU2 was also detected in a bat positive for antibodies against bat-SARS-CoV. Recombination, a characteristic feature of coronaviruses, has been observed both between different strains of the same coronavirus species and between different species of coronaviruses. Recombination between different strains of coronaviruses was first recognized in MHV, which has been utilized as a valuable molecular tool in the generation of mutants by targeted RNA recombination (Keck et al., 1988). A similar phenomenon was subsequently demonstrated in other coronaviruses such as infectious bronchitis virus, a group 3 coronavirus, and between MHV and BCoV, both being group 2 coronaviruses (Kottier et al., 1995; Lavi et al., 1998).
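The detection rates quoted in this study follow directly from the counts of positive bats and bats sampled. The tally below is a small Python sketch of that arithmetic using the counts stated in the text; the dictionary layout and function names are assumptions made for illustration, and the co-infected bat from Guangdong is counted under both viruses, as in the text.

```python
# Bats sampled per site and positive bats per virus, as stated in the text.
SAMPLED = {"Hong Kong": 348, "Guangdong": 64}
POSITIVE = {
    "bat-CoV HKU2": {"Hong Kong": 29, "Guangdong": 7},
    "bat-SARS-CoV": {"Hong Kong": 29, "Guangdong": 2},
}


def detection_rate(positive: int, sampled: int) -> float:
    """Return the detection rate as a percentage."""
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    return 100.0 * positive / sampled


def overall_rate(virus: str) -> float:
    """Detection rate for one virus pooled across both sampling sites."""
    return detection_rate(sum(POSITIVE[virus].values()), sum(SAMPLED.values()))


for virus, by_site in POSITIVE.items():
    for site, count in by_site.items():
        print(f"{virus}, {site}: {count}/{SAMPLED[site]} = "
              f"{detection_rate(count, SAMPLED[site]):.1f}%")
    print(f"{virus}, pooled: {overall_rate(virus):.1f}%")
```

Pooling both sites gives 36 of 412 bats for bat-CoV HKU2 and 31 of 412 for bat-SARS-CoV.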
Recently, by complete genome analysis of 22 strains of CoV-HKU1, we have also documented for the first time natural recombination events in a human coronavirus giving rise to at least three different genotypes (Woo et al., 2006c). Recombination between two different species of coronavirus, feline coronavirus type I and canine coronavirus, has also been suggested to be responsible for the generation of feline coronavirus type II (Herrewegh et al., 1998). Although the existing data did not provide enough evidence for recombination between bat-CoV HKU2 and bat-SARS-CoV in the generation of SARS-CoV, their co-infection of the same bat species would allow ample opportunities for recombination and the emergence of other SARS-CoV-like viruses capable of interspecies transmission. The role of bats in the evolution and ecology of coronaviruses is yet to be explored. The existence of coronaviruses in bats was unknown until after the SARS epidemic, when we first identified a novel group 1 coronavirus and bat-SARS-CoV from bats in Hong Kong (Poon et al., 2005). An astonishing diversity of coronaviruses was subsequently found among the bat population in Hong Kong and other parts of China (Li et al., 2005b; Tang et al., 2006; Woo et al., 2006a, 2006d). Since bats are commonly found and served in wild animal markets and restaurants in Guangdong (Woo et al., 2006b), and given their species diversity, roosting behavior, and migrating ability, these animals could well be the source for the emergence of zoonotic epidemics like SARS. In a previous study, it has been suggested that there was species-specific host restriction of coronaviruses in bats, with most coronaviruses from a single bat species clustered together. However, there is evidence that one bat species can be infected by more than one coronavirus species, and more than one bat species can be infected by the same coronavirus.
The consistent detection of bat-CoV HKU2 and bat-SARS-CoV in Chinese horseshoe bats over the 2-year study period from both Hong Kong and Guangdong suggested that this bat species is an established reservoir for both viruses, which belong to two different groups. The Chinese horseshoe bat, of the family Rhinolophidae, is a common insectivorous species found in Hong Kong and China. Apart from Rhinolophus sinicus, R. ferrumequinum, another horseshoe bat species found in China, has also been found to harbor both group 1 and group 2 coronaviruses. Therefore, it is likely that bats, especially members of Rhinolophidae, can be infected by both group 1 and group 2 coronaviruses, a situation similar to humans, who can be infected by group 1 (HCoV-229E and HCoV-NL63) and group 2 (SARS-CoV, HCoV-OC43, and CoV-HKU1) coronaviruses. As for the infection of more than one bat species by the same coronavirus, SARS-CoV-like viruses have been detected in at least three different species of Rhinolophidae in China (Li et al., 2005b). More extensive surveillance for coronaviruses in different species of horseshoe bats would shed light on the role of this bat family in the ecology and evolution of coronaviruses.

Chinese horseshoe bats (R. sinicus) were captured from various locations in Hong Kong and in the Guangdong province of Southern China over a 2-year period (April 2004 to April 2006). Their respiratory and alimentary specimens were collected using procedures described previously (Yob et al., 2001). All specimens were placed in viral transport medium before transportation to the laboratory for RNA extraction. Viral RNA was extracted from the respiratory and alimentary specimens using QIAamp Viral RNA Mini Kit (QIAgen, Hilden, Germany). The RNA was eluted in 50 μl of AVE buffer and was used as the template for RT-PCR.
Coronavirus screening was performed by amplifying a 440-bp fragment of the RNA-dependent RNA polymerase (RdRp) gene of coronaviruses using conserved primers (5′-GGTTGGGACTATCCTAAGTGTGA-3′ and 5′-CCATCATCAGATAGAATCATCATA-3′) designed by multiple alignments of the nucleotide sequences of available RdRp genes of known coronaviruses (Woo et al., 2005a). Reverse transcription was performed using the SuperScript III kit (Invitrogen, San Diego, CA, USA). The PCR mixture (25 μl) contained cDNA, PCR buffer (10 mM Tris-HCl pH 8.3, 50 mM KCl, 3 mM MgCl2, and 0.01% gelatin), 200 μM of each dNTP, and 1.0 U Taq polymerase (Applied Biosystems, Foster City, CA, USA). The mixtures were amplified in 60 cycles of 94°C for 1 min, 48°C for 1 min, and 72°C for 1 min and a final extension at 72°C for 10 min in an automated thermal cycler (Applied Biosystems, Foster City, CA, USA). Standard precautions were taken to avoid PCR contamination and no false positives were observed in negative controls. The PCR products were gel-purified using the QIAquick gel extraction kit (QIAgen, Hilden, Germany). Both strands of the PCR products were sequenced twice with an ABI Prism 3700 DNA Analyzer (Applied Biosystems, Foster City, CA, USA), using the two PCR primers. The sequences of the PCR products were compared with known sequences of the RdRp genes of coronaviruses in the GenBank database. Three of the samples positive for bat-CoV HKU2 were cultured in LLC-Mk2 (rhesus monkey kidney), MRC-5 (human lung fibroblast), FRhK-4 (rhesus monkey kidney), Huh-7.5 (human hepatoma), Vero E6 (African green monkey kidney), and HRT-18 (colorectal adenocarcinoma) cell lines, and in primary kidney epithelium and lung fibroblast cells derived from a Chinese horseshoe bat. Four complete genomes of bat-CoV HKU2 detected in the present study were amplified and sequenced using the RNA extracted from the alimentary specimens as templates. The RNA was converted to cDNA by a combined random-priming and oligo(dT) priming strategy.
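Basic properties of the two screening primers given above can be checked with a short Python sketch. The primer strings are written without the hyphens that appear inside them in some extracted copies of this text, on the assumption that those hyphens are line-break artifacts; the helper names are hypothetical.

```python
# Screening primers for the RdRp fragment, written 5' to 3'.
FORWARD = "GGTTGGGACTATCCTAAGTGTGA"
REVERSE = "CCATCATCAGATAGAATCATCATA"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_percent(seq: str) -> float:
    """Return the G + C content of a primer as a percentage."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(f"{name}: length {len(primer)} nt, G + C {gc_percent(primer):.1f}%")

# The sequence the reverse primer corresponds to on the sense strand.
print(reverse_complement(REVERSE))
```

The reverse complement of the reverse primer is the form to search for in a sense-strand genome sequence when locating the end of the expected amplicon.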
As the initial results revealed that they were group 1 coronaviruses, the cDNA was amplified by degenerate primers designed by multiple alignment of the genomes of human coronavirus 229E (HCoV-229E) (GenBank accession no. NC_002645), porcine epidemic diarrhea virus (PEDV) (GenBank accession no. NC_003436), porcine transmissible gastroenteritis virus (TGEV) (GenBank accession no. NC_002306), feline infectious peritonitis virus (FIPV) (GenBank accession no. AY994055), and HCoV-NL63 (GenBank accession no. NC_005831), and additional primers covering the original degenerate primer sites were designed from the results of the first and subsequent rounds of sequencing. These primer sequences are available on request. The 5′ ends of the viral genomes were confirmed by rapid amplification of cDNA ends using the 5′/3′ RACE kit (Roche, Germany). Sequences were assembled and manually edited to produce final sequences of the viral genomes. The nucleotide sequences of the genomes and the deduced amino acid sequences of the open reading frames (ORFs) were compared to those of other coronaviruses. Phylogenetic tree construction was performed using the neighbor-joining method with ClustalX 1.83. Protein family analysis was performed using PFAM and InterProScan (Apweiler et al., 2001; Bateman et al., 2002). Prediction of transmembrane domains was performed using TMHMM (Sonnhammer et al., 1998). The number of synonymous substitutions per synonymous site, Ks, and the number of non-synonymous substitutions per non-synonymous site, Ka, for each coding region were calculated using the Nei-Gojobori method (Jukes-Cantor) in MEGA 3.1 (Kumar et al., 2004). Six pairwise comparisons among the four strains of bat-CoV HKU2 were performed for each coding region. Sliding window analysis was used to detect possible recombination, using a nucleotide alignment of the genome sequences of the four strains of bat-CoV HKU2 and bat-SARS-CoV (GenBank accession no.
DQ022305) generated by ClustalX version 1.83 and edited manually. Bootscan analysis was performed using Simplot version 3.5.1 (Lole et al., 1999) (F84 model; window size, 1000 bp; step, 200 bp) with the genome sequence of SARS-CoV (GenBank accession no. NC_004718) as a query.
The nucleotide sequences of the four genomes of bat-CoV HKU2 have been lodged within the GenBank sequence database under accession no. EF203064 to EF203067.
The InterPro database, an integrated documentation resource for protein families, domains and functional sites
The Pfam protein families database
Identification of a receptor-binding domain of the spike glycoprotein of human coronavirus HCoV-229E
Coronavirus genome structure and replication
Aminopeptidase N is a major receptor for the enteropathogenic coronavirus TGEV
Genomic RNA sequence of Feline coronavirus strain FIPV WSU-79/1146
A previously undescribed coronavirus associated with respiratory disease in humans
Isolation and characterization of viruses related to the SARS coronavirus from animals in southern China
Live, attenuated coronavirus vaccines through the directed deletion of group-specific genes provide protection against feline infectious peritonitis
Feline coronavirus type II strains 79-1683 and 79-1146 originate from a double recombination between feline coronavirus type I and canine coronavirus
Investigation of the control of coronavirus subgenomic mRNA transcription by using T7-generated negative-sense RNA transcripts
Human coronavirus NL63 employs the severe acute respiratory syndrome coronavirus receptor for cellular entry
Highly conserved regions within the spike proteins of human coronaviruses 229E and NL63 determine recognition of their respective cellular receptors
In vivo RNA-RNA recombination of coronavirus in mouse brain
Experimental evidence of recombination in coronavirus infectious bronchitis virus
Analysis of cellular receptors for human coronavirus OC43
Localization of neutralizing epitopes and the receptor-binding site within the amino-terminal 330 amino acids of the murine coronavirus spike protein
MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment
The molecular biology of coronaviruses
Severe acute respiratory syndrome coronavirus-like virus in Chinese horseshoe bats
Coronavirus HKU1 and other coronavirus infections in Hong Kong
The pathogenesis of MHV nucleocapsid gene chimeric viruses
Angiotensin-converting enzyme 2 is a functional receptor for the SARS coronavirus
Receptor and viral determinants of SARS-coronavirus adaptation to human ACE2
Bats are natural reservoirs of SARS-like coronaviruses
Animal origins of the severe acute respiratory syndrome coronavirus: insight from ACE2-S-protein interactions
Full-length human immunodeficiency virus type 1 genomes from subtype C-infected seroconverters in India, with evidence of intersubtype recombination
Sequence of mouse hepatitis virus A59 mRNA 2: indications for RNA recombination between coronaviruses and influenza C virus
A review of feline infectious peritonitis virus: molecular biology, immunopathogenesis, clinical aspects, and vaccination
Coronavirus as a possible cause of severe acute respiratory syndrome
Identification of a novel coronavirus in bats
Genome structure and transcriptional regulation of human coronavirus NL63
Full-length genome sequences of two SARS-like coronaviruses in horseshoe bats and genetic variation analysis
Unique and conserved features of genome and proteome of SARS-coronavirus, an early split-off from the coronavirus group 2 lineage
Cross-host evolution of severe acute respiratory syndrome coronavirus in palm civet and human
A hidden Markov model for predicting transmembrane helices in protein sequences
Prevalence and genetic diversity of coronaviruses in bats from
Infectious RNA transcribed in vitro from a cDNA copy of the human coronavirus genome cloned in vaccinia virus
The 9-kDa hydrophobic protein encoded at the 3′ end of the porcine transmissible gastroenteritis coronavirus genome is membrane-associated
Identification of a new human coronavirus
Receptor for mouse hepatitis virus is a member of the carcinoembryonic antigen family of glycoproteins
Relative rates of non-pneumonic SARS coronavirus infection and SARS coronavirus pneumonia
Characterization and complete genome sequence of a novel coronavirus, coronavirus HKU1, from patients with pneumonia
Clinical and molecular epidemiological features of coronavirus HKU1-associated community-acquired pneumonia
In silico analysis of ORF1ab in coronavirus HKU1 genome reveals a unique putative cleavage site of coronavirus HKU1 3C-like protease
Comparative analysis of 12 genomes of three novel group 2c and group 2d coronaviruses reveals unique group and subgroup features
Infectious diseases emerging from Chinese wet-markets: zoonotic origins of severe respiratory viral infections
Comparative analysis of 22 coronavirus HKU1 genomes reveals a novel genotype and evidence of natural recombination in coronavirus HKU1
Molecular diversity of coronaviruses in bats
Evasion of antibody neutralization in emerging severe acute respiratory syndrome coronaviruses
Human aminopeptidase N is a receptor for human coronavirus 229E
Nipah virus infection in bats (order Chiroptera) in peninsular Malaysia
Molecular biology of severe acute respiratory syndrome coronavirus
We thank Director Stella Hung, Sin-Pang Lau
Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.virol.2007.06.009.
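
To illustrate the Ka and Ks estimation described in the methods above (Nei-Gojobori counting with the Jukes-Cantor correction, applied to all six pairs among four strains), the following is a minimal Python sketch. It is not the MEGA 3.1 implementation used in the study, and details such as the handling of gaps, stop codons, and pathways through stop codons are simplifying assumptions made here. The strain labels and toy sequences are hypothetical placeholders, not data from the paper.

```python
import math
from itertools import combinations, permutations

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codon -> amino acid ('*' marks a stop codon).
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def syn_sites(codon):
    """Synonymous sites in one codon: for each position, the fraction of the
    three possible base changes that leave the amino acid unchanged."""
    s = 0.0
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODE[mutant] == CODE[codon]:
                    s += 1.0 / 3.0
    return s


def codon_diffs(c1, c2):
    """Synonymous and nonsynonymous differences between two sense codons,
    averaged over mutational pathways that do not pass through a stop codon."""
    positions = [i for i in range(3) if c1[i] != c2[i]]
    if not positions:
        return 0.0, 0.0
    counts = []
    for order in permutations(positions):
        current, sd, nd, valid = c1, 0, 0, True
        for pos in order:
            nxt = current[:pos] + c2[pos] + current[pos + 1:]
            if CODE[nxt] == "*":
                valid = False
                break
            if CODE[nxt] == CODE[current]:
                sd += 1
            else:
                nd += 1
            current = nxt
        if valid:
            counts.append((sd, nd))
    if not counts:
        # Assumption: if every pathway crosses a stop codon, count all
        # differences as nonsynonymous.
        return 0.0, float(len(positions))
    return (sum(c[0] for c in counts) / len(counts),
            sum(c[1] for c in counts) / len(counts))


def jukes_cantor(p):
    """Jukes-Cantor distance for a proportion of differences p (p < 0.75)."""
    if p == 0:
        return 0.0
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ka_ks(seq1, seq2):
    """Return (Ka, Ks) for two aligned, in-frame coding sequences.
    Assumes at least one synonymous and one nonsynonymous site is present."""
    syn_total = sd_total = nd_total = 0.0
    n_codons = 0
    for i in range(0, min(len(seq1), len(seq2)) - 2, 3):
        c1, c2 = seq1[i:i + 3].upper(), seq2[i:i + 3].upper()
        if CODE.get(c1, "*") == "*" or CODE.get(c2, "*") == "*":
            continue  # skip gaps, ambiguous bases, and stop codons
        syn_total += (syn_sites(c1) + syn_sites(c2)) / 2.0
        sd, nd = codon_diffs(c1, c2)
        sd_total += sd
        nd_total += nd
        n_codons += 1
    nonsyn_total = 3 * n_codons - syn_total
    return (jukes_cantor(nd_total / nonsyn_total),
            jukes_cantor(sd_total / syn_total))


def pairwise_ka_ks(seqs):
    """Ka and Ks for every unordered pair of sequences in a name -> sequence dict."""
    return {(n1, n2): ka_ks(seqs[n1], seqs[n2])
            for n1, n2 in combinations(seqs, 2)}


if __name__ == "__main__":
    # Hypothetical toy alignment standing in for one coding region of four strains.
    toy = {
        "strain_1": "GCTGCTGCTGCT",
        "strain_2": "GCCGCTGCTGCT",
        "strain_3": "GCTGCCGCTGCT",
        "strain_4": "GCTGCTGATGCT",
    }
    for pair, (ka, ks) in pairwise_ka_ks(toy).items():
        print(pair, f"Ka={ka:.4f}", f"Ks={ks:.4f}")
```

With four input sequences, `pairwise_ka_ks` returns six entries, matching the six pairwise comparisons per coding region described in the methods.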