key: cord-0904030-6lobyyj4
authors: Anthony, Simon J.; Epstein, Jonathan H.; Murray, Kris A.; Navarrete-Macias, Isamara; Zambrana-Torrelio, Carlos M.; Solovyov, Alexander; Ojeda-Flores, Rafael; Arrigo, Nicole C.; Islam, Ariful; Ali Khan, Shahneaz; Hosseini, Parviez; Bogich, Tiffany L.; Olival, Kevin J.; Sanchez-Leon, Maria D.; Karesh, William B.; Goldstein, Tracey; Luby, Stephen P.; Morse, Stephen S.; Mazet, Jonna A. K.; Daszak, Peter; Lipkin, W. Ian
title: A Strategy To Estimate Unknown Viral Diversity in Mammals
date: 2013-09-03
journal: mBio
DOI: 10.1128/mbio.00598-13
sha: 634128ea7d7736750e1c3cd0a48bb37843d06dac
doc_id: 904030
cord_uid: 6lobyyj4

The majority of emerging zoonoses originate in wildlife, and many are caused by viruses. However, there are no rigorous estimates of total viral diversity (here termed “virodiversity”) for any wildlife species, despite the utility of this to future surveillance and control of emerging zoonoses. In this case study, we repeatedly sampled a mammalian wildlife host known to harbor emerging zoonotic pathogens (the Indian Flying Fox, Pteropus giganteus) and used PCR with degenerate viral family-level primers to discover and analyze the occurrence patterns of 55 viruses from nine viral families. We then adapted statistical techniques used to estimate biodiversity in vertebrates and plants and estimated the total viral richness of these nine families in P. giganteus to be 58 viruses. Our analyses demonstrate proof-of-concept of a strategy for estimating viral richness and provide the first statistically supported estimate of the number of undiscovered viruses in a mammalian host. We used a simple extrapolation to estimate that there are a minimum of 320,000 mammalian viruses awaiting discovery within these nine families, assuming all species harbor a similar number of viruses, with minimal turnover between host species. We estimate the cost of discovering these viruses to be ~$6.3 billion (or ~$1.4 billion for 85% of the total diversity), which if annualized over a 10-year study time frame would represent a small fraction of the cost of many pandemic zoonoses.

A total of 12,793 consensus PCR assays were performed for the detection of viruses from nine different families/genera, including coronaviruses (CoVs; n = 1,631), paramyxoviruses (PMVs; n = 1,108), hantaviruses (HTVs; n = 1,108), astroviruses (AstVs; n = 1,348), influenza A viruses (IFAVs; n = 1,108), bocaviruses (BoVs; n = 1,739), adenoviruses (AdVs; n = 1,902), herpesviruses (HVs; n = 1,741), and polyomaviruses (PyVs; n = 1,108) (Table 1). None of the samples were positive for IFAVs or HTVs, despite previous studies documenting their presence in other bat species (6-8); however, a total of 985 viral sequences representing the other seven viral families were detected in these bats. These sequences were segregated into 55 discrete viruses based on distinct monophyletic clustering (see Materials and Methods) (Table 1), and a virus was considered novel if the sequence identity to its closest relative was less than or equal to the identity between the two closest species for a given viral family (9).

Eleven PMVs were detected, including 10 novel viruses (PgPMV-1 to -10) and Nipah virus (PgPMV-11). These PMVs exhibited high sequence variation and clustered phylogenetically with either the rubulaviruses or an unassigned group related to the henipaviruses (Fig. 1). Within the AdV family, 14 viruses were discovered (PgAdV-1 to -14). Thirteen were novel mastadenoviruses, while one virus (PgAdV-2) had 98% nucleotide identity to the aviadenovirus Fowl adenovirus E (Fig. 2). Eight different AstVs were found (PgAstV-1 to -8), all of which were novel and clustered within the genus Mamastrovirus (Fig. 3). Within the CoV family, four distinct viruses were discovered. The first two were closely related betacoronaviruses (PgCoV-1 and -2). The third was also a betacoronavirus (PgCoV-3) but was more distantly related and showed 97% nucleotide identity to bovine and human coronaviruses (human strains 4408 and OC43). The fourth CoV was a gammacoronavirus (PgCoV-4) with 91% nucleotide identity (97% at the amino acid level) to the avian Infectious bronchitis virus (Fig. 4). Three novel PyVs were identified (PgPyV-1 to -3), all of which clustered with viruses in the genus Orthopolyomavirus (Fig. 5). A total of 639 HV sequences were detected, which segregated into 13 distinct clades (PgHV-1 to -13) using hierarchical clustering (see Materials and Methods). None could be reliably classified within any existing genus, and they likely represent new groups within the Betaherpesvirinae and Gammaherpesvirinae subfamilies (Fig. 6). One virus, PgHV-11, appears to be a recombinant between PgHV-10 and PgHV-13, with a breakpoint evident at approximately nucleotide 90. Upstream from this breakpoint, the sequences for PgHV-11 are related to PgHV-10, while downstream from the breakpoint, they are related to PgHV-13. Finally, two different BoVs were discovered (PgBoV-1 and -2), both of which showed >98% nucleotide identity to known human BoVs (Fig. 7).

Viral discovery curves and estimates of viral richness. Asymptotic viral richness was estimated from observed detections using three statistical models, Chao2, ICE, and Jackknife (10). To ensure internal consistency, only those samples screened for the full complement of nine viral families/genera were included (n = 1,092), which accounted for 44/55 viruses identified in this study. The relative frequency of these viruses is presented in Fig. S1 in the supplemental material. Of the 1,092 samples included, 766 were negative for all viruses. There were 595 viral detections from 326 positive samples, with 167 samples containing >1 virus. When all 44 viruses were considered, the accumulative discovery curve began to show signs of saturation (Fig. 8). The Chao2 estimator demonstrated asymptotic behavior as early as 500 samples (<50% of tested samples) and was highly stable (low variance) by 1,000 samples (Fig. 8). The total viral richness within our sample population was estimated to be 58 viruses (limited to viral families tested for), and the required sampling effort to discover all 58 was estimated to be 7,079 samples. Asymptotic estimates of viral richness were also calculated individually for PMVs, AstVs, HVs, and AdVs (see Fig. S2 in the supplemental material). For HVs, AstVs, and PMVs, estimates were stable or stabilizing and showed that most of the predicted viral diversity had been identified. For AdVs, none of the estimators stabilized, probably due to a high singleton-to-doubleton ratio (i.e., the rate of discovery of new viruses was still high). Individual assessments were not performed for CoVs, PyVs, or BoVs because a lack of single or double detections prevented meaningful estimates (10).

Coinfection. Positive PCR products were cloned and sequenced to look for the presence of cooccurring viruses. A total of 276 samples contained >1 virus, including urine (n = 56), throat swabs (n = 199), and roost urine (n = 56). Between 2 and 5 viruses were found to coexist, and both intrafamilial (n = 223/276) and interfamilial (n = 93/276) viral family cooccurrences were observed (Fig. 9). Intraspecific codetections were limited to the families Herpesviridae and Adenoviridae (Fig. 9 and see Table S1 in the supplemental material). The patterns of HV cooccurrence were significantly nonrandom (P < 0.001 with C score [11, 12]), and positive pairwise associations were observed between PgHV-13 and PgHV-10 and between PgHV-11 and PgHV-10 (P < 0.001).

A negative association was also observed between PgHV-13 and -12, where the observed frequency of cooccurrence was below what would be expected by chance, given the prevalence of both viruses (P < 0.001). The patterns of cooccurrence for different adenoviruses were less structured, and no robustly supported positive or negative associations were observed. The same is true for analyses of interfamilial codetections.

In this study, we combined virological and ecological techniques to describe the virodiversity of a known zoonotic reservoir, P. giganteus, including the first-ever estimate of viral richness and the sampling effort required to detect any proportion of it. The article also includes a phylogenetic description of 55 viruses identified in this species and an ecological description of the positive and negative occurrences between them. This unprecedented description of the potential zoonotic pool not only establishes a framework for comparison of virodiversity among host populations in different geographic regions and ecological settings but also provides important a priori knowledge should a novel pathogen emerge or have emerged. For example, sequence data from this study are already being used to develop serological assays to test people with a range of symptoms and known high risk of exposure to bats. This will allow us to identify cases of bat-to-human viral spillover and, potentially, their health consequences.

Estimate of total viral diversity (richness) in P. giganteus in Bangladesh. Previous attempts have been made to predict the total number of unknown viruses in humans by analyzing temporal trends in viral discovery (13, 14). However, these assessments principally consider the emergence of disease-causing agents rather than an ecological assessment of virodiversity. Here, we measured viral richness using nonparametric species discovery curves, which are commonly used in biodiversity studies (10, 13, 15-17) and rely on the frequency of rarely occurring species to measure the completeness of discovery (15) and make statistical estimations of the undiscovered fraction (10). Our estimate of the viral carrying capacity of P. giganteus as 58 viruses (for the nine viral families assayed) was robustly supported, with the cumulative number of new viruses slowing toward an asymptotic trajectory and with statistical estimators showing reliably asymptotic behavior. Given that our total discovery effort revealed 55 of the estimated 58 viruses, we suggest that most have now been identified and that the remaining viruses in P. giganteus in Bangladesh are extremely rare. However, we make the qualification that this estimation of richness is likely a minimum and that additional viruses will almost certainly be found through the expansion of viral family testing and the use of high-throughput sequencing. While we cannot infer a great deal about the biology or taxonomic relatedness of the undiscovered diversity, our estimate of viral richness does allow useful considerations of the efficacy of viral discovery efforts in this species. In general, surveys for undiscovered diversity present diminishing returns, since the commonly occurring species are quickly identified, while the rare species require an increasingly large sampling effort.
Here, the Chao2 estimator predicted that 85% of the total richness could be achieved if another ~500 samples were screened but that a further 5,500 would then be required to find the remaining 15%. Considering that only very rare viruses would exist within this final fraction and assuming that this rarity reduces both exposure and the chance of transmission, the public health advantage gained by knowledge of their biological properties or taxonomy may not be sufficiently high to justify the cost of their discovery. Similar estimations were made for each of the viral families individually. For example, 13 HVs were discovered in a total of 1,741 samples, with an estimated 8,503 additional samples needed to identify just 2 more predicted viruses (100% of the estimated). Equally, 11 PMVs were identified in 1,108 samples, with one additional virus projected from a further 773 samples. In both cases, an extremely limited return is expected from a costly discovery effort.

Estimates of unknown viral diversity (richness) in all mammals and the cost of discovery. Mammals are the reservoir hosts of the majority of emerging zoonoses (2, 3, 18). If we assume that all 5,486 described mammalian species (19) harbor an average of 58 viruses in the nine families of interest (as estimated here in P. giganteus) and that these viruses exhibit 100% host specificity, the total richness of mammalian viruses awaiting discovery exceeds ~320,000. We used the data on expenditures for surveillance and pathogen discovery in this study to calculate the direct cost of discovering all 58 viruses in P. giganteus (see the supplemental material for details of this cost analysis). We estimate this cost to be $1.2 million, including collection and laboratory testing of 7,079 samples. Assuming expenditures to be equal for all host species, the cost of sampling and viral discovery for all mammalian viruses would be approximately $6.3 billion. Accounting for diminishing returns means that discovering 85% of the estimated diversity would be disproportionately cheaper at approximately $1.4 billion. Our estimates of virodiversity and cost of discovery are preliminary; however, we include them to demonstrate (i) how a systematic estimation of total viral diversity could be used to inform better surveillance through strategic resource allocation and (ii) that, given our cost estimates, the discovery of the majority of potential zoonotic viruses is not an unattainable goal over the next few decades. The generation of sequence data will not, of course, in itself prevent pandemics.
However, it does provide data that refine our knowledge of the functional relationship between host and viral diversity, including traits associated with increased risk of spillover and subsequent emergence (e.g., viruses closely related to and sharing receptor binding domains with known lethal agents [20]) and, also, facilitates the development of rapid diagnostic tests for intervention and control. Several important limitations must be considered in our extrapolations, including (i) the assumption that a mean of 58 viruses per species is a reasonable estimate and that host populations are panmictic with respect to viral transmission (such that expanded geographic sampling would not influence viral detections), (ii) the assumption that viruses are not shared by more than one host species, (iii) that only those viruses within the nine families are considered in this estimation, (iv) that the results are limited by the sensitivity and specificity of our tests, and (v) that a similar mean cost of sample collection is incurred across all species. Clearly, many of these limitations and assumptions require additional exploration. For example, while including more viral families in our survey would increase the viral richness estimate, accounting for species turnover (viral sharing between species) would reduce it. Also, while the cost of sample collection in Bangladesh is relatively low because of logistical simplicities, in some regions (e.g., tropical montane forests of Africa and Southeast Asia), the cost of transportation is much higher. Better estimates of the total number of viruses in mammals (and the cost of their discovery) will be achieved iteratively as other hosts are more extensively sampled and tested, additional viral families are included, and the limits of viral detection increase.
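The extrapolation described above is simple multiplication, and a minimal sketch makes the inputs explicit. The constants are the figures quoted in the text; the function and variable names are illustrative only. Note that the rounded per-host cost of $1.2 million yields roughly $6.6 billion rather than the reported ~$6.3 billion, presumably because the supplemental cost analysis uses an unrounded figure (an assumption, not something stated in the text).

```python
# Figures quoted in the text; names are illustrative, not from the study.
MAMMAL_SPECIES = 5_486        # described mammalian species (ref. 19)
VIRUSES_PER_SPECIES = 58      # estimated richness in P. giganteus, nine families
COST_PER_SPECIES_USD = 1.2e6  # rounded cost of discovering all 58 viruses in one host


def extrapolate(species, viruses_per_species, cost_per_species):
    """Scale single-host estimates to all mammals.

    Assumes every species harbors the same number of viruses, that no virus
    is shared between host species, and that costs are equal across species.
    """
    total_viruses = species * viruses_per_species
    total_cost = species * cost_per_species
    return total_viruses, total_cost


total_viruses, total_cost = extrapolate(
    MAMMAL_SPECIES, VIRUSES_PER_SPECIES, COST_PER_SPECIES_USD
)
print(f"viruses: {total_viruses:,}")              # viruses: 318,188
print(f"cost: ${total_cost / 1e9:.2f} billion")   # cost: $6.58 billion
```

Relaxing the host-specificity assumption would lower the virus total, while adding viral families would raise it, as the limitations listed above indicate.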

Novel viruses. The current study significantly enhances our knowledge of the viruses harbored by P. giganteus, for which only two viruses had been previously described, Nipah virus and a GB virus-like flavivirus (21, 22). A total of 50/55 of the viruses discovered in this study were considered novel, while 5/55 have been reported previously (PgBoV-1 and -2, PgCoV-3 and -4, and PgPMV-11). Additional discussion of the 50 novel viruses is provided in the supplemental material. Here, we discuss a number of important limitations that must be considered in the interpretation of these results. First, the use of consensus PCR limits surveillance and discovery to viruses related to those targeted in these assays. Second, variations in virus concentration can also influence the probability of detection. Third, we evaluated the diversity of viruses in a limited set of compartments and tissues, and unbiased sequencing was not used as a secondary method of capturing diversity. The classification of viruses is also significant, as redefining the genetic limits between one virus and another would change the total number and prevalence of viruses discovered and would impact our estimations of viral richness. We have tried to address this by using monophyletic clades as a taxonomic surrogate, which obviates the variable and polythetic criteria set by the ICTV for species demarcations.

Coinfection. The identification of coexisting microbes is important to a description of virodiversity because of the positive and negative associations that can occur between them (23-28). Here, we report a large number of intra- and interfamilial cooccurrences in P. giganteus and show that as many as five different viruses can exist in a single sample. Not only does this reveal information about the carrying capacity and composition of discrete viral niches within an individual bat, it also demonstrates the number of different viruses that could potentially spill over to a new host from a single exposure event.

Figure legend (PyV phylogeny). The number of samples that tested positive for each respective virus in urine (U) and throat (T) is indicated in parentheses. *, published bat PyV sequences. Viruses detected in this study are identified with the prefix Pg (Pteropus giganteus) and were assigned accession numbers KC692400 to KC692402.

The most common intrafamilial codetections were observed within the family Herpesviridae, supporting previous studies demonstrating coinfection of HVs in bats (29). Statistically supported associations were observed between PgHV-10, -11, and -13, which phylogenetically cluster within a presumptive new genus of the betaherpesvirus subfamily. It is not known whether these detections represent coinfection of the same cell or a group of viral variants with segregated cell tropism. It is also unknown why these viruses should so readily coexist, though ecological mechanisms such as simultaneous transmission (codispersal), the availability of requisite resources, and/or shared benefits associated with host immunomodulation by one or more of these viruses may explain the observed cooccurrence. Recombination is also a possible consequence of coinfection and is a common feature in the ecology and evolution of herpesviruses (30-36).

PgHV-11 was identified as a recombinant lineage derived from the strongly associated PgHV-10 and PgHV-13, and all three viruses were detected in the same sample or compartment (throat) multiple times, suggesting that true coinfection does occur, albeit with unknown frequency. A negative association was also observed between PgHV-12 and -13, suggesting that mechanisms might also exist to reduce cooccurrence. These two viruses are very closely related, and we speculate that cooccurrence may offer little benefit to the viral population because of increased competition for resources coupled with minimal potential for fitness gains via recombination. Even though previous studies showed a lack of immune recognition in betaherpesviruses (37), we suggest this might act as an effective mechanism for reducing the coexistence of closely related viruses by preventing sequential infections. Such a mechanism would not completely preclude cooccurrence due to codispersal and would therefore serve to explain why some codetections were still observed between these two viruses.

Viral spillover. Our discovery efforts revealed five viruses that appear to represent spillover events. These included two human bocaviruses (PgBoV-1 and -2), an avian adenovirus (PgAdV-2), a human/bovine betacoronavirus (PgCoV-3), and an avian gammacoronavirus (PgCoV-4). In each case, these viruses were only observed once and showed strong phylogenetic association to viruses found in humans, birds, or ruminants. The interface by which these viruses were able to move from these disparate hosts into bats is unclear. However, on several occasions, we have observed P. giganteus in Bangladesh drinking from bodies of water (rivers and ponds) that are used by people, livestock, domestic animals, and wildlife for drinking, bathing, and in some cases, sewage, and we hypothesize that shared water sources may be a source of exposure.
Viral spillover (and/or host switching) is an example for which the concept of virodiversity in defined animal host populations might be particularly important. Such processes precede many emergence events (3, 38); however, there is almost certainly additional asymptomatic movement of viruses between hosts, the frequency and impact of which remain poorly understood.

An additional consideration is that any of the 55 viruses found in P. giganteus may have already spilled over into the human population. Annual outbreaks of Nipah virus in Bangladesh demonstrate that human exposure to viruses from these bats persists (39-44), and there are a significant number of undiagnosed morbidities and mortalities in this region that may well have resulted from the spillover of one of these other viruses. Subclinical movement is equally possible, as demonstrated with Tioman virus in Malaysia (45, 46), and investigating these spillover events may help to refine our understanding of disease emergence in novel hosts.

Conclusions. Our work illustrates the power of using ecological approaches to characterize virodiversity and estimate viral richness and can be considered part of a strategy to better target surveillance to identify agents that pose zoonotic risks before they emerge in people (3). The projected $1.4 billion cost of discovering 85% of the estimated diversity is far less than the economic impact of even a single pandemic like SARS, which has been estimated at $16 billion (47). If annualized over a 10-year period, the discovery of 85% of mammalian viral diversity would be just $140 million/year, which is both a one-off cost and a fraction of the cost of globally coordinated pandemic control programs such as the "One World, One Health" program, estimated at $1.9 to $3.4 billion per year, recurring (64). While these programs will not themselves prevent the emergence of new zoonotic viruses, they will further contribute to pandemic preparedness by enhancing our understanding of viral ecology and the mechanisms of disease emergence and by providing sequences and other insights that reduce the morbidity, mortality, and economic impact of emerging infectious diseases by expediting recognition and intervention.

Samples and PCR screening. Samples (n = 1,897) were collected from apparently healthy P. giganteus bats throughout Bangladesh between 2006 and 2010, as described previously (48). This included urine (n = 926), throat swabs (n = 806), feces (n = 78), and roost urine (n = 97). All samples were collected by trained veterinarians, and all animals were released unharmed. Samples were collected directly into lysis buffer (bioMérieux, Inc.) and stored at −80°C until transfer to the Center for Infection and Immunity at Columbia University. Roost urine samples were also obtained by suspending 3- by 2-m polyethylene sheets underneath roosting colonies, which collected urine (with possible fecal contamination) from the bats roosting above. Total nucleic acid was extracted from all samples using the EasyMag (bioMérieux, Inc.) platform, and cDNA synthesis was performed using SuperScript III first-strand synthesis supermix (Invitrogen), all according to the manufacturer's instructions. Viral discovery was performed using broadly reactive consensus PCR assays targeting coronaviruses (49), paramyxoviruses (50), astroviruses (51), influenza A viruses (38), adenoviruses (52), polyomaviruses (53), bocaviruses (54), and herpesviruses (55). Consensus primers for hantaviruses were modified from an existing protocol (56) in order to increase the degeneracy of the assay, and the assay was validated for its ability to detect diverse hantaviruses, including Andes, Puumala, Sin Nombre, Prospect Hill, Seoul, and Thottapalayam hantaviruses. The modified primer sequences were UHantaF1 (GWGGVCARACWGCHGAYT) and UHantaR1 (CCWGGTGTDADYTCHTCWGC) (expected amplicon, 250 bp), and the annealing temperature was 52°C. All PCR products of the expected size were cloned into Strataclone PCR cloning vector, and 12 white colonies were sequenced using standard M13R primers.
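To illustrate what "degeneracy" means for the modified hantavirus primers, the following sketch expands the standard IUPAC ambiguity codes to count how many distinct oligonucleotides each primer represents and to build a matching pattern. The helper names are hypothetical, and the reverse primer is used with the space in the printed sequence removed, on the assumption that it is a typesetting artifact.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct oligonucleotides encoded by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


def to_regex(primer):
    """Compile a regular expression matching every sequence the primer encodes."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else f"[{options}]")
    return re.compile("".join(parts))


UHANTA_F1 = "GWGGVCARACWGCHGAYT"
UHANTA_R1 = "CCWGGTGTDADYTCHTCWGC"  # assumption: printed space removed

print(len(UHANTA_F1), degeneracy(UHANTA_F1))  # 18 144
print(len(UHANTA_R1), degeneracy(UHANTA_R1))  # 20 216
print(bool(to_regex(UHANTA_F1).fullmatch("GAGGACAAACAGCAGACT")))  # True
```

The pattern only checks exact matches against the encoded sequences; it says nothing about mismatch tolerance during annealing, which the PCR conditions determine.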

Trace sequences were analyzed and edited using Geneious (version 6.0.3). Sequences were aligned with ClustalW and MUSCLE, and phylogenetic trees were constructed with neighbor-joining (p-distance, pairwise deletion, 1,000 bootstraps), maximum-likelihood (1,000 bootstraps), and Bayesian (MrBayes) algorithms. Models of evolution were selected using ModelTest, and a tree representing a consensus of the different methods is presented. Sequence identity (p-distance, pairwise deletion) was calculated in MEGA 5.
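As a rough illustration of the identity measure named above, the p-distance between two aligned sequences is the proportion of compared sites that differ, and pairwise deletion skips any site where either sequence of the pair has a gap. The sketch below is not MEGA's implementation; the function name and the choice to treat only "-" as a gap are assumptions.

```python
def p_distance(seq_a, seq_b, gap_chars="-"):
    """Proportion of differing sites between two aligned sequences.

    Pairwise deletion: a site is ignored when either sequence has a gap there.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # skip sites with missing data in this pair
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy alignment: 10 columns, 2 skipped because of gaps, 1 mismatch in 8 sites.
d = p_distance("ACGTACGT-A", "ACGAAC-TTA")
print(d)                       # 0.125
print(f"{(1 - d) * 100:.1f}")  # 87.5 (percent identity)
```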

Virus classification. For the purposes of this study, we avoided the use of taxonomic concepts such as species or genotype because of the variable criteria used for such distinctions (9) and because the degree of sequence conservation used to establish such distinctions can vary across the genome and may be affected by the relatively short sequence fragments generated in this study. We focused instead on collections of viral sequences that form distinct monophyletic clades within a particular family, and we considered a virus novel if the sequence identity to its closest relative is less than or equal to the identity between the two closest species for a given viral family. Due to the very large number of herpesvirus sequences identified in this study (n = 650), we used hierarchical clustering to segregate sequences for this particular family. To do this, we first extracted 598 polymerase sequences from published complete genomes (downloaded from NCBI on 14 September 2012) and combined them with the sequences generated in this study (total of 1,248 sequences). Coding sequences were translated and aligned using MUSCLE (version 3.8.31) (57) with the default settings. The nucleotide alignment was constructed by replacing each amino acid with the codon that gave rise to it. Columns containing gaps in more than 1,000 of the 1,248 sequences were removed. The genetic distance between HV species was subsequently established using the published sequences in the alignment only, as described previously (58), and a >7% nucleotide difference (Hamming distance) was used to define HV clusters. PgHV sequences were then segregated using hierarchical clustering, as implemented in the SciPy package (59) using average linkage clustering.
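The clustering step can be sketched in a few lines. The study used SciPy's implementation; the version below is a dependency-free stand-in that applies the same idea (average linkage on normalized Hamming distances, stopping once the closest pair of clusters differs by more than 7%). The function names and the toy sequences are hypothetical, and gap handling is omitted for brevity.

```python
from itertools import combinations


def hamming_fraction(a, b):
    """Fraction of aligned positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def average_linkage_clusters(seqs, threshold=0.07):
    """Agglomerative clustering with average linkage and a distance cutoff.

    Clusters are merged while the mean pairwise distance between the two
    closest clusters is at most `threshold`. Returns sorted lists of indices.
    """
    n = len(seqs)
    dist = {(i, j): hamming_fraction(seqs[i], seqs[j])
            for i, j in combinations(range(n), 2)}

    def mean_dist(c1, c2):
        total = sum(dist[min(i, j), max(i, j)] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        # Find the closest pair of current clusters.
        (a, b), d = min(
            ((pair, mean_dist(clusters[pair[0]], clusters[pair[1]]))
             for pair in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > threshold:
            break  # remaining clusters differ by more than the cutoff
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Toy 20-nt fragments: 0 and 1 differ at one site (5%), as do 2 and 3,
# while the two groups differ from each other at 25 to 30% of sites.
toy = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGTACGTACGA",
    "TTGTACGAACCTACGGACGT",
    "TTGTACGAACCTACGGACGA",
]
print(average_linkage_clusters(toy))  # [[0, 1], [2, 3]]
```

With SciPy available, the equivalent calls would be `pdist(..., metric="hamming")`, `linkage(..., method="average")`, and `fcluster(..., t=0.07, criterion="distance")` on numerically encoded sequences.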

Virus richness and sample estimation. We implemented models from the biodiversity literature that utilize incidence distributions to estimate virus richness (number of unique viruses) and, hence, to estimate the number of undetected viruses in the assemblage (60, 61) . Incidence data result where each virus detected in the assemblage is noted in each sample as either present (verified detection) or absent (not detected, which could result due to the virus being absent or being present but not detected by the test, i.e., false absence).

From our samples, we first constructed virus accumulation and rarefaction curves for visualization. The asymptote of the rarefaction curve provides the estimate of the number of viruses that characterizes the assemblage. However, sampling to reach this asymptote is impractical, as the number of samples required may be prohibitively large (61) . We thus used statistical methods to estimate the asymptote from the data at hand.
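A minimal sketch of a sample-based accumulation curve follows, assuming each sample is represented as the set of viruses detected in it. Averaging the curve over many random sample orderings gives a smoothed approximation of the expected (rarefaction) curve; the function names, the number of permutations, and the toy data are illustrative and not taken from the study.

```python
import random


def accumulation_curve(samples):
    """Cumulative number of distinct viruses after each successive sample."""
    seen = set()
    curve = []
    for detected in samples:
        seen |= detected
        curve.append(len(seen))
    return curve


def mean_accumulation(samples, n_perm=200, seed=0):
    """Average the accumulation curve over random orderings of the samples."""
    rng = random.Random(seed)
    order = list(samples)
    totals = [0] * len(order)
    for _ in range(n_perm):
        rng.shuffle(order)
        for k, value in enumerate(accumulation_curve(order)):
            totals[k] += value
    return [t / n_perm for t in totals]


# Toy incidence data: one set of detected viruses per sample (empty = negative).
toy = [{"PgHV-1"}, set(), {"PgHV-1", "PgAdV-2"}, {"PgCoV-3"}, set()]
print(accumulation_curve(toy))     # [1, 1, 2, 3, 3]
print(mean_accumulation(toy)[-1])  # 3.0
```

The final point of the averaged curve always equals the observed richness; it is the shape of the approach to that value that indicates whether sampling is nearing saturation.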

We used the nonparametric asymptotic estimator, Chao2 (15, 10) , and also calculated ICE and Jackknife statistics for comparison. Unlike conventional curve-fitting procedures, the nonparametric estimators make no assumptions of an underlying abundance distribution, do not require ad hoc or a priori model fitting, are relatively robust to spatial autocorrelation and scale, and frequently outperform other methods of richness estimation (61) . They rely on the principle that the frequencies of the rarest species in a set of samples can be used to estimate the frequencies of undetected species and provide a minimum richness estimate.
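For readers unfamiliar with Chao2, the sketch below computes it from incidence data using one common formulation: observed richness plus a correction based on the number of viruses detected in exactly one sample (Q1) and exactly two samples (Q2), with a small-sample factor (m − 1)/m. The function name and toy data are hypothetical, and the exact variant used by the fossil package in the study may differ in detail.

```python
from collections import Counter


def chao2(incidence):
    """Chao2 richness estimate from a list of samples (sets of detected viruses).

    S = S_obs + ((m - 1) / m) * Q1**2 / (2 * Q2)        if Q2 > 0
    S = S_obs + ((m - 1) / m) * Q1 * (Q1 - 1) / 2       if Q2 == 0
    where m is the number of samples, Q1 the viruses found in exactly one
    sample, and Q2 the viruses found in exactly two samples.
    """
    m = len(incidence)
    counts = Counter(v for sample in incidence for v in sample)
    s_obs = len(counts)
    q1 = sum(1 for c in counts.values() if c == 1)
    q2 = sum(1 for c in counts.values() if c == 2)
    factor = (m - 1) / m
    if q2 > 0:
        return s_obs + factor * q1 ** 2 / (2 * q2)
    return s_obs + factor * q1 * (q1 - 1) / 2


# Toy data: A is seen 3 times, B twice, and C, D, E once each.
toy = [{"A", "B"}, {"A"}, {"A", "C"}, {"B", "D"}, {"E"}, set()]
print(chao2(toy))  # 8.75
```

Because the correction term grows with the number of once-detected viruses, a high singleton-to-doubleton ratio pushes the estimate well above the observed count, which is why such data sets yield unstable estimates.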

All analyses were conducted with the fossil package (62) implemented in R (63). We followed Chao et al. (10) to calculate how many additional samples would be required to detect any proportion (including 100%) of the asymptotic virus richness. All statistics were incorporated into a single plot.

Cooccurrence. Patterns of association/disassociation were explored with the Fortran software program PAIRS (11) , utilizing the C score statistic as our measure of species cooccurrence. PAIRS implements a Bayesian approach (Bayes M criterion) to detect nonrandom associations between pairs of species (12) .
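The C score for a pair of viruses counts "checkerboard units": (r_i − S)(r_j − S), where r_i and r_j are the numbers of samples containing each virus and S is the number of samples containing both. The sketch below computes only this statistic; it does not reproduce the null-model randomization or the Bayes criterion that PAIRS applies, and the function name and toy data are hypothetical.

```python
def c_score_pair(samples_with_i, samples_with_j):
    """Checkerboard score for two viruses, given the sets of sample IDs
    in which each virus was detected."""
    shared = len(samples_with_i & samples_with_j)
    return (len(samples_with_i) - shared) * (len(samples_with_j) - shared)


# Toy occurrence data keyed by hypothetical virus labels.
occurrences = {
    "X": {0, 1, 2, 3},
    "Y": {2, 3, 4},
    "Z": {5, 6},
}
print(c_score_pair(occurrences["X"], occurrences["Y"]))  # 2
print(c_score_pair(occurrences["X"], occurrences["Z"]))  # 8
print(c_score_pair(occurrences["X"], occurrences["X"]))  # 0
```

A score that is high relative to the null expectation suggests segregation (a negative association), whereas a low score suggests aggregation; judging either requires comparison against randomized matrices.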

Assumptions and caveats. We considered the detection and discovery of viruses akin to the problem of detection and discovery of biodiversity, as is frequently the goal of ecological studies. The basic mechanism of species detection occurs from drawing samples by collection from some larger assemblage (61). In this context, our samples are as described above, urine, throat, fecal, or roost urine taken from an individual bat or bat roost, which represent the biomes for our assemblage of interest. These methods require the assemblage of viruses under sampling to be closed for valid inference, that is, that the assemblage size and composition remained stable throughout the course of the study, an assumption we felt was justified. Although each of these sample types targets a unique biome of potential viral habitat from the host species, each with potentially differing efficacy for detecting any given virus, for the purposes of our analyses, we considered each sample a random and equivalent draw from the assemblage of viruses associated with this host species. We also assumed sample independence, even though multiple samples (e.g., urine and throat) were often drawn from the same individual host and sampled bat populations are likely to be geographically nonrandom. The consequence of this sampling strategy is that our analysis is blind to this additional source of geographical variation and occasional pseudoreplication, which means our virus accumulation results are specific to our sampling methodology and our extrapolations assume ongoing sampling with a similar average composition of samples. The results of additional analyses in which we isolated sample types and individuals and considered geographic variation are not presented herein.

Nucleotide sequence accession numbers. The GenBank accession numbers for viruses discovered in this study are KC692400 to KC692452.

Supplemental material for this article may be found at http://mbio.asm.org/lookup/suppl/doi:10.1128/mBio.00598-13/-/DCSupplemental. Text S1, DOCX file, 0.1 MB. Figure S1, PDF file, 0.1 MB. Figure S2, PDF file, 0.3 MB. Table S1, PDF file, 0.6 MB.

Factors in the emergence of infectious diseases

Host range and emerging and reemerging pathogens

Global trends in emerging infectious diseases

Global habitat suitability models of terrestrial mammals

The changing faces of pathogen discovery and surveillance

Divergent lineage of a novel hantavirus in the banana pipistrelle (Neoromicia nanus

A distinct lineage of influenza A virus from bats

Hantavirus in bat

Virus taxonomy: classification and nomenclature of viruses. Ninth report of the International Committee on Taxonomy of Viruses

Sufficient sampling for asymptotic minimum species richness estimators

Pairs-a FORTRAN program for studying pair-wise species associations in ecological matrices

The empirical Bayes approach as a tool to identify non-random species associations

Temporal trends in the discovery of human viruses

Human viruses: discovery and emergence

Species estimation and applications

The completeness of taxonomic inventories for describing the global diversity and distribution of marine fishes

Biological diversity: frontiers in measurement and assessment

Prediction and prevention of the next pandemic zoonosis

Mammal species of the world

Henipavirus infection in fruit bats (Pteropus giganteus)

Identification of GBV-D, a novel GB-like flavivirus from old world frugivorous bats (Pteropus giganteus) in Bangladesh

GB virus type C: a beneficial infection?

Impact of HIV infection on the epidemiology of tuberculosis in a peri-urban community in South Africa: the need for age-specific interventions

Coinfection rates in φ6 bacteriophage are enhanced by virus-induced changes in host cells

Viral coinfections in children with invasive pneumococcal disease

Species interactions in a parasite community drive infection risk in a wildlife population

Culex flavivirus and West Nile virus mosquito coinfection and positive ecological association in Chicago, United States. Vector Borne Zoonotic Dis

Diseases and causes of death in European bats: dynamics in disease susceptibility and infection rates

Simultaneous infection of healthy people with multiple human cytomegalovirus strains

Variation within the glycoprotein B gene of human cytomegalovirus is due to homologous recombination

Recombination in alphaherpesviruses

Divergence and recombination of clinical herpes simplex virus type 2 isolates

A potentially fatal mix of herpes in zoos

Molecular characterization of varicella-zoster virus clinical isolates from 2006 to 2008 in a tertiary care hospital, Dublin, Ireland, using different genotyping methods

The genome of murine cytomegalovirus is shaped by purifying selection and extensive recombination

Intrauterine transmission of cytomegalovirus to infants of women with preconceptional immunity

Emergence of fatal avian influenza in New England harbor seals

Nipah virus-associated encephalitis outbreak, Siliguri

Outbreaks of Nipah virus in Rajbari and Manikgonj

Recurrent zoonotic transmission of Nipah virus into humans

Nipah outbreak in Faridpur District

Genomic characterization of Nipah virus

Date palm sap linked to Nipah virus outbreak in Bangladesh

Tioman virus, a novel paramyxovirus isolated from fruit bats in Malaysia

Serological evidence of possible human infection with Tioman virus, a newly described paramyxovirus of bat origin

On SARS type economic effects during infectious disease outbreaks. Policy research working paper WPS 4466

Investigating the role of bats in emerging zoonoses: balancing ecology, conservation and public health interests

Identification of a severe acute respiratory syndrome coronavirus-like virus in a leaf-nosed bat in Nigeria

Sensitive and broadly reactive reverse transcription-PCR assays to detect novel paramyxoviruses

Characterization of an outbreak of astroviral diarrhea in a group of cheetahs (Acinonyx jubatus)

Detection and analysis of six lizard adenoviruses by consensus primer PCR provides further evidence of a reptilian origin for the atadenoviruses

Novel polyomavirus detected in the feces of a chimpanzee by nested broad-spectrum PCR

Identification and characterization of a new bocavirus species in gorillas

Detection and analysis of diverse herpesviral species by consensus primer PCR

Identification of Dobrava, Hantaan, Seoul, and Puumala viruses by one-step real-time RT-PCR

MUSCLE: multiple sequence alignment with high accuracy and high throughput

A proposal for new criteria for the classification of hantaviruses, based on S and M segment protein sequences

SciPy: open source scientific tools for Python

Quantifying biodiversity: procedures and pitfalls in the measurement and comparison of species richness

Estimating species richness

Package "fossil"

R: a language and environment for statistical computing. R Foundation for Statistical Computing

People, pathogens and our planet