key: cord-0880250-fz96y65d
authors: Cummings, Matthew J; Tokarz, Rafal; Bakamutumaho, Barnabas; Kayiwa, John; Byaruhanga, Timothy; Owor, Nicholas; Namagambo, Barbara; Wolf, Allison; Mathema, Barun; Lutwama, Julius J; Schluger, Neil W; Lipkin, W Ian; O’Donnell, Max R
title: Precision Surveillance for Viral Respiratory Pathogens: Virome Capture Sequencing for the Detection and Genomic Characterization of Severe Acute Respiratory Infection in Uganda
date: 2018-08-07
journal: Clinical Infectious Diseases
DOI: 10.1093/cid/ciy656
sha: d259dce085930f831390ee1f66b1b33504611b2a
doc_id: 880250
cord_uid: fz96y65d

BACKGROUND: Precision public health is a novel set of methods to target disease prevention and mitigation interventions to high-risk subpopulations. We applied a precision public health strategy to syndromic surveillance for severe acute respiratory infection (SARI) in Uganda by combining spatiotemporal analytics with genomic sequencing to detect and characterize viral respiratory pathogens with epidemic potential.

METHODS: Using a national surveillance network we identified patients with unexplained, influenza-negative SARI from 2010 to 2015. Spatiotemporal analyses were performed retrospectively to identify clusters of unexplained SARI. Within clusters, respiratory viruses were detected and characterized in naso- and oropharyngeal swab samples using a novel oligonucleotide probe capture (VirCapSeq-VERT) and high-throughput sequencing platform. Linkage to conventional epidemiologic strategies further characterized transmission dynamics of identified pathogens.

RESULTS: Among 2901 unexplained SARI cases, 9 clusters were detected, accounting for 301 (10.4%) cases. Clusters were more likely to occur in urban areas and during biannual rainy seasons. Within detected clusters, we identified an unrecognized outbreak of measles-associated SARI; sequence analysis implicated cocirculation of endemic genotype B3 and genotype D4 likely imported from England.
We also detected a likely nosocomial SARI cluster associated with a novel picobirnavirus most closely related to swine and dromedary viruses.

CONCLUSIONS: Using a precision approach to public health surveillance, we detected and characterized the genomics of vaccine-preventable and zoonotic respiratory viruses associated with clusters of severe respiratory infections in Uganda. Future studies are needed to assess the feasibility, scalability, and impact of applying similar approaches during real-time public health surveillance in low-income settings.

Recent decades have seen advances in the global control of infectious diseases, such as human immunodeficiency virus (HIV)/AIDS and malaria [1]. However, acute lower respiratory tract infections remain responsible for nearly 3 million annual deaths worldwide, predominantly in children and the elderly, without a substantial decrease in estimated mortality over the past 2 decades [2]. In addition, outbreaks of severe acute respiratory infection (SARI) associated with novel viruses (such as avian influenza viruses and the Middle East respiratory syndrome [MERS] and severe acute respiratory syndrome [SARS] coronaviruses) and periodic outbreaks caused by vaccine-preventable viruses (such as measles viruses) continue to threaten global health security [3]. In many circumstances, traditional public health surveillance systems in low- and middle-income countries have failed to detect or control these outbreaks, because they are not designed to capture high-resolution data on subpopulations of interest and may lack capacity to rapidly characterize highly divergent or novel pathogens [4].

In sub-Saharan Africa, a region accounting for nearly half of global mortality from acute respiratory infections, public health programs have historically relied on data collected from population-based and sentinel surveillance systems [5] [6] [7] [8].
For syndromic SARI surveillance, such programs accumulate incidence data from multiple sources across a catchment area, after which they use a variety of nonstatistical methods to detect a public health signal of concern. These signal detection methods, challenged by difficulties adjusting for natural temporal and geographical variation and adequate population-at-risk data, are limited by poor specificity and imprecise spatiotemporal resolution [5] [6] [7] [8] . Following signal detection, molecular diagnostics such as agent-specific or multiplexed polymerase chain reaction (PCR) may be employed, but these are restricted by limited numbers of target sequences and lack genomic characterization and depth needed to understand complex transmission dynamics. Precision surveillance applies spatiotemporal analytics to routinely collected surveillance data to identify loci with increased risk of disease (ie, SARI) incidence, and combines this data with genomic sequencing to provide sensitive detection and high-resolution characterization of circulating pathogens [8] [9] [10] [11] . This approach could allow for rapid detection and characterization of epidemic-prone respiratory viruses prior to large-scale emergence, identification of populations at risk for severe outcomes, and targeted investment of public health resources (ie, vaccination) among heavily affected subpopulations [8] [9] [10] [11] . Here we present proof-of-concept application of a precision approach to public health surveillance. Using data from a national SARI surveillance network, we retrospectively applied spatiotemporal analysis to identify clusters of unexplained, influenza-negative SARI in Uganda from 2010 to 2015. We combined this approach with a minimally biased viral oligonucleotide probe capture and high-throughput sequencing platform to detect and characterize circulating viral respiratory pathogens associated with these clusters. 
The Uganda Virus Research Institute (UVRI) conducts prospective SARI surveillance at 8 sentinel site hospitals nationwide. Geographically diverse sentinel sites and SARI case definitions were chosen in accordance with World Health Organization (WHO) protocols [12, 13] . Eligible cases were patients age ≥2 months presenting to surveillance sites who met SARI case definitions. For patients aged 2 months to <5 years, SARI was defined as a syndrome of acute onset of difficulty breathing or cough within 10 days of symptom onset plus an additional clinical indicator of respiratory distress and disease severity sufficient to lead to hospitalization. For patients aged ≥5 years, SARI was defined as a syndrome of fever (measured or subjective) plus cough or shortness of breath within 10 days of symptom onset and disease severity sufficient to lead to hospitalization [13] . At each sentinel site, clinicians interviewed SARI cases, collected clinical and demographic data (including home address), and completed a study questionnaire. Nasopharyngeal and/or oropharyngeal swab samples were collected from each patient and shipped to UVRI using an established protocol [12] . For all SARI cases, testing was initially done for influenza A and B viruses using real-time reverse transcription polymerase chain reaction (RT-PCR) with primers provided by US Centers for Disease Control and Prevention [12] . As we have previously reported on the spatiotemporal dynamics of influenza in Uganda, for the purposes of this study, only patients with SARI of unexplained etiology (ie, influenza A and B negative by RT-PCR) were included in the analysis [14] . As a sampling strategy to identify potential high-risk loci of viral respiratory pathogen circulation, we used the space-time permutation statistic included in the SaTScan software package (version 9.4.2, Boston, MA, USA) to detect spatiotemporal clusters with greater than expected numbers of SARI cases [15] . 
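As a rough illustration of what "greater than expected" means under a space-time permutation model, the sketch below scores one candidate cylinder and estimates a Monte Carlo P value by shuffling onset times across case locations. It is not SaTScan's implementation and not the study's analysis code. The names `expected_count`, `log_likelihood_ratio`, and `monte_carlo_p`, the use of discrete zone labels in place of circular windows around geocoded addresses, and the toy data are all assumptions made for illustration. A real scan also maximizes the statistic over many cylinders in every replicate, whereas this sketch evaluates a single fixed cylinder.

```python
import math
import random
from collections import Counter


def expected_count(cases, zones, days):
    """Expected cases in a cylinder under the space-time permutation model.

    cases: list of (zone, day) pairs, one per case (hypothetical format).
    The expectation for each (zone, day) cell is
    (cases in zone) * (cases on day) / (total cases), summed over the cylinder.
    """
    total = len(cases)
    by_zone = Counter(z for z, _ in cases)
    by_day = Counter(d for _, d in cases)
    return sum(by_zone[z] * by_day[d] for z in zones for d in days) / total


def log_likelihood_ratio(observed, expected, total):
    """Log of the generalized likelihood ratio; 0 unless there is an excess."""
    if observed <= expected:
        return 0.0
    llr = observed * math.log(observed / expected)
    if total > observed:
        llr += (total - observed) * math.log((total - observed) / (total - expected))
    return llr


def monte_carlo_p(cases, zones, days, n_rep=999, seed=0):
    """Rank-based P value for ONE fixed cylinder (a simplification)."""
    rng = random.Random(seed)
    total = len(cases)
    # Marginal counts are unchanged by shuffling, so the expectation is fixed.
    mu = expected_count(cases, zones, days)

    def stat(pairs):
        obs = sum(1 for z, d in pairs if z in zones and d in days)
        return log_likelihood_ratio(obs, mu, total)

    observed_stat = stat(cases)
    locations = [z for z, _ in cases]
    times = [d for _, d in cases]
    exceed = 0
    for _ in range(n_rep):
        rng.shuffle(times)  # break any space-time link, keep both marginals
        if stat(list(zip(locations, times))) >= observed_stat:
            exceed += 1
    return (exceed + 1) / (n_rep + 1)


# Toy data (invented): zone "A" has most of its cases on day 1.
toy_cases = [("A", 1)] * 6 + [("A", 5)] + [("B", 1)] + [("B", 5)] * 6
toy_mu = expected_count(toy_cases, {"A"}, {1})
toy_llr = log_likelihood_ratio(6, toy_mu, len(toy_cases))
toy_p = monte_carlo_p(toy_cases, {"A"}, {1})
print(toy_mu, toy_llr, toy_p)
```

Because the expectation is built only from the observed case counts by place and by time, this kind of statistic does not require population-at-risk denominators, which is one reason it suits routine surveillance data.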
Originating at the location of each SARI case (geocoded according to self-reported home address at the time of SARI illness onset), the statistic applies a likelihood function to cylindrical spatiotemporal scanning windows with dimensions of increasing distance and time period, each of which represents a potential cluster [15]. Using Monte Carlo hypothesis testing, the statistic compares expected versus observed case counts inside and outside the scan windows to detect clusters that are least likely to have occurred by chance [15]. Expected case counts are determined using data aggregated across the entire data set, both before and after the detected cluster [15]. The maximum spatial and temporal event magnitudes were set as 30 kilometers and 60 days, respectively. Clusters were considered significant at P ≤ .05. Additional methodologic detail is included in the Supplementary Material.

Naso- and/or oropharyngeal swab samples from SARI cases with illness onset during spatiotemporal clusters were analyzed at Columbia University using VirCapSeq-VERT [16]. Using a library of approximately 2 million 50-mer to 100-mer nucleotide-soluble biotinylated probes that cover genomes of all 207 known vertebrate viral taxa, VirCapSeq-VERT positively selects and enriches viral sequences for high-throughput sequencing via probe hybridization [16]. Briefly, following nucleic acid extraction, sequencing libraries were prepared by including the viral sequence capture step within the standard KAPA protocol [16]. Sequencing was performed on the HiSeq 4000 platform (Illumina) and resulted in an average of 400 million reads per lane. Host-filtered reads were assembled de novo using the MIRA assembler (version 4.0); contigs and unique singletons were subjected to homology search using MegaBlast against the GenBank nucleotide database. Sequences that showed poor or no homology at the nucleotide level were screened by BLASTX against the viral GenBank protein database.
Based on the identified contigs, GenBank sequences were downloaded and used for mapping the whole data set to recover partial or complete genomes. Phylogenetic analyses were done using Mega7 (version 7.0) [17]. Additional methodologic detail is included in the Supplementary Material.

Fisher exact or χ² tests were used to compare categorical variables and medians were compared using the Mann-Whitney U test. Univariable and multivariable logistic regression models were used to calculate odds ratios and 95% confidence intervals (SPSS version 23.0, IBM).

In the course of routine public health surveillance, verbal consent was obtained from SARI cases ≥18 years and from parents or legal guardians for patients <18 years. This study was approved by institutional review boards at UVRI, the Uganda National Council for Science and Technology, and Columbia University.

October 2010 and June 2015, 335 were influenza-positive (8.5%) [14]. Among 3586 influenza-negative SARI cases, 2901 had spatial and temporal data available for analysis. Among 2901 influenza-negative SARI cases, spatiotemporal analysis identified 9 statistically significant clusters accounting for 10.4% (301/2901) of unexplained SARI cases (Tables 1 and 2). The median size, radius, and duration of these clusters were 30 cases (interquartile range [IQR] 21-44 cases), 13 kilometers (IQR 9.5-16.7), and 38 days (IQR 23-58 days), respectively. The largest clusters occurred in Wakiso, central Uganda, from March to May 2012 and Tororo, eastern Uganda, from July to August 2013 (Table 2, Figure 1). SARI cases linked to clusters were more likely to reside in urban areas (P < .001) and be HIV-infected (P < .001).

Of 301 SARI cases associated with cluster events, nasopharyngeal and/or oropharyngeal swab samples from 199 (66.1%) cases were available to undergo analysis with VirCapSeq-VERT. Due to low nucleic acid quality, 22 samples were omitted from further analysis.
Therefore, high-throughput sequencing was performed on samples from 176 patients (Table 2). Among 176 sequenced samples, we detected genetic evidence of respiratory viruses in 144 samples (81.8%) (Table 2). Among these, coinfection with ≥2 viruses was identified in 73 (50.7%). Human rhinoviruses, cytomegalovirus (CMV), respiratory syncytial viruses (RSV), measles virus, and human parainfluenza viruses (HPIV) were the most commonly detected pathogens, accounting for 34.5%, 26.9%, 17.9%, 12.4%, and 11.7% of virus-positive samples, respectively (Table 2 and Supplementary Figure S1). Detections of these cluster-associated viruses coincided with recurrent peaks in national SARI incidence from 2010 to 2015 (Figure 2), with intercluster periods dominated by circulation of influenza viruses [14].

Within a SARI cluster in Wakiso, central Uganda, we identified a previously unrecognized outbreak of severe measles. Of the 18 cases infected with measles, 15 (83.3%) were infants or young children (median age 1 year [IQR 9 months to 5 years]). Measles was the sole virus detected in 11 cases (61.1%). There were 3 cases coinfected with CMV, 2 with HPIV, and 1 each with RSV-A and rubella virus. One patient (age 2 months) who was coinfected with RSV-A and CMV died. Phylogenetic analysis, based on a 451-nucleotide fragment of the nucleocapsid gene (coordinates 1126-1574, accession number NC001498), revealed that sequences from 13 cases were consistent with genotype B3, the genotype endemic in East Africa [18]. Five samples contained measles virus genotype D4, which was thought to have been eliminated from East Africa in 2009 [18]. The 5 D4 samples were identical (100% nucleotide homology) to sequences identified during a measles outbreak in Manchester, England, in 2011 (GenBank accession number, KT732227) [19].
Epidemiologic investigations revealed that a 35-year-old, unvaccinated British tourist had traveled to the central region of Uganda in January 2012 shortly after developing a febrile illness with rash. This individual was found to have measles infection associated with genotype D4 upon return to England (Dr. Bakamutumaho, personal communication).

Within a SARI cluster in Wakiso, central Uganda, we identified a hospital-based cluster of SARI associated with a novel picobirnavirus (PBV), a double-stranded, bisegmented RNA virus. The first case was a 43-year-old female farmer who was hospitalized March 22nd. The second case was a 40-year-old HIV-infected male who worked in the clinical laboratory of the district hospital where the first case was cared for. He was hospitalized March 26th. Both cases survived, and neither had gastrointestinal symptoms. Assembly of sequencing reads from both cases revealed infection with PBV most closely related to PBV previously identified in swine and dromedary camels (GenBank accession numbers, ARK08222.1 and AIY31287.1) [20]. Alignment of a 675-nucleotide fragment of the polymerase gene of each identified PBV showed that sequences were 88.7% homologous; on an amino acid level they were 98.6% identical (3 amino acid differences out of 224). During epidemiologic investigations, both patients denied contact with sick animals.

Interrupting transmission through early recognition of cases and identification of high-risk subpopulations has long been recognized as a critical step to stopping outbreaks of infectious diseases [21, 22]. Precision surveillance enhances this approach by applying spatiotemporal analytics to routinely collected surveillance data and combines these methods with genomic sequencing to enhance targeted investigation and control of infectious diseases [8, 9].
As associated costs continue to decrease logarithmically, it is likely that genomic sequencing will become part of routine surveillance in the near future. In our study, we combined spatiotemporal analytics with a viral oligonucleotide probe capture and high-throughput sequencing platform. This strategy allowed us to detect and characterize zoonotic and vaccine-preventable viral pathogens within specific subpopulations in Uganda. Although retrospective, our precision surveillance strategy provides proof-of-concept that such an approach can identify and localize previously unrecognized and epidemic-prone pathogens with high genomic resolution [21, 22] . Such a strategy, applied in real-time to routinely collected surveillance data, may enable targeted deployment of vaccination resources to mitigate outbreaks of preventable viral infections, trigger investigation of unusual clusters of novel or highly divergent pathogens, facilitate implementation of infection control measures, and direct surveillance activities to "hotspots" from which outbreaks are most likely to emerge. Over the past 2 decades, high-throughput sequencing platforms have bolstered clinical and public health diagnostics by facilitating minimally biased pathogen detection as well as high-resolution characterization of viral genomics [23] . However, these platforms have been challenged by limitations in sensitivity of pathogen sequencing in real-world clinical samples with highly abundant host and limited microbiological genetic material [16, 23] . Using a novel, minimally biased oligonucleotide probe capture system to positively select and enrich viral sequences among clinical samples, we detected respiratory viruses in nearly 82% of cases and viral coinfection in over 50% of cases [16] . 
Although pathogens associated with many clusters (human rhinovirus, RSV, and HPIV) could have been detected using multiplexed PCR platforms, use of VirCapSeq-VERT resulted in detection of highly contagious yet vaccine-preventable pathogens (measles, rubella) that would not have been detected using more biased diagnostics. Given advantages in detection yield and impactful genomic analyses, VirCapSeq-VERT should be a powerful molecular tool for global virologic surveillance moving forward. Considering ease-of-use (inclusion of viral sequence capture in standard library preparation and nucleic acid enrichment protocols) and sample cost ($40 in a 20-plex sample format), scalability to national reference laboratories in low-income settings is also realistic [16] . We detected a previously unrecognized outbreak of measles-associated SARI in a distinct region of central Uganda. Despite its elimination from the Americas, our results show that measles remains an important pathogen associated with outbreaks of severe respiratory infections in under-immunized settings such as Uganda, where national measles vaccination coverage ranged from 73% to 82% during the study period [24] . Given the potential benefits of high-dose vitamin A supplementation, heightened clinician awareness for measles, as well as enhanced infection control, should be encouraged among patients presenting with severe respiratory infections in endemic areas [25] . On a global scale, our findings suggest that genotype D4 measles virus circulating in Uganda was likely imported from England, where suboptimal rates of immunization have resulted in recurrent measles outbreaks [26, 27] . We detected a nosocomial cluster of SARI associated with a novel and likely zoonotic picobirnavirus (PBV). 
Although PBV, a double-stranded RNA virus with a bisegmented genome, has been identified in fecal samples from humans, this represents the first report to our knowledge of PBV infection in Africa and the first report of human infection with this highly divergent PBV [28] . Zoonotic spillover from wild or domestic animals to a farmer and subsequent nosocomial transmission to an immunocompromised laboratory worker is a plausible epidemiologic pathway based on viral sequence and epidemiologic data. Given the array of animal species in which PBV has been detected (poultry, pigs, dogs, monkeys, camels, and snakes), enhanced surveillance is needed to better define the host range, transmissibility, and pathogenicity of this potentially emerging, zoonotic virus [28] . Demographically, SARI cases associated with spatiotemporal clusters were more likely to reside in urban areas with higher population density [29] . This signal is consistent with data suggesting that human population density and increasing population growth in urban areas are significant predictors of infectious disease emergence worldwide [22, 30] . Environmentally, two thirds of SARI clusters originated during rainy season months and districts located within these clusters had higher annual rainfall. As changes to global climate dynamics are likely to impact tropical environments substantially, continued evaluation of the impact of climate variables on the circulation of high-impact respiratory viruses is needed [31] . We detected cytomegalovirus (CMV) in over 25% of virus-positive samples in a largely pediatric population with low reported prevalence of HIV-infection. This is consistent with data reported from Zambia, in which CMV was the most commonly identified virus in upper respiratory tract samples from a large cohort of children with SARI [32] . 
Although we cannot establish disease causality, given reports of severe CMV-associated respiratory infections among HIV-uninfected children in low- and middle-income countries, further investigation of the epidemiology of CMV-associated SARI in these settings is warranted [33, 34].

This study had limitations. First, our study, which should be viewed as proof-of-concept, was conducted retrospectively using previously collected surveillance data and laboratory samples. Although our spatiotemporal strategy has shown similar promise to enhance infectious disease surveillance in high- and low-income settings, future studies are needed to assess the real-time performance of our approach in the context of prospective surveillance in a low-income setting [35] [36] [37]. Second, our viral detection strategy relied on upper respiratory tract samples that may reflect carriage and not lower respiratory tract infection. Third, we did not sequence samples from cases occurring outside of spatiotemporally defined clusters. Thus, we cannot compare the frequency and characteristics of viruses in nonclustered participants, which may be similar to those detected using our sampling approach. Fourth, our cluster detection strategy targeting noninfluenza-associated SARI was limited by the parameters of our spatiotemporal modeling methods. As most clusters lacked a dominant circulating virus, our approach may have lacked specificity to detect clusters associated with a single circulating pathogen. It is also possible that increased SARI incidence within clusters may have been driven by environmental or epidemiologic factors independent of pathogen transmission. Further work is needed to identify alternative spatiotemporal surveillance strategies that may provide greater pathogen specificity, both for influenza and non-influenza respiratory viruses. Finally, our clinical outcome analyses were limited to a subset of patients, and we lacked detailed clinical data among many cases.
Our study demonstrates that precision surveillance strategies can enhance detection and characterization of previously unrecognized, epidemic-prone viral respiratory pathogens. Continued development and evaluation of similar approaches are needed to enhance targeted delivery of public health resources in low-income settings.

Supplementary materials are available at Clinical Infectious Diseases online. Consisting of data provided by the authors to benefit the reader, the posted materials are not copyedited and are the sole responsibility of the authors, so questions or comments should be addressed to the corresponding author.

Accelerating progress on HIV, tuberculosis, malaria, hepatitis, and neglected tropical diseases. Geneva: World Health Organization
Estimates of the global, regional, and national morbidity, mortality, and aetiologies of lower respiratory tract infections in 195 countries: a systematic analysis for the Global Burden of Disease Study
Emerging respiratory tract infections
Traditional and syndromic surveillance of infectious diseases and pathogens
Acute respiratory infections
Estimates of world-wide distribution of child deaths from acute respiratory infections
Public health surveillance: a tool for targeting and monitoring interventions
Four steps to precision public health
Precision public health for the era of precision medicine
Next-generation sequencing for infectious disease diagnosis and management: a report of the association for molecular pathology
Detecting disease outbreaks using local spatiotemporal methods
Clinic- and hospital-based sentinel influenza surveillance
World Health Organization. WHO Global epidemiological surveillance standards for influenza. Geneva: World Health Organization
Epidemiologic and spatiotemporal characterization of influenza and severe acute respiratory infection in Uganda
A space-time permutation scan statistic for disease outbreak detection
Virome capture sequencing enables sensitive viral diagnosis and comprehensive virome analysis
MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets
Possible interruption of measles virus transmission
Assessment of the utility of whole genome sequencing of measles virus in the characterisation of outbreaks
Metagenomic analysis of viromes of dromedary camel fecal samples reveals large number and high diversity of circoviruses and picobirnaviruses
Emerging infectious diseases and pandemic potential: status quo and reducing risk of global spread
Global trends in emerging infectious diseases
Microbe hunting
United Nations Children's Fund. UNICEF and WHO estimates of national immunization coverage-Uganda
Vitamin A for treating measles in children
The state of measles and rubella in the WHO European Region
Mounting a good offense against measles
Picobirnavirus infections: viral persistence and zoonotic potential
Avian influenza H5N1 in Africa: an epidemiological twist
Epidemiology and seasonality of respiratory tract virus infections in the tropics
Characterization of cytomegalovirus lung infection in non-HIV infected children
Severe cytomegalovirus infection in apparently immunocompetent patients: a systematic review
Effect of ganciclovir for the treatment of severe cytomegalovirus-associated pneumonia in children without a specific immunocompromised state
Daily reportable disease spatiotemporal cluster detection
Faster detection of poliomyelitis outbreaks to support polio eradication
Using the SaTScan method to detect local malaria clusters for guiding malaria control programmes

Acknowledgments. The authors thank sentinel site clinicians for their assistance with data and sample collection.