key: cord-283880-lrrkuist authors: Kumar, Arvind; Murthy, Satyapramod; Kapoor, Amit title: Evolution of selective-sequencing approaches for virus discovery and virome analysis date: 2017-07-15 journal: Virus Research DOI: 10.1016/j.virusres.2017.06.005 sha: doc_id: 283880 cord_uid: lrrkuist
Abstract
Recent advances in sequencing technologies have transformed the field of virus discovery and virome analysis. Virus discovery, once mostly confined to traditional Sanger sequencing of individual viruses, is now entirely replaced by high throughput sequencing (HTS) based virus metagenomics that can be used to characterize the nature and composition of entire viromes. To better harness the potential of HTS for the study of viromes, sample preparation methodologies use different approaches to exclude amplification of non-viral components that can overshadow low-titer viruses. These virus-sequence enrichment approaches mostly focus on the sample preparation stage, for example enzymatic digestion of non-viral nucleic acids and size exclusion of non-viral constituents by column filtration, ultrafiltration or density gradient centrifugation. Recently, however, a new virus-sequence enrichment approach called virome-capture sequencing, focused on the amplification or HTS library preparation stage, was developed to improve virome characterization. This new approach has the potential to further transform the field of virus discovery and virome analysis, but its technical complexity and sequence-dependence warrant further improvements. In this review we discuss the different methods for selective sequencing based virome analysis, their applications and evolution, and also propose refinements needed to harness the full potential of HTS for virome analysis.
Despite being the simplest of biological entities, viruses play an enormously important and complex role in human and animal health and environmental ecology, and are also known to shape the evolution of their hosts (Greenbaum and Ghedin, 2015; Koonin and Dolja, 2013; Koonin et al., 2015; Koonin and Wolf, 2012; Mager and Stoye, 2015). Virus discovery during most of the previous century followed a traditional approach of virus isolation and amplification in cell culture or animal models (Leland and Ginocchio, 2007; Palacios and Oberste, 2005). These virus isolates were then classified based on their morphological and serological properties (Gelderblom, 1996; Muir et al., 1998). The widely used cell culture isolation of viruses failed to identify viruses that are refractory to growth in vitro, like hepatitis C virus (HCV) (Choo et al., 1989). Moreover, traditional approaches including serological assays and PCR were very specific or focused on a group of viruses. The inherent dependence of these approaches on the biological properties or sequences of known viruses resulted in our limited knowledge of the virus world (Koonin, 2010; Shi et al., 2015). Use of sequence dependent (i.e., generic PCR assays and microarray) and sequence independent (i.e., sequence-independent single primer amplification (SISPA) and random priming) approaches for nucleic acid amplification, combined with Sanger sequencing or HTS, allowed the rapid identification of new viruses after 1980 (Bishop-Lilly et al., 2010; Chang et al., 1994; Day et al., 2010; Grard et al., 2012; Kapoor et al., 2015; Ladner et al., 2016; Linnen et al., 1996; Matsui et al., 1991; Mokili et al., 2012; Muerhoff et al., 1997; Nichol et al., 1993; Qin et al., 2014; Quan et al., 2010; Simons et al., 1995b) (Fig. 1). In the SISPA method, a primer-binding sequence (adapter) is ligated to both ends of restriction enzyme digested DNA fragments, allowing their subsequent amplification using a single primer.
Random PCR uses a slightly different approach where the primer-binding sequence is annealed to the templates using an oligo containing a mixture of all four nucleotides at the terminal 6-9 positions of the 3' end. However, the comprehensive identification of all viruses in a given sample poses unique challenges due to the tiny and complex nature of viral genomes. Additionally, the absence of conserved anchor sequences similar to the bacterial or eukaryotic ribosomal sequences prohibits pan-amplification and sequencing of viruses (Kapoor and Lipkin, 2014) [...] the viral sequences or reduce non-viral sequences while remaining unbiased towards the nature and composition of the virus genome.
The term metagenomics can be defined as the study of genetic materials recovered from environmental samples (Handelsman et al., 1998). The culture-independent nature of this approach allowed the discovery of unprecedented microbial diversity that had been underestimated by culture-dependent approaches. The useful information gained from bacterial metagenomics studies ignited the interest of marine virologists in adapting this approach for viral metagenomics (Breitbart et al., 2002). Soon after, several groups used this approach, or its modified forms, to identify human and animal viruses. The early use of the metagenomics approach for human and animal clinical samples was very successful. New viruses were identified from in vitro culture supernatants, respiratory secretions, stool suspensions, urine and other body fluids (Allander et al., 2001; Allander et al., 2005; Jones et al., 2005; Kapoor et al., 2008; Victoria et al., 2008). Metagenomics remains the most widely used approach for new virus identification and virome analysis to date. Here we discuss the use and limitations of virus metagenomics and of recently used selective sequencing based virome analysis, and provide a perspective on refinements of virus metagenomics based approaches.
Simple dideoxy or Sanger sequencing was not well suited to viral metagenomics but was the only option available before 2005. The process involved plasmid cloning of fragmented or amplified nucleic acids from environmental or clinical samples followed by sequencing of individual clones. Technical and cost constraints allowed sequencing of only a few hundred clones in most of these early studies, but even this small scale sequencing revealed the presence of vast diversity among viral communities and also led to the identification of several human and animal viruses from clinical samples (Allander et al., 2001; Allander et al., 2005; Chang et al., 1994; Choo et al., 1989; Kapoor et al., 2009; Kapoor et al., 2008; Linnen et al., 1996; Matsui et al., 1991; Muerhoff et al., 1997; Nichol et al., 1993; Nishizawa et al., 1997; van der Hoek et al., 2004; Victoria et al., 2008). These studies highlighted the importance of adopting more efficient sequencing technologies for viral metagenomics. The sections below briefly describe different sequencing platforms and their advantages and disadvantages for virus metagenomics based virus identification and virome analysis.
454 pyrosequencing was the first HTS technology used for metagenomics based virus discovery. The principle of the 454 technology (later purchased by Roche) is sequencing-by-synthesis chemistry in which DNA molecules are amplified through an emulsion PCR, generating multiple clones of DNA from a single template. Pyrophosphates released during base incorporation are enzymatically converted to a light signal which can be detected using a charge-coupled device (CCD) camera (Droege and Hill, 2008).
454 sequencing was widely used to identify several new viruses and virome profiles from human and animal samples (Day et al., 2010) including arboviruses (Bishop-Lilly et al., 2010), orbiviruses, arenaviruses (Palacios et al., 2008), Lujo virus (Briese et al., 2009), astrovirus (Quan et al., 2010), gyroviruses (Phan et al., 2012), porcine bocaviruses, picornaviruses (Boros et al., 2012), rhabdoviruses (Grard et al., 2012), coronaviruses (Honkavuori et al., 2014), gammapapillomavirus (Phan et al., 2013) and seadornavirus (Reuter et al., 2013). Most of these viruses were identified from serum, respiratory and fecal samples. Some studies used this technology to identify viruses from tissue and organ samples: canine adenovirus 1 (van der Heijden et al., 2012), bat hepacivirus and pegivirus (Quan et al., 2013). This method generally allows sequencing of 500,000 to 5 million individual reads that can be between 100 and 450 nucleotides long. Although this technology offered a higher yield than Sanger sequencing at a lower cost, it has been supplanted by other NGS technologies due to its comparatively high cost, its error rate in homopolymeric regions and its low throughput.
Fig. 1 (caption fragment): ... and positive (viral probes) selection method using hybridization based capture and selective enrichment of virus nucleic acids, followed by amplification and sequencing. Sequencing data analysis requires bioinformatics based virus discovery or virome analysis and experimental validation of results.
A. Kumar et al. Virus Research 239 (2017) 172-179
Life Technologies/Thermo Fisher Scientific released the Ion Personal Genome Machine (PGM) sequencer platform, based on pH-mediated or semiconductor sequencing technology, in 2010, followed by the Ion Proton (2012) and Ion S5 series (2015). These platforms are conceptually similar to the 454 pyrosequencing platform in template preparation and sequencing steps.
Adapter-ligated DNA fragments are amplified by emulsion PCR on the surface of beads, which are distributed into microwells where a sequencing-by-synthesis reaction occurs. Protons released during nucleotide incorporation are detected using an ion sensor, which can measure slight shifts in pH (Merriman et al., 2012; Reuter et al., 2015). Rapid sequencing runs make these sequencers particularly useful for the targeted detection of viruses in clinical samples, like HIV (Archer et al., 2012; Chang et al., 2013; Gibson et al., 2014), hepatitis B virus (Yan et al., 2015) and HCV (Gaspareto et al., 2016; Marascio et al., 2016), and for the rapid genome sequencing of several viruses including Toscana virus (Nougairede et al., 2013), polyomavirus, porcine reproductive and respiratory syndrome virus (Kvisgaard et al., 2013), orthoreovirus (Steyer et al., 2013), bluetongue virus (Lorusso et al., 2014), rotavirus (Ndze et al., 2014; Nyaga et al., 2014), influenza virus (Van den Hoecke et al., 2015) etc. Although some studies used this technology to study viromes in skin (Bzhalava et al., 2013), ticks (Tokarz et al., 2014; Xia et al., 2015), the gut of piglets (Karlsson et al., 2016) and seals (Kluge et al., 2016), this platform is not the ideal choice for virome studies of human clinical samples due to its lower output.
Solexa/Illumina introduced a high-throughput and less expensive platform, the Genome Analyzer II, in 2006, followed by MiSeq (2011), NextSeq 500 (2014) and the HiSeq series (2012, 2013, 2014). The principle of Illumina sequencing involves reversible-termination sequencing by synthesis (SBS) with fluorescently labelled nucleotides (Liu et al., 2012).
Introduction of these platforms accelerated the rate of virus discovery in humans and animals, including severe fever with thrombocytopenia syndrome virus (Xu et al., 2011), Bas-Congo virus (Grard et al., 2012), titi monkey adenovirus, canine bocavirus 3, snake arenaviruses (Stenglein et al., 2012), human polyomavirus 9 (Sauvage et al., 2011), simian adenovirus C, Theiler's disease associated virus (Chandriani et al., 2013), human hepegivirus 1, Jingmen tick virus (Qin et al., 2014), Guaico Culex virus (Ladner et al., 2016), protoparvovirus (Phan et al., 2016), Marmota himalayana hepatovirus (Yu et al., 2016) etc. High throughput and low error rates (below 1%, mainly substitutions) are the main reasons that Illumina technologies have dominated the viral discovery field in the past several years.
Pacific Biosciences commercialized a single-molecule real-time (SMRT) sequencing platform known as RS II in 2010, followed by Sequel (2015), for the generation of long reads (average 10-50 kb) without clonal amplification bias. Hairpin adapters are ligated to the ends of template DNA molecules to generate capped templates (SMRTbell). A single DNA polymerase enzyme is affixed at the bottom of each zero-mode waveguide (ZMW), which holds a single molecule of DNA and fluorescent nucleotides. During polymerization, the fluorescent tags of the incorporated nucleotides are cleaved off and diffuse out of the observation area of the ZMW, and the signals are detected in real time (Reuter et al., 2015; Rhoads and Au, 2015). Single molecule sequencing is extremely well suited for virus metagenomics studies because of the very long reads.
It is otherwise nearly impossible to obtain long contigs in a biologically diverse sample (Brinzevich et al., 2014; Tombacz et al., 2015; Archer et al., 2012; Bergfors et al., 2016; Schleiss et al., 2014; Tombacz et al., 2014; Wittmann et al., 2014), but lower throughput, higher costs per base sequenced and higher error rates currently limit the scope of this platform for the viral metagenomics study of low-titer viruses.
Oxford Nanopore Technologies released a nanopore based portable sequencer, MinION, in 2014. The principle of this technology is to measure the characteristic changes in ionic current that are induced as biological molecules (DNA/RNA) are passed through the nanopore by a molecular motor protein. This is the first portable sequencer with the capability of RNA/DNA sequencing, longer read length (approximately 300 kb) in fewer hours, and real time sequence analysis. The nanopore platform has been used for viral detection in human clinical samples (Greninger et al., 2015; Hoenen et al., 2016) and genome sequencing (Karamitros et al., 2016; Kilianski et al., 2016). Although this platform looks promising for epidemiological investigation during an outbreak, low outputs and high error rates (around 10%) (McGinn et al., 2016) are a concern for virus discovery.
Currently, high throughput sequencing technologies are capable of producing millions of sequence reads, but the detection of viruses in clinical samples is still challenging due to the presence of an extremely low amount of viral nucleic acids compared to a high background of host, bacterial and other contaminating genetic material. Therefore, the efficient and sensitive metagenomics study of viruses in clinical or environmental samples requires the removal of non-viral nucleic acids and/or the enrichment of virus-derived nucleic acids (Capobianchi et al., 2013; Conceicao-Neto et al., 2015; Hall et al., 2014).
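The depth-versus-background problem described above can be made concrete with a small calculation. The following is a minimal sketch, not taken from the review: it assumes that reads are sampled independently, that the number of viral reads is approximately Poisson distributed, and that a virus counts as "detected" once at least `k` of its reads are observed. The function names and the example numbers are hypothetical.

```python
import math


def expected_viral_reads(total_reads: int, viral_fraction: float) -> float:
    """Expected number of viral reads for a given depth and viral read fraction."""
    return total_reads * viral_fraction


def prob_at_least(k: int, total_reads: int, viral_fraction: float) -> float:
    """Poisson approximation of P(at least k viral reads are sequenced).

    Assumption: reads are sampled independently, so the viral read count is
    approximately Poisson with mean total_reads * viral_fraction.
    """
    lam = expected_viral_reads(total_reads, viral_fraction)
    # P(X >= k) = 1 - sum_{i=0}^{k-1} exp(-lam) * lam**i / i!
    cdf = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - cdf


if __name__ == "__main__":
    depth = 1_000_000          # hypothetical sequencing depth
    background = 1e-6          # hypothetical viral fraction without enrichment
    for fold in (1, 100, 10_000):
        p = prob_at_least(10, depth, background * fold)
        print(f"{fold:>6}-fold enrichment: P(>=10 viral reads) = {p:.4f}")
```

Under these assumptions, raising the viral fraction by enrichment has the same effect on detection probability as raising the sequencing depth by the same factor, which is why enrichment can substitute for much deeper (and costlier) sequencing.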
These virus enrichment methods can be broadly classified as sample preparation methods and sequencing library preparation methods (Fig. 1), or alternatively can also be called pre-extraction (Fig. 1A) and post-extraction (Fig. 1B) virus enrichment methods, respectively. The choice of pre-extraction virus enrichment method largely depends on the nature of the sample to be analyzed. Generally, virus metagenomics samples are contaminated with host (bacteria or eukaryote) and environmental nucleic acids. For the virome analysis of environmental samples, several methods including ultrafiltration, iron chloride precipitation and density gradient centrifugation (polyethylene glycol, sucrose cushion) are available for pre-extraction virus enrichment (Andrews-Pfannkoch et al., 2010; John et al., 2011). For the virome analysis of clinical samples with a low abundance of host cells, like cerebrospinal fluid, respiratory samples, serum, urine or stools, filtration and nuclease digestion or density centrifugation can be used as a method of pre-extraction virus enrichment (Batty et al., 2013; Conceicao-Neto et al., 2015; Daly et al., 2011; Hall et al., 2014; Kohl et al., 2015; Rosseel et al., 2015). For the virome analysis of clinical samples with an abundance of host cells, like blood or tissues, pre-extraction based enrichment is not appropriate, as the virus genome itself can be present in its non-capsidated or transcribed form. Most of the pre-extraction virus enrichment methods are based on the physical properties of virions and their differences from other living forms. Virus particles or virions are the encapsidated forms of virus genomes. Even the first recognition of viruses as an entity in the late 19th century, then described as a liquid poison, was based on their ability to pass through the Chamberland filters that can retain most bacteria (Lecoq, 2001).
Except for a few recently identified large viruses, most animal viruses are less than 200-300 nm in diameter and therefore can pass through filters with 0.2-0.45 μm pore size. Filtration is therefore the most commonly used method for selectively sequencing the viruses from environmental or clinical samples. Notably, several metagenomics based virus identification studies used the 0.45 μm filter and not the 0.2 μm filter, because the pore diameters of commercial filters are quite variable and using a filter with a pore diameter close to the virus size can reduce the amount of virus in the filtrate (Conceicao-Neto et al., 2015). The compact nature of virions allows the use of density gradient centrifugation (sucrose, cesium chloride, polyethylene glycol) as a method to enrich samples for virus derived nucleic acids. Density centrifugation is commonly used in molecular virology labs and was also used in the first virus metagenomics study for enriching viruses in sea water samples. This method was also used to study viromes of lakes, soils and extreme environments (Brum et al., 2013; Kleiner et al., 2015; Thurber et al., 2009). The major limitation of the density centrifugation method is its limited applicability to the analysis of clinical samples. Moreover, the differences in the density of enveloped and non-enveloped viruses and other variables require the analysis of several gradient fractions. The presence of contaminating nucleic acids in the density gradient solution can further contaminate the samples with environmental nucleic acids. Most viral capsids can protect their genome from degradation by nucleases. This property of virions was frequently used by plant and animal virologists for identification and transmission of viruses (Rosseel et al., 2015; Thurber et al., 2009). However, nuclease treatment, or nuclease mediated removal of non-capsidated nucleic acids to enrich samples for viral nucleic acids, was first described by Allander et al.
(Allander et al., 2001). The study showed that DNase treatment of virus containing serum samples can help enrich the metagenomic libraries for virus derived sequences. This work described the identification of two novel parvovirus species in bovine serum. Soon after, this group and then several others used a similar approach to identify a plethora of human and animal viruses. Although it was very helpful, nuclease treatment is not very effective in degrading unprotected nucleic acids and often fails to enrich samples enough for the identification of low-titer viruses. Moreover, nuclease treatment cannot be used to find viruses in cells or tissue samples where most virus nucleic acids are not in their encapsidated form.
In positive selection methods, samples are enriched for viral nucleic acids directly using probes targeting the viruses, as in PCR assays, microarray or virus capture (in-solution hybridization) approaches. The simplest examples of the positive selection approach are generic PCR assays which use degenerate primers to target several related viruses or their variants (Irving et al., 2014; Kwok and Chiang, 2016; Simons et al., 1995a). The PCR based approach can detect only a limited number of viruses because of the limits on multiplexing. To overcome this, several groups developed a positive selection approach based on DNA microarrays for enriching samples for viruses of a defined taxonomic group (family, genus and species) (Gardner et al., 2010; Palacios et al., 2007; Wang et al., 2002). DNA microarrays have been employed to characterize and discover a number of novel or variant viruses including human cardioviruses, porcine circovirus, rhinovirus and adenovirus (Chiu et al., 2008; Kistler et al., 2007; Palacios et al., 2007; Rota et al., 2003).
However, due to their limited specificity, none of these methods were suitable for the comprehensive characterization of vertebrate viruses or for the analysis of viromes. Most recently, two very comprehensive probe sets were developed for positive selection based sequencing of all known vertebrate viruses and their variants (Briese et al., 2015; Wylie et al., 2015). These methods use virus-specific probes in a liquid-phase hybridization to capture viral sequences from metagenomics libraries. One of these approaches was termed Virome Capture Sequencing (VirCapSeq) (Briese et al., 2015). In essence, VirCapSeq is similar to any other DNA hybridization based enrichment method, including the well-known exon capture method used for transcriptomics. In principle, VirCapSeq consists of a set of specific oligonucleotides (or probes), like those used on viral microarrays, that when mixed with samples can hybridize to and capture complementary viral nucleic acids (Fig. 1B-b). VirCapSeq includes probes for all viruses known to infect vertebrates by targeting their protein coding regions. However, the design of VirCapSeq was challenging due to the constant influx of genome data from newly identified viruses and the biased abundance of genomes of pathogenic viruses, like HIV and HCV, in sequence databases. The design of VirCapSeq therefore required the use of a unique probe set containing 2 million probes or oligonucleotides that differed by at least 10%, to reduce sequence redundancy and widen the coverage of virus sequences. The stringent hybridization conditions of VirCapSeq allow specific enrichment of virus-like sequences in libraries generated from different sample types. Compared to common metagenomics and other virus enrichment approaches (filtration and nuclease digestion prior to total nucleic acid extraction, and RiboZero rRNA depletion after extraction), the use of VirCapSeq yielded 100-10,000-fold more virus sequences in the metagenomics libraries.
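The redundancy constraint mentioned above (probes that differ from one another by at least 10%) can be illustrated with a short sketch. This is an assumption-laden toy example, not the published VirCapSeq design procedure: it assumes equal-length candidate probes, measures divergence as the fraction of mismatched positions without alignment, and uses a simple greedy pass. The names `mismatch_fraction` and `thin_probes` are hypothetical.

```python
from typing import List


def mismatch_fraction(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length probes differ."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares probes of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def thin_probes(candidates: List[str], min_divergence: float = 0.10) -> List[str]:
    """Greedily keep probes that differ from every kept probe by >= min_divergence.

    Assumption: candidates arrive in priority order; near-duplicates of an
    already kept probe are dropped to reduce redundancy in the probe set.
    """
    kept: List[str] = []
    for probe in candidates:
        if all(mismatch_fraction(probe, k) >= min_divergence for k in kept):
            kept.append(probe)
    return kept


if __name__ == "__main__":
    reference = "ACGTACGTACGTACGTACGT"   # 20-nt toy probe
    one_mismatch = "TCGTACGTACGTACGTACGT"   # 5% divergent from reference
    two_mismatches = "TTGTACGTACGTACGTACGT"  # 10% divergent from reference
    print(thin_probes([reference, one_mismatch, two_mismatches]))
```

The pairwise comparison is quadratic in the number of probes, so a probe set on the scale of millions would in practice need clustering or indexing; the sketch only shows the selection rule itself.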
The sensitivity of VirCapSeq was also compared with real time PCR for detection of viral sequences in blood and serum samples spiked with live enterovirus D68. VirCapSeq was able to detect 10 copies/ml of virus in both sample types, which is comparable to the sensitivity of PCR based methods. Interestingly, the use of VirCapSeq allows the identification of viruses whose genomes are vastly different from those used to design the probes, as all viruses of a taxonomic group share some highly conserved stretches of nucleotides. Another similar positive selection approach, termed ViroCap, was developed simultaneously by Wylie et al. (Wylie et al., 2015). Like VirCapSeq, ViroCap also targeted most virus species (of 34 virus families) that are known to infect vertebrates and excluded known endogenous retroviruses. To design ViroCap, approximately 1 billion bases of sequence data were condensed to ∼200 million bases of probe sequences. Use of ViroCap allowed 296-674-fold increases in the number of virus reads compared to simple metagenomics sequencing. Although both of the positive selection approaches described above are biased towards sequences of known viruses, they can be efficiently used to study mixed infections and provide a sensitive characterization of all viruses present in clinical samples. As appreciation of the role that viromes play in human health increases, these platforms can provide useful tools to study the dynamics of the human virome in longitudinal samples collected before and after the appearance of disease.
An ideal negative selection strategy would allow the removal of all non-viral nucleic acids from samples, including the nucleic acids derived from the host, the reagents and the environment.
A negative selection approach can use the principle of suppression subtractive hybridization (SSH) or representational difference analysis (RDA), which allows the comparison of two DNA populations (tester and driver) and enrichment of differentially distributed molecules (Diatchenko et al., 1996; Lisitsyn et al., 1993). The genomic DNA sample that contains the sequences of interest is referred to as the tester and the reference sample is referred to as the driver. Tester and driver DNAs are hybridized in excess of driver and the hybrid sequences are then removed. Consequently, the remaining unhybridized DNAs represent DNA molecules that are present in the tester yet absent from the driver DNA. Although these approaches have usually been employed to identify genes with a differential expression pattern (Yin et al., 2013), they have also been used successfully to identify human herpesvirus 8 (Chang et al., 1994), GB virus (Simons et al., 1995b), TT virus (Nishizawa et al., 1997), walrus calicivirus (Ganova-Raeva et al., 2004) and a new strain of the murine hepatitis virus (Islam et al., 2015). Another, simpler approach can be based on the subtraction of non-target nucleic acids using biotinylated probes (Fig. 1B-a). An example of this negative selection approach is the subtraction of ribosomal RNA (rRNA) from extracted ribonucleic acids. This method is very efficient in finding RNA viruses in samples, like serum and cerebrospinal fluid, where a majority of host RNA is rRNA (Matranga et al., 2014; Rosseel et al., 2015). Several commercial kits, including Illumina's Ribo-Zero rRNA removal kit and Ambion's GLOBINclear kit, provide sequence specific (human/mouse/rat/bacteria) biotinylated oligos designed to capture host rRNA or highly expressed genes. These hybrids can be captured on streptavidin magnetic beads and removed from the sample, enriching these samples for virus-derived nucleic acids or amplification products.
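The subtraction logic shared by the tester/driver and rRNA-depletion approaches above has a simple computational analogue, sketched here for illustration only. It is not a model of the wet-lab hybridization chemistry: it assumes that "hybridizing to the driver" can be approximated by sharing exact k-mers with driver sequences, and the function names, the k-mer length and the threshold are all hypothetical choices.

```python
from typing import Iterable, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """All overlapping substrings of length k in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def subtract_driver(tester_reads: Iterable[str],
                    driver_seqs: Iterable[str],
                    k: int = 8,
                    max_shared_fraction: float = 0.5) -> List[str]:
    """Keep tester reads that share few k-mers with the driver (e.g. host or rRNA).

    Assumption: a read whose k-mers are mostly present in the driver index is
    treated as having "hybridized" to the driver and is removed.
    """
    driver_index: Set[str] = set()
    for seq in driver_seqs:
        driver_index |= kmers(seq, k)

    kept: List[str] = []
    for read in tester_reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read shorter than k: cannot be classified in this sketch
        shared = len(read_kmers & driver_index) / len(read_kmers)
        if shared < max_shared_fraction:
            kept.append(read)
    return kept


if __name__ == "__main__":
    driver = ["AAAACCCCGGGGTTTTAAAACCCC"]            # toy "host" sequence
    tester = ["CCCCGGGGTTTT", "ACGTTGCATGCATCGA"]    # first read is host-derived
    print(subtract_driver(tester, driver))
```

As with the laboratory methods, the quality of the subtraction depends entirely on how completely the driver represents the unwanted background.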
New England Biolabs' NEBNext rRNA depletion kit uses a slightly different approach where DNA oligos bind to rRNA sequences in the samples, followed by RNase H mediated selective removal of hybridized rRNA. Recently, a new approach known as DASH (depletion of abundant sequences by hybridization) was described by Gu et al. for depletion of rRNA with Cas9 protein complexed with a library of guide RNAs targeting rRNA (Gu et al., 2016). However, a major limitation is the unavailability of rRNA subtraction reagents for many vertebrate and invertebrate species. Moreover, rRNA subtraction is not promising when looking for viruses in cells and tissue samples that usually contain an abundance of genomic DNA and transcripts. Although not yet available, an ideal negative selection approach would have far reaching impact, as it would enable not only the sensitive analysis of viromes but also the comparative analysis of changes in viromes within a host, between hosts, or between samples obtained at different time points.
Recent advances in sequencing technologies were successfully exploited for virus discovery and virome analysis, expanding our knowledge of viruses that live with and around us. However, these approaches are not sufficient to study the virome of samples where the host and environmental nucleic acids are typically present at levels orders of magnitude higher than the viral nucleic acids. Viruses residing within cells and tissues represent such a scenario, where conventional viral metagenomics fails to identify low-titer viruses. This problem can be addressed by increasing the depth of sequencing, but the vast amount of sequence data generated using a metagenomics approach can pose bioinformatics challenges for the comparative analysis of viromes. Use of appropriate selective sequencing approaches can therefore not only allow more sensitive identification of viruses but also help in reducing the bioinformatics burden of non-viral data.
Although in positive selection approaches, like VirCapSeq or ViroCap, a very broad selection of probes for all known vertebrate viruses allows a comprehensive analysis of related viruses in clinical samples, the major limitation of this technique is its inherent bias of enriching samples for only known and targeted viruses. Even the most basic virus enrichment approaches, like centrifugation and filtration, were biased, and therefore giant viruses remained unidentified. Some constituents of viromes are either fluctuating or continuously evolving and therefore cannot be studied using a constant probe set. In conclusion, these methods are more appropriate for a selected group of viruses and for studying the relative presence of, and dynamic changes in, their populations. In the future, a more focused VirCapSeq assay to study viromes and their fluctuations in specific diseases or body sites, like blood, the respiratory system, the liver or the gastrointestinal tract, holds enormous potential to provide new insights into the role of viromes in human health and the development of chronic diseases. However, we believe that only a simple and efficient negative selection approach will become an ideal selective sequencing approach for the discovery of new viruses and also for the meaningful analysis of viromes.
Research in the Kapoor lab is supported by National Institutes of Health award (HL119485), National Science Foundation award (FAIN-1619072) and The Research Institute at Nationwide Children's Hospital.