key: cord-1052965-lgzjnivt
authors: Fischer, K.; Urisman, A.; DeRisi, J.
title: Diagnostic Techniques: Microarrays
date: 2008-07-30
journal: Encyclopedia of Virology
DOI: 10.1016/b978-012374410-4.00704-4
sha: 22f7048cf560e7e3fd6a429d574eaa7a724df1dc
doc_id: 1052965
cord_uid: lgzjnivt
Current techniques for viral detection and discovery, which include culture and serological methods as well as polymerase chain reaction (PCR)-based protocols, have a variety of inherent limitations. To augment the capabilities of existing diagnostic methodologies, virus-specific DNA microarray technology has recently been applied in both research and clinical settings with favorable results. The primary advantage of this approach is that DNA microarrays containing thousands of virus-specific sequences allow simultaneous testing for essentially all known viral species. While previous methods have been limited to testing for a single pathogen, or small numbers of specific pathogens, a panviral assay is less biased and can be designed to detect variant or even novel members of existing viral families. The use of DNA microarrays in both research and clinical settings has the potential to substantially increase the number of instances in which a virus may be identified in biological samples. With the proper bioinformatics methodologies, this technology will also maximize the probability that previously uncharacterized pathogens are detected, potentially leading to an improved understanding of the etiology of many chronic human diseases. This article covers the main aspects of virus-specific DNA microarray implementation, from array design to sample processing, amplification, and data analysis.
Numerous human diseases exist for which viral etiologies are suspected, yet specific causal agents are not known.
Among these are up to 20% of cases of acute hepatic failure, up to 35% of cases of acute aseptic meningitis and acute encephalitis, up to 50% of cases of acute respiratory infections, and numerous other conditions. In addition, infectious agents may be involved in the pathogenesis of a number of chronic conditions, most notably chronic inflammatory, autoimmune, and degenerative disorders, as well as some forms of cancer. While it is unlikely that viruses cause all of these diseases, identifying causative agents in even a modest number of disorders will have profound implications for the understanding, diagnosis, and treatment of these conditions.
New approaches to viral diagnostics and discovery are needed to overcome the shortcomings of existing methods. These methods include viral culture, electron microscopy, serology, specific polymerase chain reaction (PCR)-based methods, and techniques based on subtractive hybridization. While these methods have been critical for identifying many important human and nonhuman pathogens, each has intrinsic limitations. For example, many viruses are refractory to culture. Inspection by electron microscopy may prove difficult depending on the titer and morphological features of the virus. Serology- and PCR-based techniques are highly specific, but their specificity frequently renders them ineffective for the detection of variant or novel viral species. In the case of PCR, there exist dozens if not hundreds of variations on the method that may extend the scope of the assay, usually through multiplexing or the use of degenerate primers, yet the number of possible targets that can be interrogated remains small relative to the number of known viral pathogens. Finally, subtractive hybridization techniques, while unbiased, are difficult to troubleshoot and essentially impossible to scale up for high throughput.
While there exist many different forms of DNA microarrays, produced by both researchers and corporations, they fundamentally share the same properties. All microarrays consist of some form of solid substrate, typically glass or silicon, to which different species of nucleic acid are bound. Each microarray may contain 10^5 to 10^6 different species of DNA, arranged in a grid. Fluorescently labeled nucleic acid derived from any biological sample may be interrogated by hybridization to the microarray. In this manner, the abundance of thousands of different nucleic acid species may be measured simultaneously. The most common application of microarray technology is the measurement of relative gene expression, often for entire genomes. While gene expression profiling has enjoyed tremendous success over the last decade, it has long been realized that the microarray format would be amenable to the detection of exogenously derived nucleic acids for the purpose of identifying pathogens in a background of host material. The general concepts and methodologies are similar to those required for expression profiling; however, there are key differences to consider in the design of the array and the mechanics of sample processing. Furthermore, most implementations of virus-detection microarrays strive only to determine the presence of viral sequences, rather than attempting to quantify the amount of virus in a particular sample. In practical terms, this allows more aggressive amplification strategies to be used during sample preparation.
Algorithms to guide the process of sequence selection for expression microarrays are well developed, and many of the design principles apply directly to microarrays for viral detection. The shared parameters include physical properties of the oligonucleotide itself, such as the propensity of the sequence to form hairpins, melting temperature, and sequence complexity.
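The shared physical parameters named above (hairpin propensity, melting temperature, and sequence complexity) can be illustrated with a minimal Python sketch. It is not the design algorithm of any published array: the function names, the stem and loop lengths, and all thresholds are hypothetical, and the melting temperature uses a rough GC-content approximation rather than a nearest-neighbor thermodynamic model.

```python
import math
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def melting_temp(seq):
    """Rough GC-content estimate of Tm in degrees C (an approximation only)."""
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def shannon_entropy(seq):
    """Base-composition entropy in bits (0 = homopolymer, 2 = balanced ACGT)."""
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in Counter(seq).values())


def has_hairpin(seq, stem=6, min_loop=3):
    """True if any stem-length segment has its reverse complement downstream,
    separated by at least min_loop bases (a crude hairpin indicator)."""
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        arm = reverse_complement(seq[i:i + stem])
        if seq.find(arm, i + stem + min_loop) != -1:
            return True
    return False


def passes_screen(seq, tm_range=(50.0, 60.0), min_entropy=1.4):
    """Apply all three filters; the default thresholds are illustrative."""
    tm_ok = tm_range[0] <= melting_temp(seq) <= tm_range[1]
    return tm_ok and shannon_entropy(seq) >= min_entropy and not has_hairpin(seq)


if __name__ == "__main__":
    for probe in ("ACGACCAGCAACGGACAGCA", "AAGCTCTTTTGAGCTT"):
        print(probe, round(melting_temp(probe), 2), passes_screen(probe))
```

In practice the thresholds would be tuned to the probe length and hybridization conditions of the array in question.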
Beyond this, the design considerations for viral detection and expression profiling differ substantially. First, the majority of microarrays intended for viral detection are designed to detect the products of a multiplex PCR that uses specific primers. In this role, the array simply sorts the products of amplification. The design specifications for these types of arrays are straightforward and are primarily guided by the choice of flanking PCR primers for virus amplification. This configuration does not exploit the full potential of the microarray for panviral detection. In contrast, the design parameters for a generalized virus detection chip that does not rely on specific multiplex PCR primers must take additional factors into account. While it is essential that probes for expression arrays are unique with respect to target sequence (to prevent cross-hybridization from other mRNA species), the same does not necessarily apply to viral probes. In fact, to maximize the probability of detecting any viral species from a known family, it is often desirable to choose the sequences that are most conserved among a group of viruses. For example, while there is a large range of sequence diversity among human rhinoviruses, sequences in the 5' UTR are highly conserved, even among more distant picornaviruses. These sequences may thus serve as a type of universal 'hook' to capture both existing and new variants of these RNA viruses. In the case of novel pathogens, such as the SARS coronavirus, the use of conserved sequences was a key determinant of successful detection by microarray. Unique species-specific or even genotype-specific oligonucleotides can further augment the discrimination of the microarray. A logical extension of this strategy takes into account features of viral taxonomy. Rather than choosing all conserved, or all unique sequences, one may attempt to cover, by design, each level of the taxonomic tree for each family.
Thus, some sequences would be chosen to be species specific (terminal nodes on the tree), some would be genus specific, and so on. Various bioinformatic tools, including online databases, are currently available to assist in such design efforts. In all cases, it is critical to prescreen each choice for matches within the human genome to prevent inappropriate cross-hybridization to host material.
A simpler approach is to tile overlapping oligonucleotides across the entire genome of each virus in question. This approach is appropriate for relatively small panels of viruses, since each species contributes a large number of sequences, depending on genome length. In general, this approach becomes impractical when extended beyond a few viral species.
After satisfying the basic design requirements, more sophisticated considerations may also contribute to the choice of viral sequences for representation on a microarray. For example, to enhance detection of latent herpesviruses, it may be advantageous to overrepresent sequences from genes expressed during the latent phase, rather than those involved in lytic processes. In this case, it is assumed that RNA rather than DNA will be analyzed, which highlights the importance of sample processing and amplification considerations.
The protocol by which nucleic acids are isolated from specimens, and the subsequent amplification of the material, if needed, is also important to consider, as it may affect both microarray and experimental design. In general, isolation of total RNA is the preferred and more conservative route. While this may seem biased toward RNA viruses, all DNA viruses produce mRNA as part of their life cycle, so this choice does not exclude them. However, if viral particles are collected, or host material is removed from the specimen by filtration or other size-selection techniques, it may be advisable to isolate total nucleic acid (both RNA and DNA) to maximize sensitivity.
The origin of the sample also bears on this issue. When processing a relatively acellular material, such as cerebrospinal fluid, a total nucleic acid extraction would be appropriate, whereas in the case of a solid organ, such as liver or brain, an RNA extraction would avoid the unnecessary complexity brought by co-purifying massive quantities of host genomic DNA.
After nucleic acid has been isolated, an amplification step is typically employed to generate sufficient quantities for successful microarray hybridization. In certain situations, where large amounts of primary material are available, the yield of nucleic acid may be such that amplification can be bypassed altogether, but these situations are the exception. The choice of an amplification strategy is closely linked to the design of the array itself. Several virus-detection microarrays have been designed to serve as detectors for the products of multiplex PCR amplification strategies, but in these cases the broad-spectrum and unbiased nature of the microarray is not realized. For panviral microarrays, where no assumptions are made as to the probable identity of the target, a general, randomized amplification strategy is required. Numerous random amplification strategies exist, but all begin with a priming step using a randomized oligonucleotide, ranging in length from 6 to 15 nucleotides. At this point, PCR adapters may be added, either by ligation or through priming via a common sequence linked to the random primer. Alternatively, various RNA polymerase promoter sequences may be appended, which then allow linear amplification. For all these methods, contamination with previously isolated material is a critical concern, and good laboratory practices and appropriate controls, such as nontemplated amplification reactions, must be an integral part of the protocol. The overall complexity of the sample, relative to the viral species present, is a critical factor for the success of random amplification strategies.
As previously noted, the ideal samples are those that contain relatively low amounts of host cellular material and high titers of virus. For many biological specimens, it may be possible to reduce the complexity by filtration, centrifugation, or pretreatment of the raw material with various nucleases. In the latter case, free host material such as genomic DNA and ribosomal RNA may be degraded, while nucleic acid packaged in viral particles escapes destruction. Such 'preprocessing' can significantly enhance the signal-to-noise ratio of the final microarray assay, although these steps add both cost and complexity to the overall protocol. Potentially, the use of novel microfluidic techniques for particle size discrimination may serve as a rapid and reproducible way to reduce complexity in biological samples.
Analysis of hybridization results depends greatly on the design of the experiment and of the DNA microarray in particular. Microarrays that are narrowly targeted, for example to distinguish between strains of a particular virus such as human influenza or smallpox, are subjected to different analysis techniques than multigenus, multifamily, or even panviral arrays. The utility of DNA microarrays in all these fields is demonstrable when investigators use appropriate analysis techniques. In strain and species typing applications there are usually ample controls available to researchers. By use of control hybridizations, characteristic species or strain hybridization patterns can be classified manually. Often these patterns are the result of iterative microarray design in which features with the desired specificity are spatially clustered on the chip. These patterns can be used as templates for visual inspection of the experimental arrays: the different classes of microarray features are enumerated, and the input sample is thereby placed into one of the known groups or left unclassified as ambiguous.
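The enumerative approach described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a published classifier: the feature identifiers, the strain labels, the intensity threshold, and the `min_margin` rule for declaring a sample ambiguous are all hypothetical.

```python
from collections import Counter


def classify_by_enumeration(intensities, feature_class, threshold, min_margin=2):
    """Count above-threshold features per class and pick a clear winner.

    intensities   -- dict mapping feature id to measured signal
    feature_class -- dict mapping feature id to the strain or species it targets
    threshold     -- signal at or above which a feature counts as positive
    min_margin    -- required lead of the top class over the runner-up

    Returns (label, counts), where label is a class name or "ambiguous".
    """
    counts = Counter()
    for feature, signal in intensities.items():
        if signal >= threshold and feature in feature_class:
            counts[feature_class[feature]] += 1

    ranked = counts.most_common(2)
    if not ranked:
        return "ambiguous", dict(counts)

    best_label, best_count = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0
    if best_count - runner_up < min_margin:
        return "ambiguous", dict(counts)
    return best_label, dict(counts)


if __name__ == "__main__":
    # Hypothetical five-feature array targeting two strains.
    feature_class = {"f1": "strain_A", "f2": "strain_A", "f3": "strain_A",
                     "f4": "strain_B", "f5": "strain_B"}
    intensities = {"f1": 900, "f2": 850, "f3": 800, "f4": 120, "f5": 95}
    print(classify_by_enumeration(intensities, feature_class, threshold=500))
```

Requiring a margin over the runner-up is one simple way to express the "unclassified as ambiguous" outcome; a real typing array would calibrate both the threshold and the margin against control hybridizations.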
Machine-learning techniques, such as probabilistic neural networks, are also sometimes used in these studies, but to date have not been shown to be more accurate than simple enumerative methods.
Broader explorations of viral populations, for example, panviral studies, must face the problem that the number of control samples available is small compared with the number of distinct viral genomes that may be encountered. Many viral targets are extremely mutable, introducing one or more mutations per genome in every cycle of replication. Such diversity quickly outstrips the resources of any effort to perform exhaustive control hybridizations. In panviral studies, standard methods such as hierarchical clustering can be employed alongside several more specialized bioinformatic approaches. First, the microarray should be designed to include conserved regions of the target viruses to minimize the chance that a divergent but related virus will escape detection. Second, the hybridization patterns expected from the possible targets for which sequence data are available can be estimated using biophysical models of hybridization. Experimental hybridization patterns can then be compared to the model profiles using a correlation metric. If a virus present in a hybridization shares homology with one of the sequences used to estimate the hybridization profiles, this similarity will be reported as a significant correlation between the profile and the experimentally observed pattern. Third, an experimental hybridization can be compared to the intensity history of each of the microarray's features. Extraordinarily strong signal at a particular feature can indicate bona fide virus, especially when signal at taxonomically related features is similarly elevated.
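The profile-correlation and intensity-history comparisons described above can be sketched as follows. This is a simplified illustration, not the implementation of any published tool: it assumes predicted profiles have already been computed by some hybridization model, uses plain Pearson correlation as the metric, and scores a feature against its history with a z-score. All names and numbers are hypothetical.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is flat)."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def rank_profiles(observed, profiles):
    """Rank candidate viruses by correlation between observed feature
    intensities and each model-predicted hybridization profile."""
    scores = [(name, pearson(observed, predicted))
              for name, predicted in profiles.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


def history_zscore(signal, history):
    """Standard score of one feature's signal against its signal in
    previous hybridizations (needs at least two historical values)."""
    sd = statistics.stdev(history)
    if sd == 0:
        return 0.0
    return (signal - statistics.mean(history)) / sd


if __name__ == "__main__":
    # Hypothetical four-feature array and two modeled profiles.
    observed = [0.9, 0.8, 0.1, 0.0]
    profiles = {"virus_A": [1.0, 0.9, 0.0, 0.1],
                "virus_B": [0.0, 0.1, 1.0, 0.9]}
    print(rank_profiles(observed, profiles))
    print(history_zscore(500, [100, 110, 90, 105, 95]))
```

A production analysis would also estimate the statistical significance of each correlation, which this sketch omits.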
During the design stage of the experiment, it is important to consider that multifamily, metagenomic studies require negative controls to characterize the background of a microarray designed with broad specificity, whereas studies with a narrower focus need a significant number of positive controls for each class of virus being considered.
To date, only a few studies have examined the use of viral-detection microarrays using actual prospectively collected patient samples, as opposed to viruses cultured in the lab or previously characterized retrospective samples. As of the time of this writing, no large-scale study has been published comparing the use of virus microarrays (using random amplification) to traditional laboratory medicine diagnostics, such as commercial DFA kits and PCR assays. Preliminary data and studies using modest numbers of clinical samples suggest that microarray-based viral diagnosis outperforms conventional DFA assays in terms of both sensitivity and specificity. In comparison to a PCR assay using specific primer pairs, microarray assays are likely to have comparable specificity, yet their sensitivity is anticipated to be somewhat lower, depending on the amplification strategy. However, a microarray-based assay coupled to a random amplification protocol provides much broader detection capabilities, often including essentially every known viral pathogen. When considering the potential of DNA microarrays for clinical diagnostics, it may be the case that the microarray will serve a complementary role to specific PCR assays, especially when the latter fail to yield positive results. For clinical cases where conventional diagnostic assays have failed, the use of a panviral microarray assay may allow identification of new or unanticipated pathogens.
In a recent case report, Chiu and colleagues reported the diagnosis of a previously healthy 28-year-old woman suffering from a severe respiratory tract infection of unknown etiology.
Extensive panels of diagnostics for bacteria, fungi, and viruses failed to reveal any positive results, and all viral cultures remained negative. DNA microarray analysis of RNA isolated from endotracheal aspirate revealed the presence of human parainfluenza virus-4, a virus that is not normally included on standard DFA panels or PCR tests. Viral sequence was recovered directly from the patient sample, confirming the identity of the virus, and it was further shown that the patient seroconverted during the time of the illness. While it is generally believed that human parainfluenza virus-4 causes only mild, self-limiting infections, these data, taken together, suggest that the spectrum of disease may extend to respiratory failure in an otherwise healthy adult. This example demonstrates an attractive feature of the microarray assay, namely, the power to detect the unexpected.
In addition to the detection of previously known pathogens, DNA microarrays have also been effective for the detection of novel viral species. In the case of SARS, total nucleic acid from the supernatant of an infected Vero cell culture revealed a coronavirus signature consisting of oligonucleotides originating from avian infectious bronchitis virus, human and bovine coronaviruses, and, interestingly, several astroviruses. At first glance, the hybridization signal from astrovirus-derived oligonucleotides would seem to be aberrant. In fact, it is expected, since several astroviruses and coronaviruses share conserved sequences at the 3' end of their genomes. These particular sequences were represented on the microarray because the panviral design algorithm purposely selected conserved sequences within and among viral families. The same principles applied to a separate study in which a novel xenotropic gammaretrovirus was detected in prostate tumor biopsies of men with a mutant variant of the RNASEL gene.
Integration sites and full-length genomes were subsequently cloned and the virus was demonstrated to be replication competent, thus validating the microarray result. Again, the broad-spectrum nature of the DNA microarray was critical to the success of the project, since there were no preconceptions that such a virus might be a candidate, given that no xenotropic gammaretrovirus had previously been observed in a human subject.
Several important limitations of using microarrays for viral detection and discovery should not be overlooked. The most important limitation of the approach is its reliance on known viral sequences. Although most of the novel viruses discovered in the last decade share homology with previously known viruses, viruses lacking even short regions of homology cannot be detected by any hybridization-based method. In the case of profoundly divergent viruses, more brute-force approaches, such as shotgun sequencing, are likely to be applicable. Other existing and likely surmountable limitations of microarray-based methods are their cost, the need for specialized equipment, and the need for access to computational resources. It is likely that these limitations will become less pronounced as streamlined versions of the technology become available through academic and commercial efforts. It may also be the case that the utility of DNA microarrays will be surpassed by next-generation, massively parallel shotgun sequencing technologies, which would permit cheap, fast, and unbiased analysis of clinical samples.
Currently, a substantial fraction of human disease with a presumed viral cause goes without a clinical diagnosis. This is especially true for common ailments, such as upper respiratory tract infections, where despite advances in PCR assays, the etiology of 30-60% of infections remains unidentified.
Without considering the complexities of diagnostic regulatory approvals, an unbiased DNA microarray approach to the detection of viral pathogens should substantially increase the number of successful diagnoses, and as a consequence, may lead to improved therapeutics and supportive care. It should be noted that use of a virus microarray extends beyond the analysis of clinical samples. The wide scope of detection and the power to discover new pathogens has broad application to agriculture, veterinary medicine, ecology, and environmental genomics. While it is impossible to accurately predict the number of undiscovered viral pathogens remaining on this planet, tools such as virus-detection DNA microarrays permit rapid inroads into this fascinating and important aspect of virology.
See also: Diagnostic Techniques: Plant Viruses; Diagnostic Techniques: Serological and Molecular approaches.
Microarray detection of human parainfluenzavirus 4 infection associated with respiratory failure in an immunocompetent adult
Broad-spectrum respiratory tract pathogen identification using resequencing DNA microarrays
Database to dynamically aid probe design for virus identification
E-Predict: A computational strategy for species identification based on observed DNA microarray hybridization patterns
Identification of a novel gammaretrovirus in prostate tumors of patients homozygous for R462Q RNASEL variant
Microarray-based detection and genotyping of viral pathogens
Viral discovery and sequence recovery using DNA microarrays
Identifying influenza viruses with resequencing microarrays