key: cord-0042555-3df9ka1r
authors: Riley, Lee W.
title: Molecular Epidemiology
date: 2009-05-29
journal: Bacterial Infections of Humans
DOI: 10.1007/978-0-387-09843-2_3
sha: 2d1015fe596c2be40c114e24a4963edac02efa7c
doc_id: 42555
cord_uid: 3df9ka1r

Molecular epidemiology is now an established discipline in epidemiology.(1) It is the contemporary stage in the evolution of laboratory-based epidemiology that may have begun with the discovery in the late 1800s of ways to differentiate bacterial organisms by pure culture in artificial media.(2) Molecular epidemiology uses new molecular biology tools to address questions difficult or not possible to address by old laboratory tools. Just as statistical tools have become indispensable in epidemiological investigations and interpretations of epidemiologic data, molecular biology tools today have come to elucidate epidemiologic features of diseases that cannot be easily characterized by conventional techniques. Applied to infectious diseases, molecular biology methods have also come to challenge our traditional notions about the epidemiology of these diseases and have engendered novel opportunities for their prevention and control. This chapter will (1) review definitions commonly used in molecular epidemiology, (2) present an overview of molecular biology methods used to study infectious disease epidemiology, and (3) describe examples of the types of epidemiologic problems that can be addressed by molecular biology techniques, highlighting new concepts that emerged in the process of applying this approach to study bacterial infectious diseases.


Molecular epidemiology: In Chapter 1, epidemiology is defined as ". . .the study of the distribution and determinants of health-related states, conditions, or events in specified populations and the application of the results of this study to the control of health problems". Correspondingly, molecular epidemiology can be defined as "the study of the distribution and determinants of health-related states, conditions, or events in specified populations that utilizes molecular biology methods". In molecular epidemiology of infectious diseases, the word "determinants" is emphasized. Furthermore, as we learn more about the biology of agents of infectious diseases, it is becoming evident that disease distribution in a community of hosts may be determined not only by factors such as the distribution of contaminated vehicles, host risk characteristics, and environmental contexts in which these diseases occur, but also by a pathogen's genetic composition.

For example, communicable bacterial pathogens that cause diarrhea are almost always transmitted by the fecal-oral route. Colonization of the intestine is an important prerequisite for these pathogens to facilitate their transmission to new hosts. Similarly, the agent of tuberculosis, Mycobacterium tuberculosis, which is transmitted by the airborne route, must cause active disease in the host's lungs to assure its transmission to new hosts. An agent of leptospirosis, Leptospira, prolongs its survival in a community of hosts by establishing carriage in a mammalian host's kidney tubules so that it can be excreted in urine to infect other hosts. Mechanisms that underlie these organisms' ability to infect a specific target organ, which determines the specific pattern of transmission, are genetically determined. Thus, the definition of molecular epidemiology can be expanded to include the following: a study of the genetic factors that determine and regulate an organism's specific mode of transmission among hosts within an environmental context.

A.S. Evans, P.S. Brachman (eds.), Bacterial Infections of Humans, DOI 10.1007/978-0-387-09843-2_3, © Springer Science+Business Media, LLC 2009

Differentiating molecular epidemiology from taxonomy and phylogeny: In February 2009, a bibliographic search by PubMed using the words "molecular epidemiology" generated 27,822 publications that contained these words in the title or the abstract. Most of these publications dealt with the laboratory methods used to subtype microorganisms. They did not discuss the use of these techniques to characterize disease occurrence, distribution, or determinants of disease distribution. Thus, the term "molecular epidemiology" is often used to describe studies that should be more appropriately called "molecular taxonomy" or "phylogeny". Taxonomy is the science of classification of organisms into naturally related groups based on a factor common to each. Phylogeny is the study of lines of descent or evolutionary development of an organism. (1) Molecular epidemiology, taxonomy, and phylogeny may employ the same molecular biology laboratory techniques, but each is based on distinct principles. In taxonomy or phylogeny applied to infectious agents, the data that are generated are used to describe properties of organisms and their relationships to each other, based on an inferred model. Molecular epidemiology seeks to identify relationships between the etiologic agent of a disease (organism causing the disease), the host, and the environment in which the etiologic agent and the host reside. Both epidemiology and phylogeny as applied to infectious diseases may describe the distribution of particular genetic attributes of a pathogen over time. However, in phylogeny, the relationships of these genetic attributes over time must be guessed, since past evolutionary events cannot be empirically confirmed.
That is, phylogenetic analysis seeks to infer past evolutionary events based on a model from a set of observations made with materials available in the present. Epidemiology focuses on events of the present or the recent past. It uses analytical and empirical study designs to identify factors that predict the relationship of attributes of pathogens to disease distribution in place, person, and time. It attempts to identify factors that determine disease transmission, manifestation, and progression. In molecular epidemiology, the hypotheses that are generated are testable or can be made to be testable empirically when enough study subjects are identified. Finally, epidemiology is always motivated by an opportunity or possibility for intervention and prevention. Thus, an application of a molecular biology technique to type an organism per se does not by itself constitute molecular epidemiology. However, it should be stressed that the disciplines of taxonomy and phylogeny play an important role in the discipline of molecular epidemiology, and this idea is reviewed in detail in a separate reference. (1) Terminologies introduced in this chapter are based on definitions recommended by the European Study Group on Epidemiological Markers (3) and the Molecular Typing Working Group of the Society for Healthcare Epidemiology of America. (4) Others were adapted from a reference book Molecular Epidemiology of Infectious Diseases: Principles and Practice. (1) The terminologies below are mostly used in ways to apply to prokaryotic organisms.

Discriminatory power: The ability of a laboratory test to generate distinct and discrete units of information from different isolates, usually at a subspecies level.

Isolate: A population of microbial cells in pure culture derived from a single colony on an isolation plate and identified to the species level.

Strain: An isolate or a group of isolates exhibiting phenotypic and/or genotypic traits belonging to the same lineage, distinct from those of other isolates of the same species.

Clone: An isolate or a group of isolates descending from a common precursor strain by non-sexual reproduction exhibiting phenotypic or genotypic traits characterized by a strain typing method to belong to the same group.

Type, typing information, taxonomic unit, or marker: A specific and discrete unit of information or character belonging to a strain displayed upon application of a strain typing procedure (e.g., antibiotic resistance pattern, serotype, electrophoretic banding pattern, allele).

Genotype: A genetic characteristic of a cell or an organism according to its entire genome or a specific set of genetic loci (allele).

Phenotype: An observable characteristic expressed by a cell or an organism, such as drug resistance, virulence, and morphology.

Biotype: A taxonomic unit of a microorganism belonging to one species based on a panel of biochemical tests.

Serotype: A taxonomic unit of a microorganism based on antigenic properties of a distinct set of surface structures.

Pathotype: A taxonomic unit of a microorganism that describes a pathogenic variant of a species or subspecies of microorganisms that normally colonize a host. Also called pathovar.

Allele: A nucleotide sequence variant of a gene at a particular locus.

Single nucleotide polymorphism (SNP): Small changes or variation in corresponding nucleotide sequences at a given locus (e.g., corresponding locus TTAGGTCCTTTA in strain 1 and TTTGGTCCTTTA in strain 2 differs by a single nucleotide in the third position).

Microsatellites: DNA sequences containing tandem repeats of a short sequence (e.g., CCG repeated 10 times in succession). Most often used in reference to such sequences in eukaryotic cells.
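The SNP and microsatellite definitions above can be illustrated with a minimal Python sketch. It reuses the two example sequences and the CCG repeat given in the definitions; the function names snp_positions and is_tandem_repeat are hypothetical, introduced here only for illustration, and the sketch assumes the two sequences are already aligned and of equal length.

```python
def snp_positions(seq_a, seq_b):
    """Return the 1-based positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1) if a != b]


def is_tandem_repeat(seq, unit, copies):
    """True if seq consists of `unit` repeated `copies` times in succession."""
    return seq == unit * copies


strain_1 = "TTAGGTCCTTTA"
strain_2 = "TTTGGTCCTTTA"

print(snp_positions(strain_1, strain_2))        # [3]
print(is_tandem_repeat("CCG" * 10, "CCG", 10))  # True
```

A real comparison would first require aligning the corresponding loci, which this sketch does not attempt.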

It should be emphasized at the outset that molecular biology methods are no more special than any other methods, laboratory or non-laboratory, used in epidemiology. Curiously, however, when biochemical laboratory methods are used to characterize organisms in an epidemiologic investigation, we do not use the term "biochemical epidemiology". For some reason, when molecular biology methods are used in an epidemiologic study, the study is described as "molecular epidemiology". This probably results from a peculiar human habit of glorifying certain characteristics over others. For instance, it is more fashionable in research universities to call a department doing the same type of research a "molecular biology" instead of a "microbiology" department. Molecular epidemiology is epidemiology. In truth, molecular biology methods are nothing more than one of many tools used in epidemiology to order data into separate units or strata so that they become amenable for different types of analytical procedures.

All laboratory-based methods used to type microorganisms (typing systems) fall into phenotypic and genotypic methods, as described by Maslow et al., (5) Tenover et al., (4) and Lipuma. (6) Most of the "conventional" laboratory typing methods fall into phenotypic methods, which are based on the detection of phenotypes or characteristics expressed by an organism. Strain typing methods based on genotypes rely on the analysis of nucleic acid contents and gene sequence polymorphisms (chromosomal DNA, extrachromosomal DNA, and RNA). Nowadays, with routine applications of molecular biology tests, the distinction between "conventional" and molecular biology laboratory techniques may no longer be meaningful. Many standard strain typing procedures today are built upon layers of tests that include both molecular and non-molecular biology techniques.

The first step in data stratification as applied to bacterial infectious disease agents is differentiating pathogens from normal commensal or saprophytic bacterial organisms. Commensal organisms reside, metabolize, and replicate in a healthy, immunocompetent host without causing disease. They colonize the skin, mouth, nasopharynx, gastrointestinal tract, vaginal mucosa, and others. A saprophyte is a type of microorganism that grows normally on dead matter, and hence is found in the environment. This initial step separates organisms into species. Microorganism characteristics used to define species include growth, morphologic, biochemical, serologic, functional, and physiologic properties.

3.1.1. Differentiation by Growth and Morphologic Characteristics. Differences in nutritional requirements of bacterial organisms determine their growth characteristics. They can be classified according to whether they grow at all or the way they grow on solid agar culture plates or in liquid media. Morphologically, they may be differentiated by their colony shape, color, texture, smell, or whether they "swarm" on the plate. They are also differentiated by the way they stain (e.g., Gram stain, acid fast stain) as observed under a light microscope, or their shape and size as observed under an electron microscope.

3.1.2. Typing Based on Biochemical Characteristics. Biochemical tests used to differentiate bacteria are based on the organism's metabolic activities. Strains differentiated by a panel of biochemical tests group into biotypes, and the tests are thus referred to as biotyping tests. The discriminatory power of a biotyping test depends on the number of biochemical tests used to subtype strains: the larger the number of biochemical characteristics analyzed in a pathogen, the greater the discriminatory power of the test.
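The relationship between panel size and discriminatory power can be sketched in a few lines of Python. The isolates and their positive (1) or negative (0) test results below are invented for illustration and do not come from any study; count_biotypes is a hypothetical helper that counts how many distinct biotypes emerge when only the first n tests of the panel are read.

```python
# Hypothetical results: each isolate scored positive (1) or negative (0)
# on a panel of four biochemical tests.
isolates = {
    "isolate_A": (1, 0, 1, 1),
    "isolate_B": (1, 0, 0, 1),
    "isolate_C": (1, 1, 1, 0),
    "isolate_D": (1, 0, 1, 0),
}


def count_biotypes(results, n_tests):
    """Number of distinct biotypes when only the first n_tests tests are read."""
    return len({profile[:n_tests] for profile in results.values()})


for n in range(1, 5):
    print(n, count_biotypes(isolates, n))
```

With this made-up panel, one test cannot separate the four isolates at all, whereas all four tests together place each isolate in its own biotype.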

The tests depend on an organism's metabolic activities, which are sometimes affected by minor changes in the growth conditions used (e.g., nutrient concentration, pH, and temperature), and hence the information that is generated may not always be reliable for epidemiologic studies. In developed countries, these tests have come to rely on kits and automated systems, facilitating rapid turnaround. However, the large number of tests often needed to subtype a strain renders biochemical classification of pathogens expensive.

3.1.3. Typing Based on Serologic Characteristics. Another frequently used phenotyping method is based on a characteristic antigen-specific serologic (antibody) response induced by an invading organism in a mammalian host. Bacterial antigens that induce antibody response used for serologic classification include proteins and polysaccharides. For example, the species Escherichia coli characterized biochemically can be further subtyped into a "serogroup", based on the organism's characteristic somatic antigen (O-polysaccharide). A numerical designation is assigned to an E. coli strain carrying a characteristic O-polysaccharide that induces an antibody response (e.g., O157). The same strain, if further subtyped based on its flagellar protein (H-antigen), is classified as a "serotype". It will then be given an O:H designation (e.g., O157:H7). The genus Salmonella is biochemically separated into species and subspecies. They are further subtyped serologically according to their O and H antigens. Unlike E. coli serotypes, Salmonella enterica subsp. enterica (subspecies I) serotypes are given a descriptive name rather than a numerical designation, e.g., S. enterica serotype (or serovar) Typhimurium or S. enterica serotype Enteritidis. Nearly 1,500 serotypes of S. enterica subspecies I are currently recognized. (7) The specific serogrouping or serotyping assays include bacterial agglutination test with a panel of antigen-specific antisera, enzyme-linked immunosorbent assay (ELISA), latex particle agglutination, Western blot (or immunoblot) analysis, or a modification of these tests.
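The E. coli O:H naming convention described above lends itself to a small data-structure sketch in Python. The class EcoliType and the function parse_designation are hypothetical names used only for illustration; the sketch handles nothing beyond the simple "O157" and "O157:H7" forms mentioned in the text.

```python
from typing import NamedTuple, Optional


class EcoliType(NamedTuple):
    o_antigen: str             # somatic O-polysaccharide; defines the serogroup
    h_antigen: Optional[str]   # flagellar H-antigen; present only for a serotype


def parse_designation(text):
    """Split a designation such as 'O157:H7' into its O and H parts."""
    o_part, _, h_part = text.partition(":")
    if not o_part.startswith("O"):
        raise ValueError(f"expected an O-antigen first, got {text!r}")
    if h_part and not h_part.startswith("H"):
        raise ValueError(f"expected an H-antigen after ':', got {text!r}")
    return EcoliType(o_part, h_part or None)


print(parse_designation("O157"))     # serogroup only
print(parse_designation("O157:H7"))  # full serotype
```

Two isolates with the same o_antigen but different h_antigen values would belong to the same serogroup and to different serotypes.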

One limitation of serotyping is that different strains from the same species or even strains from different species could carry cross-reacting antigens, which can result in false-positive tests. On the other hand, strains that do not express typing antigens are classified as non-typeable (e.g., nontypeable Haemophilus influenzae). Serologic tests, for the most part, are simple to perform, but they require maintaining a stock of typing polyclonal or monoclonal antisera. For example, a large number (∼200) of antisera need to be used to serotype Salmonella. Thus, serotyping could be expensive and time consuming, and is usually confined to reference laboratories.

3.1.4. Typing by Functional or Physiologic Characteristics. Other ways to differentiate bacterial isolates involve examination of their response to specific manipulations. They include identifying differences in susceptibility to antimicrobial agents (antibiograms), lysis by or susceptibility to bacteriophages (phage typing), patterns of association with tissue culture cells (cell culture assays), toxigenicity, survival under in vitro or in vivo stress, and metabolic enzyme expressions (multilocus enzyme electrophoresis, or MLEE). These are reviewed in detail elsewhere. (1) 

Genotyping or nucleic acid-based typing methods are ultimately based on the nucleic acid sequence differences in the chromosomal or extrachromosomal nucleic acid contents. As opposed to hundreds of ways to phenotype bacteria, all genotyping tests can be grouped into a variant of just three basic analytical procedures: (a) hybridization, (b) gel electrophoresis, and (c) nucleic acid sequencing. This allows for the use of common equipment and standard reagents to analyze many different types of infectious agents. Furthermore, genotypic characterizations of pathogens facilitate standard information storage and data analyses, interpretation, and communication, which are all amenable to computer-assisted manipulations.

In bacteria, genotyping analyses are based on comparison of two types of DNA: the genome and the extrachromosomal DNA. Typing systems based on the analysis of the genome fall into those designed to compare differences based on nucleotide changes that occur in parts of the genome (microdiversity) or the entire genome (macrodiversity). These typing methods, of course, had to be developed in place of direct whole genome DNA sequence analysis, which, despite recent advances in the sequencing technology, remains cumbersome and expensive for bacterial organisms (although no longer the case with viruses). Further detailed reviews on this topic are mentioned at the end of this chapter (Suggested Reading).

DNA Genotyping Methods. For bacteria, plasmids are the extrachromosomal DNA elements used to type strains. Plasmids are covalently closed circular DNA that carry genes that may encode characteristics such as antibiotic resistance, virulence factors, or other functions that provide selective advantage to the organism but are not essential for its growth or survival. In fungi or protozoa, extrachromosomal elements used for typing may include mitochondria (8) and kinetoplasts (schizodeme analysis). (9) Plasmid-based typing is an example of an electrophoresis-based strain typing procedure. It compares differences in molecular weight (MW) and the number of different plasmids a bacterium may carry. Nucleic acids and proteins are negatively charged. Plasmids, thus, can be resolved electrophoretically in an agarose gel matrix according to MW. That is, the separation of these molecules depends on their drag through the gel matrix as they migrate toward the positive charge pole. Large MW plasmids migrate more slowly in the gel matrix than low MW plasmids. Thus, depending on the number of different MW plasmids, each strain represented by a single electrophoretic lane will generate a banding pattern, a "fingerprint", or a profile, that can then be compared across multiple lanes (Figure 1).
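The lane-by-lane comparison of plasmid profiles can be sketched in Python as follows. Everything specific here is an assumption made for illustration: the plasmid sizes are invented, sizes in kilobases stand in for MW, and the matching tolerance of 0.5 kb is an arbitrary stand-in for the resolution of a gel rather than a recommended value.

```python
def banding_pattern(plasmid_sizes_kb):
    """Order bands as they would appear in a lane: largest (slowest) first."""
    return sorted(plasmid_sizes_kb, reverse=True)


def profiles_match(lane_a, lane_b, tolerance_kb=0.5):
    """True if two lanes show the same number of bands at matching sizes."""
    a, b = banding_pattern(lane_a), banding_pattern(lane_b)
    return len(a) == len(b) and all(
        abs(x - y) <= tolerance_kb for x, y in zip(a, b)
    )


# Hypothetical plasmid sizes (kb) for three isolates.
outbreak_isolate_1 = [90.0, 6.2, 3.1]
outbreak_isolate_2 = [3.0, 90.0, 6.3]
sporadic_isolate = [60.0, 6.2]

print(profiles_match(outbreak_isolate_1, outbreak_isolate_2))  # True
print(profiles_match(outbreak_isolate_1, sporadic_isolate))    # False
```

In practice profiles are read against size standards run on the same gel, and a matching profile is only one piece of evidence that isolates are related.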

In one early study designed to validate the discriminatory power of plasmid profile analysis of bacterial pathogens, Holmberg et al. compared S. Typhimurium isolates from 20 well-documented outbreaks by phage types, antimicrobial resistance patterns, and plasmid profiles. (10) They found that among nine outbreaks in which non-outbreak isolates were also examined, the plasmid profile analysis was able to distinguish outbreak-related isolates from non-outbreak-related isolates more frequently than the other methods (phage typing and antibiogram), suggesting that the plasmid profile analysis was more discriminatory than the other methods used to type strains at that time.

One drawback of the technique is that the procedure depends on the presence of at least one plasmid in a bacterium. If isolates carry only one or two plasmids, the patterns generated will not be discriminating enough to distinguish the strains as related or unrelated.

Isolates that contain only one or two plasmids of identical MW need to have the plasmids digested with endonucleases (enzymes that cut DNA molecules at a specific sequence site; see details under "Restriction Endonuclease Analysis or REA") to generate smaller fragments that can then be electrophoretically resolved. Another potential problem with this method is that plasmids can be spontaneously lost on prolonged storage of a bacterium, which can affect reproducibility of the typing method. Although plasmid profiles are relatively easy to perform and interpret, the method does require a laboratory equipped for basic molecular biology techniques, including DNA extraction, restriction endonuclease digestion, electrophoresis, and photodocumentation of the results.

In genome-based typing methods, the entire genomic content of a microorganism is compared among different strains (macrodiversity analysis). There are three basic genome-based typing methods used in epidemiologic investigations, which are based on (1) allelic differences that occur within restriction endonuclease recognition sites in a genome, (2) comparison of multiple nucleotide sequences of whole genomes (called comparative genomics), and (3) whole genome hybridization patterns (see below for the description of hybridization) obtained from arrays of thousands of distinct short DNA fragments representing the whole genome fixed onto a glass slide or nitrocellulose filter ("gene chip" or microarray). The latter two methods have only recently begun to be applied to epidemiology.

Bacterial organisms express enzymes called restriction endonucleases that recognize unique sequences of several nucleotides in length and cut the DNA molecule at a specific position. They are called restriction enzymes because bacteria use these enzymes to eliminate (restrict) foreign DNA accidentally introduced into them. Thus one enzyme may generate distinct sets of DNA fragments from two different strains if the strains have a sequence difference at the corresponding restriction sites. Comparison of electrophoretic patterns based on the number and differences in MW of cut DNA fragments is called restriction endonuclease analysis (REA) or restriction fragment length polymorphism (RFLP) analysis. (11) (12) (13) Enzymes that cut DNA molecules at recognition sites that occur frequently throughout a microorganism's chromosome are colloquially referred to as "frequent cutters". DNA fragments generated by such enzymes will produce thousands of overlapping bands upon gel electrophoresis, and the resulting patterns become difficult to interpret (Figure 2A). Two different ways have been developed to overcome this limitation, as outlined below.
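How a sequence difference within a restriction site changes the fragment pattern can be shown with a minimal in silico digest in Python. The two short "strain" sequences are invented for illustration, and the sketch treats DNA as a single linear strand. The enzyme used as an example, EcoRI, recognizes GAATTC and cuts after the first G; the function name digest and its cut_offset parameter are hypothetical.

```python
def digest(dna, site, cut_offset):
    """Fragment lengths after cutting linear DNA at every occurrence of site.

    cut_offset is the number of bases of the recognition site that stay
    on the upstream fragment (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = dna.find(site, start + 1)
    bounds = [0] + cuts + [len(dna)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


ECORI_SITE = "GAATTC"

# Hypothetical 44-base sequence with two EcoRI sites.
strain_1 = "ACGT" * 3 + ECORI_SITE + "TTGACC" * 2 + ECORI_SITE + "CATG" * 2
# Same sequence with one base changed inside the second site (GAATTC -> GAGTTC).
strain_2 = strain_1[:32] + "G" + strain_1[33:]

print(sorted(digest(strain_1, ECORI_SITE, 1), reverse=True))  # [18, 13, 13]
print(sorted(digest(strain_2, ECORI_SITE, 1), reverse=True))  # [31, 13]
```

A single nucleotide change abolishes one site, so two fragments merge into one larger fragment and the banding pattern of the second strain differs from that of the first.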

Southern blot hybridization analysis, named after its developer, (13) is a type of analysis that facilitates detection of a particular DNA fragment of interest among hundreds of other DNA fragments generated by REA. DNA fragments separated in an agarose gel are first transferred onto a piece of nitrocellulose or nylon membrane. The membrane is then exposed to a piece of single-stranded DNA (probe) labeled with a molecule that facilitates visual detection of a selected target DNA fragment. The probe will specifically bind (hybridize) to its complementary DNA sequence embedded in the membrane. For example, a nucleic acid sequence or probe ATCGGCGATG will hybridize to a target containing the sequence TAGCCGCTAC. This probe-based detection of target DNA fragments is called hybridization. If the organism of interest harbors more than one copy of the target DNA sequence in different locations in the chromosome, the membrane will show multiple bands (banding pattern) in a single lane, which can be visually detected. The number of such visible bands will be small, and they will not be obscured by the other fragments in the membrane, which cannot be visualized because they do not hybridize to the probe. This allows for comparison and interpretation of the band patterns (Figure 2B). This method has been applied to subtype a wide variety of bacterial organisms, and is the current standard method used to type M. tuberculosis, in which the organism's repetitive DNA element or insertion element called IS6110 is used as the target for a hybridization probe (called IS6110 RFLP). (14) One limitation of this approach is that the DNA sequence selected as a target for a hybridization probe must be present in the chromosome in high copy numbers (usually >5) for the technique to be sufficiently discriminating. Clinical isolates of M. tuberculosis completely lacking or having small copy numbers of IS6110 have been found. (14) In such situations, secondary typing methods must be applied. Another drawback of the RFLP analysis followed by hybridization is that it requires a specialized reference or research laboratory to perform.
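The base-pairing rule behind hybridization can be sketched in Python using the probe from the example above. The chapter writes the target in the same left-to-right alignment as the probe, and the sketch follows that convention; when both strands are written 5' to 3', the target reads as the reverse complement, which is also shown. The fragment sequences and the function names are hypothetical, and only perfect matches are considered.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(seq):
    """Base-by-base complement, kept in the same left-to-right alignment."""
    return "".join(PAIRS[base] for base in seq)


def reverse_complement(seq):
    """Complementary strand as it reads when written 5' to 3'."""
    return complement(seq)[::-1]


def hybridizes(probe, fragment):
    """True if the fragment contains a perfect complement of the probe."""
    return complement(probe) in fragment


probe = "ATCGGCGATG"
print(complement(probe))          # TAGCCGCTAC
print(reverse_complement(probe))  # CATCGCCGAT

# Hypothetical membrane-bound fragments, written aligned to the probe.
fragments = {"band_1": "GGTAGCCGCTACTT", "band_2": "GGATCCATTACGTT"}
for name, seq in fragments.items():
    print(name, hybridizes(probe, seq))
```

Only band_1 carries the complementary sequence, so it is the only fragment that would be visualized on the membrane.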

When the hybridization typing method uses ribosomal genes (rrn) as a target, it is called ribotyping. Ribosomal RNA sequences are highly conserved, especially within members of the same genus. This property can be used, then, to type a wide range of organisms related at the species level, but by the same token, the discriminatory power of ribotyping is low. One study found that E. coli isolates obtained from widely separate geographic areas shared identical ribotypes. (15) Thus, for E. coli, ribotyping would not be discriminating enough for local epidemiologic studies or regional surveillance, although it may be useful for studying evolutionary relationships.

The second way developed to overcome the problem of uninterpretable electrophoretic patterns generated by RFLP is to use endonucleases that recognize rare restriction site sequences ("rare cutters"). Such enzymes will produce very large DNA fragments, which, when resolved electrophoretically, will generate interpretable patterns. However, electrophoresis in agarose gels of large pieces of DNA introduces two additional technical constraints. Linear pieces of DNA ≥25,000 bases (25 kilobases or kb) are difficult to resolve by the conventional agarose gel electrophoresis method. They get stuck during migration across the gel matrix. Thus, pulsed-field gel electrophoresis (PFGE) was developed, in which the orientation of the electrical field across the gel is changed (pulsed) periodically. (16) This allows the DNA fragments to move in different directions, releasing them from the stuck position. The DNA fragment will thus take a path of least resistance along its gel migration toward the positive charge pole.
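Why longer recognition sites yield fewer and larger fragments can be seen from a rough calculation, sketched here in Python. It rests on a simplifying assumption that is not from the chapter: all four bases are equally likely and independent, so a given site of n bases is expected about once every 4**n bases. The genome size of 5,000,000 bases is a hypothetical round number chosen for illustration, and real genomes deviate from this model.

```python
def expected_site_spacing(site_length):
    """Expected spacing (bases) between sites under the equal-base assumption."""
    return 4 ** site_length


def expected_site_count(genome_bp, site_length):
    """Approximate number of recognition sites expected in the genome."""
    return genome_bp // expected_site_spacing(site_length)


GENOME_BP = 5_000_000  # hypothetical bacterial genome size, illustration only

for n in (4, 6, 8):
    print(n, expected_site_spacing(n), expected_site_count(GENOME_BP, n))
```

Under these assumptions a 4-base site is expected every 256 bases (tens of thousands of fragments), whereas an 8-base site is expected every 65,536 bases, which gives fewer than a hundred fragments, most of them well above the 25 kb that conventional agarose gels resolve.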

The second technical obstacle that had to be overcome was the tendency for large DNA fragments to undergo spontaneous autodigestion. This problem was eventually solved by enzymatically lysing and digesting the cell wall and proteins of the test bacterial organism embedded in an agarose gel plug. This releases the bacterial DNA directly into the plug. The released DNA can then be digested within the plug with a restriction enzyme. This prevents autodigestion.

The major advantage of PFGE is that virtually all genome-containing organisms are typeable by this method. In general, genetic changes detected by PFGE reflect genetic events in an organism that occurred in the recent past. Hence, it is a highly informative technique with wide epidemiologic appeal. PFGE is the method of choice used by a national foodborne disease surveillance system called PulseNet, which is a collaboration among the Centers for Disease Control and Prevention (CDC), the Association of Public Health Laboratories (APHL), and the food regulatory agencies, including the United States Department of Agriculture (USDA) and the Food and Drug Administration (FDA). Foodborne pathogens included in the PulseNet surveillance currently include Salmonella Typhimurium, E. coli O157:H7, Listeria monocytogenes, Campylobacter jejuni, and Shigella sonnei (17) (www.cdc.gov/pulsenet). PulseNet has contributed to detecting, investigating, and controlling multistate outbreaks of E. coli O157:H7, (18) Salmonella Agona infections, (19) and listeriosis. (20) The network will expand to PulseNet Canada, PulseNet Europe, PulseNet Asia Pacific, and PulseNet South America.

One major limitation of PFGE is that it is technically demanding, requiring an expensive electrophoresis apparatus. Thus, this typing system is currently used only by reference or research laboratories.

3.2.6. Whole-Genome Sequence Comparison. The reduced cost and ability to rapidly sequence large pieces of DNA engendered a new discipline called comparative genomics. (21, 22) With viruses, whose genomes are considerably smaller than those of bacteria, comparative genomics is already applied to directly genotype clinical isolates. That is, hundreds of genomes of viral strains belonging to one species can now be compared rapidly to each other and applied to address epidemiologic questions. For example, it was the comparative genomics analyses that enabled investigators to conclude that the agent of severe acute respiratory syndrome (SARS) belonged to a group of coronaviruses. (23) With bacteria, because of the large genome, such comparisons are still not possible for a large number of strains. That is, hundreds of genomes of a single bacterial species cannot be compared because they have not been sequenced. However, new information obtained from comparison of a limited number of whole genome sequence databases is used to develop alternative genotyping methods. For example, at least eight different clinical isolates of Staphylococcus aureus have now undergone whole genome sequence analysis. Multiple alignment comparisons reveal several regions of difference. One such region called RD13 containing a cluster of Staphylococcus exotoxin (set) genes has been recently exploited to develop a new genotyping method called exotoxin sequence typing. (24) Whole genome sequence comparison of M. tuberculosis H37Rv, CDC1551, strain 210, and Mycobacterium bovis AF2122/97 led to the identification of 212 single nucleotide polymorphism (SNP) markers. (25) More recently, an analysis of these SNP markers in a global collection of 294 M. tuberculosis strains identified six phylogenetically distinct SNP cluster groups (SCGs). SCGs were strongly associated with geographic origin of these isolates. (26)

3.2.7. Microarray Comparisons. Another way to compare whole genomes is based on hybridization patterns generated from thousands of short pieces of DNA (probes) of known sequence representing the genome, arranged on a glass slide or nitrocellulose membrane, often referred to as microarray or "gene chip". Microarrays are a type of DNA hybridization matrix typing method described previously. The hybridization technology is not new, and the microarray technology is just a recent variant of the technology made possible by miniaturization techniques developed in the semiconductor industry. (27) (28) (29) Most microarray-based applications to infectious disease currently focus on taxonomic differentiation or diagnosis, and not on epidemiologic investigations. In one study, a whole genome DNA microarray analysis of 13 vaccine strains of M. bovis-BCG found 16 regions deleted in BCG strains relative to the sequenced M. tuberculosis H37Rv genome. (30) Kato-Maeda et al. found 25 deleted regions among 19 clinical M. tuberculosis isolates by a whole genome microarray approach. (31) Here, they reported that there was an inverse relationship between the proportion of deleted regions in these strains and the percentage of patients with cavitary disease infected with these strains. The discovery that M. tuberculosis contained such regions of deletion led some investigators to propose a genotyping scheme called "deligotyping". (32) In this method, clinical isolates of M. tuberculosis are compared according to differences in the regions of deletions.

Microarrays facilitate comparison of a large number of strains more rapidly than the whole genome sequence comparisons, but they are dependent on the number and types of DNA fragments that are included in the array, and hence will not be as discriminatory as the latter. Furthermore, they require specialized laboratories equipped with a scanner to read the microarray and expensive software to interpret the hybridization patterns on the chip. Thus, this technology is not likely to become readily available in the near future in most places to conduct epidemiologic investigations.

All of the genotyping methods described above are based on DNA extracted from bacteria that are cultivated in vitro. Culturing bacteria serves two important purposes: (1) it makes available large amounts of DNA for the genotype analysis, and (2) it assures that the genetic material being analyzed represents a distinct bacterial strain, obtained from pure culture.

For bacterial organisms that grow very slowly (M. tuberculosis) or cannot be cultivated artificially (Treponema pallidum), the above genotyping methods would be difficult to apply. Improper storage of bacterial isolates may also cause them to die, in which case genotyping by any of the above methods will not be possible. Polymerase chain reaction (PCR)-based tests can overcome these limitations. The PCR technology is a powerful and simple molecular biology tool, which is now applied in a variety of ways to genotype microorganisms in taxonomic as well as epidemiologic studies. It should be noted that PCR is only one of several methods called nucleic acid amplification tests (NAAT) used to make multiple copies of a target DNA or RNA of an organism or cells in vitro without culturing the organism or cells. PCR is perhaps the most frequently used NAAT and the most common method applied to subtype microorganisms. There are many excellent books that detail the procedure and are listed at the end of this chapter (Suggested Reading).

There are a number of frequently used terms that the reader should become familiar with in the discussion of PCR tests. Those familiar with the procedure may skip this section.

PCR involves multiple cycles of a three-step process: (1) denaturation of double-stranded DNA into single strands, (2) annealing of oligonucleotide primers to target DNA sequences called a template, and (3) primer extension along the template sequence from the primer binding site. These steps are carried out in a plastic microcentrifuge tube containing a reaction mixture of buffers, nucleotides, primers, heat-stable polymerase, and a sample DNA template. DNA when heated to a sufficiently high temperature will separate into single strands. Separation of a double-stranded DNA into single strands is called denaturation. Thus, in the first step, the sample DNA (template) is usually heated to 95-100 °C. Primers are synthetic single strands of DNA (oligonucleotides) of variable lengths designed to bind (anneal) to specific complementary target sites in a DNA template. They can be custom-made commercially. Primers bind to the target sequences when the temperature of the reaction mixture is decreased from a denaturation temperature to an annealing temperature range (∼40-60 °C). At about 70-75 °C (primer extension temperature), a thermostable enzyme called DNA polymerase will add nucleotides (extension) complementary to the unpaired DNA strand onto the annealed primer. In this way, a DNA segment located between the two annealed primers will be copied. This three-step cycle involves only changes in temperature (denaturation, annealing, and extension temperature), which are carried out automatically by a programmable heating device called a thermocycler. Depending on the time applied to each of the steps in the cycle, the type of thermocycler used, and the quality of the polymerase, a single copy of DNA can be multiplied to more than a billion copies after 30 cycles in less than 3 h. The amplified product can then be visualized in an electrophoretic gel under UV illumination after staining with ethidium bromide.
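The "more than a billion copies after 30 cycles" figure follows from doubling the template in every cycle. The short Python sketch below illustrates that arithmetic only; the function name and the `efficiency` parameter are illustrative assumptions, not part of any published protocol, and real reactions plateau as reagents are consumed.

```python
def copies_after_cycles(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized PCR yield.

    Each cycle multiplies the template by (1 + efficiency); an efficiency of
    1.0 means perfect doubling. This is a simplifying assumption that ignores
    the plateau phase of a real reaction.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# One template molecule, 30 cycles, perfect doubling versus 90% efficiency.
ideal = copies_after_cycles(1, 30)
reduced = copies_after_cycles(1, 30, efficiency=0.9)
print(f"perfect doubling: {ideal:.3e} copies")
print(f"90% efficiency:   {reduced:.3e} copies")
```

With perfect doubling the yield is 2 to the power 30, which exceeds one billion; with a lower per-cycle efficiency the same number of cycles gives noticeably less product, which is one reason the quality of the polymerase matters.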

For PCR tests, the cultivation of organisms is usually needed not so much to make enough nucleic acid material available for analyses, but to assure that the product being analyzed represents a single or pure strain of an organism. For bacteria, if a clinical specimen is obtained from a normally sterile body site such as blood, cerebrospinal fluid, or deep tissue, and it is obtained in a sterile fashion, a PCR assay may be performed directly on the specimen itself. The primer sequences could be designed to bind to sequences that are specific to a target organism (e.g., IS6110 of M. tuberculosis). In this way, DNA targets in an organism infecting nonsterile sites (sputum, urine, stool) could be amplified. Blood, urine, and other tissues may contain products inhibitory to PCR, but if the specimen contains a sufficiently large number of the infectious agent of interest, the desired target DNA could be amplified and detected. If a stored organism becomes contaminated with other organisms, or loses its viability, a PCR test can still amplify DNA from that organism, as long as the DNA material is intact. If the primers are highly species specific, a PCR test can distinguish the target sequences from DNA derived from contaminating organisms, thus precluding the need to culture the target organism from a specimen.

Methods. PCR-based genotyping methods that compare electrophoretic bands are classified into two general categories: (1) those based on molecular weight (MW) polymorphism of a single amplified product and (2) those that display band patterns (fingerprints) from multiple amplified products. The former method is often used to type organisms to predict species, (33) (34) (35) serogroups, serotypes, (34, 36, 37) or pathotypes. (37) The major advantage is that this PCR-based method is considerably simpler than the conventional methods used to group organisms into species, serotypes, or pathotypes, but its discriminatory power is limited to differentiating organisms belonging to a large taxonomic unit. It does not discriminate beyond these taxonomic units.

More discriminating PCR-based typing methods require generation of fingerprints. These latter techniques can be divided into three major groups (Table 1): (1) those that rely on sequence polymorphism in the whole genome, (2) those based on heterogeneity within known restriction endonuclease recognition sites, and (3) those that take advantage of repetitive elements interspersed in the target genome.

One relatively new category of PCR-based typing methods is based on comparison of sequences of PCR products of selected gene targets instead of comparison of band patterns by electrophoresis. One example of such a method is called multilocus sequence typing (MLST), discussed below. All of the above methods are described in detail in reference (1). Methods that are most frequently used in epidemiologic investigations are described below.

3.2.11. PCR Typing Methods Based on Sequence Polymorphism in the Whole Genome. Arbitrarily primed PCR (AP-PCR), also called randomly amplified polymorphic DNA (RAPD) analysis, is a PCR technique that relies on arbitrarily or randomly designed primer sequences that anneal specifically to DNA templates in a target organism. (38) (39) (40) If a set of these randomly designed primers anneals to target DNA sequences, the intervening segments that are proximal enough to these annealing sites can be amplified and generate products of different MW. Such products will display a band pattern in an electrophoresis gel for each strain. The advantage of this method is that the target nucleotide sequence does not have to be known. The primer sequences are determined empirically until a set that works is found. The final choice of the primer design is based on the generation of reproducible and interpretable band patterns. The major drawback of this test, however, is its poor reproducibility. Because this technique is based on random sequences of the primers, there is always a risk that contaminating DNA may get amplified. To minimize such nonspecific amplification, the following precautions are taken: (1) use DNA templates from organisms recovered in pure culture, (2) include a sample in which no template is added (negative control), (3) standardize template DNA concentration in all samples to be tested, and (4) if possible, test all the desired samples in one batch.

The same principles behind REA, RFLP, and PFGE analyses of genomes of pathogens can be applied to PCR products. A PCR product, if it is sufficiently large, 1,000-2,000 base pairs or bp, can be digested with an endonuclease, and the generated DNA fragments can be resolved by gel electrophoresis and visualized. If such amplified products contain nucleotide changes at the restriction sites, varied electrophoretic band patterns will result from the digested PCR products. One drawback of this PCR-REA (or PCR-RFLP analysis) method is that part of the target sequence to be amplified has to be known ahead of time.
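The effect of a nucleotide change at a restriction site on the band pattern can be sketched in a few lines of Python. This is a minimal in-silico illustration, not a laboratory protocol: the two toy sequences are invented, and the enzyme is assumed to be EcoRI (recognition site GAATTC, cut after the first G). Only the top strand is considered.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting at every occurrence of `site`.

    `cut_offset` is the position inside the recognition site at which the
    strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


ECORI_SITE, ECORI_OFFSET = "GAATTC", 1

# Hypothetical 1,006-bp PCR products from two strains. Strain B differs from
# strain A by a single nucleotide inside the restriction site (GAATTC -> GAGTTC).
strain_a = "ACGT" * 75 + "GAATTC" + "ACGT" * 175
strain_b = "ACGT" * 75 + "GAGTTC" + "ACGT" * 175

print(digest(strain_a, ECORI_SITE, ECORI_OFFSET))  # two fragments
print(digest(strain_b, ECORI_SITE, ECORI_OFFSET))  # site lost, one uncut fragment
```

In this sketch the single substitution abolishes the site, so strain A yields two bands and strain B yields one; this difference is what the gel would display.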

Repetitive DNA sequence elements are found in both prokaryotic and eukaryotic chromosomes. A variety of repetitive DNA elements have been identified in bacterial pathogens, which have been exploited to develop PCR-based strain typing assays. All of these procedures use the same approach, in which primers are designed to anneal to target repetitive sequences in the outward direction (outward primers) near the ends of these repetitive elements. The newly synthesized DNA strand extends away into the nonrepetitive sequences. Thus, sequences located between the elements are amplified. Because these elements are interspersed in the chromosome at different locations between species or strains, the amplified products will be of variable number and MW. The size of the amplified products will depend on the distance between the repetitive elements, the quality of the polymerase, and other PCR conditions used. Of course, the longer the distance between the repetitive elements, the harder it is to amplify such a region. Table 1 lists some of the major bacterial repetitive DNA elements that have been used for strain typing. One frequently used repetitive element genotyping method is called enterobacterial repetitive intergenic consensus (ERIC) sequence PCR. (41, 42) ERIC sequences are dispersed throughout the bacterial chromosome, especially of Gram-negative bacteria, and are located in extragenic regions. (41, 43) The exact function of ERIC sequences is not known.
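The way interspersed elements translate into a strain-specific band pattern can be sketched as follows. This is a simplified model under stated assumptions: the element coordinates are invented, element orientation and primer footprints are ignored, and the 5,000-bp ceiling on amplifiable distance is an arbitrary illustrative value rather than a property of any particular polymerase.

```python
def inter_element_products(element_spans, max_amplifiable_bp=5000):
    """Predict product sizes (bp) amplified between neighboring repetitive elements.

    `element_spans` is a list of (start, end) chromosome coordinates.
    Gaps longer than `max_amplifiable_bp` are assumed not to amplify.
    """
    spans = sorted(element_spans)
    sizes = []
    for (_, end_a), (start_b, _) in zip(spans, spans[1:]):
        gap = start_b - end_a
        if 0 < gap <= max_amplifiable_bp:
            sizes.append(gap)
    return sorted(sizes)


# Hypothetical element positions in two strains of the same species.
strain_x = [(1000, 1126), (2500, 2626), (12000, 12126), (12900, 13026)]
strain_y = [(1000, 1126), (4000, 4126), (12000, 12126)]

print(inter_element_products(strain_x))  # two bands
print(inter_element_products(strain_y))  # one band
```

The two strains carry the same kind of element but at different positions, so the predicted number and size of bands differ, and gaps that are too long contribute no band at all.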

Another repetitive element used for typing by PCR is called BOX, first reported in Gram-positive bacterial organisms. (44) BOX elements include mosaic combinations of three subunits, boxA (59 bp), boxB (45 bp), and boxC (50 bp), (44) which are all found in Streptococcus pneumoniae. Gram-negative organisms such as E. coli and S. typhimurium contain boxA-like sequences, but not boxB or boxC sequences. BOX elements have been widely applied to characterize the epidemiology of pneumococcal diseases. (45) (46) (47) (48)

3.2.14. Insertion Sequence (IS)-Based PCR. One potential disadvantage of PCR tests based on cross-species, widely distributed repetitive elements like ERIC and BOX is the possibility of nonspecific amplification when DNA from one organism is contaminated with DNA from another. In some bacterial species, this potential problem can be overcome by techniques that target repetitive elements that are unique to the species.

Bacterial organisms carry certain genetic elements that "jump", translocate, or transpose to new locations in the chromosome. Transposable elements that carry no genetic information except that needed by the elements to insert themselves into a new site are called simple transposons or insertion sequences (IS). Some insertion sequences are found across species while others are species specific. If the copy number of an IS element in a bacterial strain is high enough and if it is randomly distributed in the chromosome, DNA sequences between these elements can be amplified, as described in repetitive element PCR assays above.

One example of a widely targeted IS element used to type a bacterial organism is IS6110 of M. tuberculosis. IS6110 is an IS unique to the members of the M. tuberculosis complex (M. tuberculosis, M. bovis, M. microti, and M. africanum). IS6110 has been used as a target as well as a probe for RFLP-Southern blot fingerprint analysis of M. tuberculosis strains. Ross and Dwyer reported a PCR-based method designed to amplify sequences between IS6110 elements to generate DNA fingerprint patterns. (49) Of course, for the IS6110-PCR-based typing method to work, the IS element has to be present in large numbers in M. tuberculosis. Thus, methods that take advantage of other interspersed short repetitive DNA elements present in M. tuberculosis have been developed. There are at least four such short repetitive elements used to perform PCR-based genotyping of M. tuberculosis: (1) the polymorphic GC-rich tandem repeat sequence (PGRS), (2) the major polymorphic tandem repeat (MPTR) loci, (3) the exact tandem repeat (ETR) loci, and (4) the direct repeat (DR) elements located at a single locus in M. tuberculosis complex strains (Table 1).

PGRS elements together with IS6110 were exploited for a PCR-based typing of M. tuberculosis strains by a method called double-repetitive-element (DRE) PCR. (50) The greater number of repetitive elements represented by PGRS and IS6110 increases the chance of amplifying sequences between these elements. When the DRE-PCR method was compared to the standardized IS6110 RFLP analysis of M. tuberculosis strains during a TB outbreak in a prison in Havana, Cuba, the two methods identified the same clusters of TB patients who were clearly linked epidemiologically in the prison, which validated the PCR-based test for epidemiologic application. (51) Like most PCR-based genotyping tests, the DRE-PCR test suffers from low reproducibility. However, because of its simplicity, the DRE-PCR assay may be useful for screening a large number of isolates prior to using a more labor-intensive but discriminating test such as IS6110-RFLP.

The variable numbers of tandem repeats (VNTR) found in the MPTR and ETR loci in M. tuberculosis have also been exploited to type this pathogen. (52) (53) (54) In M. tuberculosis, the MPTR contains 15-bp repeats with a single consensus sequence and high sequence variability between these repeats. (55, 56) Each ETR locus contains large tandem repeats with identical repeat DNA sequences and each locus has a unique repeat sequence. (53) Thus, the number of these repeat units at each MPTR or ETR locus can be predicted from the size of the PCR products, which becomes the basis for VNTR genotyping. VNTR genotyping has been applied to a variety of bacterial organisms. (57) (58) (59) A recent variant of this method applied to type M. tuberculosis is called mycobacterial interspersed repetitive unit (MIRU) VNTR, which is based on microsatellite-like sequences scattered throughout the M. tuberculosis chromosome. (60, 61) By using 12 of these MIRU loci, several investigators have demonstrated that the resolution achieved by this typing method is comparable to that of the IS6110-RFLP typing method. (62) (63) (64)

Another PCR-based method used to type M. tuberculosis is based on a region called the direct repeat (DR) locus. This locus contains multiple copies of a 36-bp repeat sequence interspersed with unique spacer sequences which are 34-41 bp long. (55, 65) A typing method based on this variation in the spacer sequences is called spoligotyping (spacer oligotyping). (66) Oligonucleotides representing the spacers are synthetically made and covalently attached to a piece of membrane. This membrane then is used as a target for hybridization against PCR products obtained from amplification of the spacer sequences located between the DRs. PCR products used as probes are generated from the total genome of a strain of M. tuberculosis by one set of primers designed to amplify spacers between the DRs in the test strain. If the membrane contains an oligonucleotide representing a spacer sequence also present in the test strain, an amplified product will hybridize to it. This will generate a hybridization matrix, not unlike the microarray discussed earlier. Thus, unlike the previously described PCR-based typing methods, the patterns that are generated and compared are not electrophoretic fingerprints.
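The VNTR sizing step mentioned earlier in this passage, in which the number of repeat units at a locus is predicted from the size of the PCR product, amounts to simple arithmetic. The Python sketch below assumes a hypothetical locus with 150 bp of non-repeat flanking sequence inside the amplicon and a 53-bp repeat unit; both values and the locus names are illustrative, not taken from a published typing scheme.

```python
def repeat_copies(product_bp: int, flank_bp: int, repeat_bp: int) -> int:
    """Estimate the number of tandem repeat units from a PCR product size.

    product_bp: size of the amplified product estimated from the gel
    flank_bp:   non-repeat sequence between the primers and the repeat array
    repeat_bp:  length of one repeat unit
    """
    repeat_region = product_bp - flank_bp
    if repeat_region < 0:
        raise ValueError("product is shorter than the flanking sequence")
    # Gel-based sizes are approximate, so round to the nearest whole unit.
    return round(repeat_region / repeat_bp)


# Hypothetical product sizes (bp) for one isolate at three hypothetical loci,
# each with the same assumed flank and repeat lengths.
product_sizes = {"locus1": 309, "locus2": 415, "locus3": 206}
profile = tuple(repeat_copies(size, 150, 53) for size in product_sizes.values())
print(profile)
```

Two isolates would then be compared by their tuples of copy numbers across loci, rather than by raw band positions.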

One important advantage of spoligotyping over the standardized IS6110-based method (14) is that it reliably differentiates M. tuberculosis from M. bovis, two members of the M. tuberculosis complex. (66) This is because of the low copy number of IS6110 in M. bovis. In a study performed in the French West Indies, spoligotyping followed by DRE-PCR was found to be almost as discriminating as the reference IS6110-RFLP method for clinical isolates of M. tuberculosis. (67) The spoligotyping membrane is commercially available. Thus, any laboratory capable of performing PCR should be able to characterize M. tuberculosis isolates by spoligotyping.
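Because a spoligotype is a presence/absence pattern over a fixed set of membrane-bound spacer oligonucleotides, it can be represented as a string of ones and zeros. The following Python sketch shows that representation only; the eight spacer names and the strain contents are invented for illustration and do not correspond to the commercial membrane.

```python
def spoligo_pattern(membrane_spacers, amplified_spacers):
    """Return a presence/absence string, one character per membrane spacer.

    '1' means the strain's PCR product hybridized to that spacer oligonucleotide,
    '0' means no hybridization signal.
    """
    present = set(amplified_spacers)
    return "".join("1" if spacer in present else "0" for spacer in membrane_spacers)


# Hypothetical membrane carrying eight spacer oligonucleotides.
membrane = [f"sp{i}" for i in range(1, 9)]

# Hypothetical spacer content of three test strains.
strain_a = {"sp1", "sp2", "sp3", "sp5", "sp8"}
strain_b = {"sp1", "sp2", "sp3", "sp5", "sp8"}
strain_c = {"sp1", "sp2", "sp8"}

patterns = {name: spoligo_pattern(membrane, spacers)
            for name, spacers in [("A", strain_a), ("B", strain_b), ("C", strain_c)]}
print(patterns)
```

Strains with identical strings share a spoligotype; a strain lacking some spacers shows zeros at those positions.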

Many other PCR-based genotyping methods for bacterial pathogens have been described. They all use variations of the general theme described above (Table 1). One relatively new method called multilocus sequence typing (MLST) combines PCR assay with sequencing to assign a sequence type designation to a strain. Standardized MLST analysis for most bacterial organisms is based on sequencing a PCR-amplified region from each of a set of unlinked housekeeping genes. For example, with S. aureus, seven genes are used: arcC, aroE, glpF, gmk, pta, tpi, and yqiL. (68, 69) Each of the amplified segments from these genes shows potential allelic variation in different S. aureus isolates. An MLST clone that is designated as a particular sequence type (ST) is defined as a strain belonging to a set of isolates with identical sequences at each of the seven loci. A methicillin-resistant S. aureus (MRSA) strain designated ST8 (also called PFGE type USA300) has recently been reported to cause widespread community outbreaks of severe skin and deep tissue infections in different regions of the United States. (70, 71) Because this is a sequence-based method, it is highly reproducible across different laboratories, and thus standardized protocols have been developed for several bacterial organisms. Protocols as well as ST designations for these organisms are maintained and accessible at http://www.mlst.net.
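The logic of assigning a sequence type can be sketched as two lookups: each locus sequence maps to an allele number, and the resulting seven-number allelic profile maps to an ST. In the Python sketch below, only the seven S. aureus locus names come from the text; the toy sequences, allele numbers, and ST labels ("ST-A", "ST-B") are hypothetical placeholders and do not reproduce the curated database.

```python
LOCI = ("arcC", "aroE", "glpF", "gmk", "pta", "tpi", "yqiL")

# Hypothetical allele tables: for each locus, sequence -> allele number.
ALLELES = {locus: {"ACGT": 1, "ACGA": 2} for locus in LOCI}

# Hypothetical lookup from allelic profile to sequence type label.
ST_TABLE = {
    (1, 1, 1, 1, 1, 1, 1): "ST-A",
    (1, 1, 2, 1, 1, 1, 1): "ST-B",
}


def sequence_type(isolate: dict) -> tuple:
    """Return (allelic profile, ST label) for an isolate's seven locus sequences."""
    profile = tuple(ALLELES[locus][isolate[locus]] for locus in LOCI)
    return profile, ST_TABLE.get(profile, "unassigned")


isolate_1 = {locus: "ACGT" for locus in LOCI}
isolate_2 = dict(isolate_1, glpF="ACGA")   # differs at one locus
isolate_3 = dict(isolate_1, yqiL="ACGA")   # profile not in the table

for isolate in (isolate_1, isolate_2, isolate_3):
    print(sequence_type(isolate))
```

Because the comparison is made on sequences rather than band positions, two laboratories that obtain the same sequences necessarily obtain the same profile, which is the source of the method's reproducibility.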

At least six common epidemiologic concerns related to bacterial infectious diseases are addressed by genotyping tests. They are (1) characterizing distribution and dynamics of disease transmission in geographically widespread areas; (2) identifying and quantifying risk in sporadic occurrence of infectious diseases; (3) stratifying data to refine epidemiologic study designs; (4) distinguishing pathovars from nonpathovars; (5) addressing hospital and institutional infectious disease problems; and (6) identifying genetic determinants of disease transmission. These are discussed separately below.

Much of the literature that applies molecular biology tools to genotype bacterial pathogens addresses this epidemiologic issue: characterizing the geographic distribution of clones or clonal groups. Of course, this is an important component of epidemiology, but unfortunately these types of studies are somewhat limited in their ability to provide a good understanding of the transmission dynamics of an infectious disease that yields practical solutions to disease prevention and control. Very often, these studies use convenience samples, in which an available collection of microorganisms is genotyped first and questions are asked later. A typical question that is asked afterward is "What do we do with the strain type data?" When no other types of data are collected (e.g., demographic characteristics of the human subjects from which the isolates are obtained, exposure measures, clinical features), the analysis of the isolates is limited to just describing the number of clones, the proportion that each clone contributes to the study collection, speculations about which clone descended from which, and their geographic distribution. Unfortunately, these types of data contribute little to public health.

For molecular epidemiology to be truly relevant, the right questions need to be asked first before any strains are genotyped. Tracking a strain from place to place and over time is one application of molecular biology techniques, but such an application is not sufficient to address important and relevant questions about disease transmission. In infectious disease epidemiology, researchers and practitioners ask the following types of questions (1) : How does an organism get introduced into a community, and how and why does it spread? What are the risk factors for infection and transmission in a community and how is the infection maintained in a community of hosts? How and why does a particular strain suddenly emerge to become the predominant strain in a community, and why does it disappear? How is drug-resistant infectious disease prevalence in a community affected by fluctuations in clonal distribution over time in that community? How is the prevalence of spectrum of clinical manifestations (mortality, severe disease, asymptomatic infection) of an infectious disease affected by clonal distribution of a pathogen in a community? Do unique biologic characteristics of a member of a bacterial species contribute to enhanced transmission and hence increased prevalence of that member in a community or institution? How do we use the knowledge we gain about the dynamics of disease transmission to implement effective disease control programs? How do we measure the effect and impact of the intervention made based on the knowledge acquired from the use of a particular laboratory test? Once these questions are asked, one can then decide how to collect the isolates, which collection of isolates should be genotyped, what comparison isolates (controls) should be collected, and which of the many strain typing techniques should be used.

When a disease occurs as part of a recognized outbreak, it is possible to identify vehicles or risk factors for that illness. This is because in an outbreak, an analytical study (case-control, cross-sectional, or cohort) can be designed to implicate a source or reservoir of an infectious agent responsible for the outbreak. If a risk factor or a vehicle of an outbreak is identified, the number of cases in that outbreak attributable to that risk factor (attributable risk fraction) can be calculated. In an outbreak, this would be 100% minus the background disease prevalence for that place. Unfortunately, in places like the United States, it is estimated from surveillance reports that less than 20% of reported cases of foodborne diseases, such as salmonellosis, come from recognized outbreaks. (72) The others are reported from sporadic cases. It is extremely difficult, if not impossible, to identify risk or estimate attributable risk fraction for sporadic diseases. The ability to genotype bacterial isolates from sporadic disease cases provides an opportunity to identify risks and calculate attributable fractions for those risks. One early example of such a study follows.

In late summer of 1981, case-control studies of two different salmonellosis outbreaks in New Jersey and Pennsylvania caused by S. Newport and S. Typhimurium implicated precooked roast beef produced at a meat processing plant in Philadelphia as the vehicle. (73) The distribution of the contaminated beef, of course, was not limited to just these two outbreak settings. It was distributed statewide in both New Jersey and Pennsylvania. The question that was asked was, "what proportion of the salmonellosis cases in these two states during this period that were not part of the two recognized outbreaks were caused by consumption of the contaminated precooked roast beef?" That is, the question asked for the proportion of all sporadic cases of salmonellosis in the two states attributable to the contaminated precooked roast beef.

None of the S. Typhimurium isolates contained any plasmids, so plasmid profile analysis could not be applied to further subtype this Salmonella serotype. S. Newport isolates from the two outbreaks and an isolate from the implicated meat product, however, contained identical 3.7 and 3.4 MDa plasmids. (74) S. Newport isolates from persons in the two states not involved in the outbreaks were obtained from the respective state health department laboratories from the summer of 1981. In addition, S. Newport isolates from the same states before and after this summer, as well as isolates from other states during the same time period (geographic control), were obtained and analyzed. Telephone interviews of the people who got sick during the summer of 1981 revealed that a significantly greater proportion of those infected with a strain containing the 3.7 and 3.4 MDa plasmids ate precooked roast beef shortly before they developed diarrhea. (74) In the summer of 1981, nearly 45% of the S. Newport isolates from sporadic cases of salmonellosis in New Jersey and Pennsylvania belonged to this strain. Thus, the fraction of all sporadic S. Newport diarrhea cases during this period that could be attributed to this single contaminated product was estimated to be 45%.
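The estimate above rests on a simple proportion: the share of sporadic-case isolates whose plasmid profile matches that of the outbreak strain. The Python sketch below shows that calculation. The 3.7 and 3.4 MDa plasmid sizes come from the text; the isolate counts and the other profiles are hypothetical, chosen only so that the result is 45%, and are not the study's actual data.

```python
def fraction_with_profile(isolate_profiles, outbreak_profile):
    """Proportion of isolates whose plasmid profile equals the outbreak profile."""
    if not isolate_profiles:
        raise ValueError("no isolates to analyze")
    matches = sum(1 for profile in isolate_profiles if profile == outbreak_profile)
    return matches / len(isolate_profiles)


# A plasmid profile is modeled as the set of plasmid sizes in MDa.
OUTBREAK_PROFILE = frozenset({3.7, 3.4})

# Hypothetical sporadic-case isolates: 9 match the outbreak profile,
# 6 carry a single large plasmid, and 5 carry no plasmids.
sporadic_isolates = (
    [OUTBREAK_PROFILE] * 9
    + [frozenset({60.0})] * 6
    + [frozenset()] * 5
)

share = fraction_with_profile(sporadic_isolates, OUTBREAK_PROFILE)
print(f"{share:.0%} of sporadic isolates match the outbreak strain")
```

Treating this proportion as an attributable fraction assumes, as in the study, that infection with the marked strain was linked to the implicated product through interview data.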

This study demonstrated that the incidence of diseases like salmonellosis in a widespread geographic region could be rapidly and greatly affected by the introduction of a single contaminated food product. In fact, even after recall of the contaminated product, S. Newport strains associated with the summer outbreaks were isolated from two persons. It turns out these two cases occurred in siblings of an older child who had acquired the infection during the outbreak period after ingesting the beef. (74) Thus, salmonellosis in a community could be maintained by person-to-person transmission after introduction into that community by a contaminated product. The plasmid profile analysis, thus, helped not only to estimate an attributable fraction of a risk factor but also to describe the dynamics of community transmission of salmonellosis that would have been difficult, if not impossible, to do by conventional epidemiologic methods.

Stratification of data is one of the most common analytical procedures used in epidemiology to reduce bias after data have already been collected. It helps to determine more precisely an association of an attribute with disease or outcome, controlling for the effects of a larger stratum to which that attribute belongs. (1) Because molecular strain typing methods can group data related to infectious agents by subtypes, the strain subtype information itself can be used to develop a new case definition, which may comprise one stratum.

For example, in a chest clinic in a large city, patients from whom M. tuberculosis is isolated would all be grouped as culture-positive tuberculosis (TB) patients. If one were to do a study to identify risk factors for TB, the initial case definition of TB may be patients from whom M. tuberculosis is isolated. Here, an analytical study to identify risk factors for TB is likely to identify host-related factors, such as immunosuppression, old age, diabetes, and poverty. However, if the M. tuberculosis isolates from different patients can be tested for drug susceptibility, then these TB patients may be reclassified according to the drug resistance of their isolates. Such TB patients may then be regrouped into two new strata: those with multidrug-resistant TB (MDRTB) and those with drug-susceptible TB. Each patient stratum (MDRTB versus drug-susceptible TB) can then be used to define cases or controls or a comparison group in a case-control analysis to identify a new risk factor (e.g., risk for MDRTB). Such an analysis will find risk factors that are more specific than the risk factors identified for TB above. They may include, for example, previous history of TB treatment.

Then, if the MDRTB M. tuberculosis isolates from above are genotyped and one clone is found to account for 40% of the collection, one can then look for risk factors for infection with that clone. Persons infected with that clone may share a common exposure, such as being exposed to a co-worker with TB at an office. The fact that 40% of the isolates belonged to a cluster also suggests that the people infected with this clone were recently exposed and that their disease did not result from an infection that occurred many years ago.

One reported study illustrates this idea. (75) In the 1970s and 1980s in São Paulo, Brazil, multidrug-resistant strains of Salmonella serotype Typhimurium were a common cause of childhood diarrhea as well as meningitis. (76) The sources of these Salmonella organisms were unknown. In a multiclinic study of acute diarrhea in children less than 6 years of age in the city in the early 1980s, Salmonella were isolated from 40 (11%) of 358 children with diarrhea, 22 of which were S. Typhimurium; all were multidrug resistant. A case-control study comparing children infected with S. Typhimurium versus other serotypes of Salmonella found recent hospitalization to be a risk factor for S. Typhimurium infection. In the study, children who came to the study clinics with diarrhea developed diarrhea at home. Children with history of recent hospitalization were previously hospitalized for illnesses such as pneumonia and acute respiratory infections that required antimicrobial drug treatment. Because it was known at that time that recent exposure to penicillins or antimicrobial agents was a risk factor for acquisition of multidrug-resistant salmonellosis, (77) the investigators were faced with the question of whether the children with S. Typhimurium salmonellosis got infected at home or during the previous hospitalization.

The investigators examined the plasmid profiles of the S. Typhimurium isolates. They identified four distinct plasmid profiles that accounted for 47% of the 22 isolates (Figure 1). Each profile was found to cluster in time and in a particular hospital in the city. The other isolates were spread over a 2-year period in a typical endemic pattern of occurrence. Thus, 47% of the S. Typhimurium infections could be traced to particular hospitals in the city. (77) Furthermore, the study showed that what appeared to be an endemic pattern of disease occurrence in a community was actually composed of at least four outbreaks that occurred in hospitals, unrecognized until the isolates were genotyped.
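The reasoning used here, in which isolates that share both a subtype and an exposure setting are treated as a candidate outbreak, can be sketched as a grouping step. The Python example below uses entirely hypothetical isolate records and an arbitrary minimum cluster size of two; it illustrates the idea only and does not reproduce the study's data or its time-clustering analysis.

```python
from collections import defaultdict


def profile_clusters(isolates, min_size=2):
    """Group isolates that share a plasmid profile and a hospital exposure.

    Isolates with no recorded hospital exposure are left out of the grouping.
    Returns {(profile, hospital): [isolate ids]} for groups of at least min_size.
    """
    groups = defaultdict(list)
    for isolate in isolates:
        if isolate["hospital"] is None:
            continue
        groups[(isolate["profile"], isolate["hospital"])].append(isolate["id"])
    return {key: ids for key, ids in groups.items() if len(ids) >= min_size}


# Hypothetical isolate records.
isolates = [
    {"id": "i1", "profile": "A", "hospital": "H1"},
    {"id": "i2", "profile": "A", "hospital": "H1"},
    {"id": "i3", "profile": "B", "hospital": "H2"},
    {"id": "i4", "profile": "B", "hospital": "H2"},
    {"id": "i5", "profile": "C", "hospital": None},
    {"id": "i6", "profile": "D", "hospital": "H1"},
]

clusters = profile_clusters(isolates)
clustered = sum(len(ids) for ids in clusters.values())
print(clusters)
print(f"{clustered} of {len(isolates)} isolates fall in a hospital-linked cluster")
```

Without the subtype column, all six records would look like unrelated cases; grouping by subtype and setting is what exposes the two candidate clusters.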

A similar observation was made by a more recent study using PFGE to subtype S. Typhimurium isolates from "sporadic" cases of salmonellosis in Minnesota between 1994 and 1998. (78) Of 174 distinct PFGE patterns found among 958 isolates, the investigators detected four large outbreaks in the state that had not been previously recognized. (78) They also identified many multidrug-resistant strains that had been previously described among S. Typhimurium isolates obtained by CDC in the 1970s-1980s.

Thus, in the above two examples of salmonellosis, strain typing analyses (plasmid profile or the PFGE strain typing) helped to "unmask" several discrete outbreaks from a background of so-called sporadic or "endemic" cases of salmonellosis in a city and state. Endemic salmonellosis appears to be comprised of many small outbreaks. This is one of the major discoveries about infectious disease epidemiology made in the last 25 years by the application of molecular biology tools. In fact, it challenges the traditional view about infectious disease occurrence and raises a rather provocative suggestion that there is no such thing as endemic salmonellosis; all cases of salmonellosis may actually be part of outbreaks that go undetected because of absence of strain type information.

Thus, as illustrated above, different laboratory tests identify distinct strata, each of which can then be used to define a new group (i.e., a new case definition). Each stratum can then be compared against others in a case-control study design to identify a risk factor for one group. Genotyping data provide the most refined stratum, and if the sample size does not get too small, an analysis can be performed to identify a specific risk factor for that stratum. The more refined the stratum, the more specifically the risk factor for that stratum is identified. Furthermore, stratification by genotyping methods identifies risk factors related to the pathogen itself, such as the specific site of transmission of a specific clone in the example of salmonellosis in São Paulo, Brazil. Thus, genotyping creates an opportunity to do an analytical study that generates a new hypothesis and refines a study design.
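The comparison of one stratum against the others in a case-control design usually comes down to an odds ratio from a 2x2 table. The sketch below shows that calculation with a Woolf (log-based) 95% confidence interval. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the data from the study cited above, and the function name `odds_ratio` is an illustrative choice.

```python
import math


def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio and Woolf 95% confidence interval for a 2x2 table.

    "Cases" are patients whose isolate falls in the stratum of interest
    (for example, one genotype); "controls" are patients in the other strata.
    """
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if any(n <= 0 for n in cells):
        raise ValueError("Woolf interval is undefined when a cell is zero")
    estimate = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of log(OR) is the square root of the summed reciprocals.
    se_log = math.sqrt(sum(1 / n for n in cells))
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, lower, upper


# Hypothetical counts: exposure = recent hospitalization.
# 12 of 22 cases exposed, 3 of 18 controls exposed.
estimate, lower, upper = odds_ratio(12, 10, 3, 15)
```

With these invented counts the odds ratio is 6.0. The interval widens quickly as cells shrink, which is the practical meaning of the caveat that the sample size must not get too small as strata are refined.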

Most importantly, from a public health perspective, a study design that can identify risks for a refined stratum can also identify a more focused intervention option. Stratification of cases by antibiograms of the case isolates, for instance, usually identifies host-related factors as risk factors (e.g., previous exposure to antibiotics). This knowledge, while helpful, does not provide a focused intervention option, and such knowledge often does not contribute to long-lasting effective interventions. In the example of salmonellosis in Brazil, however, further stratification of the Salmonella isolates by plasmid profile analysis identified a more specific risk factor related to the isolates themselves. The knowledge that many cases of multidrug-resistant salmonellosis were occurring in specific hospitals in the city brought attention to the need to identify factors within the hospitals that contribute to Salmonella transmission. Today, nosocomial transmission of multidrug-resistant salmonellosis in São Paulo is rare. Thus, this ability to create an opportunity to identify specific intervention options is another major contribution to infectious disease epidemiology made by molecular epidemiology. Many examples of such applications are described elsewhere. (1)

E. coli can cause diarrhea, urinary tract infection (UTI), meningitis, septicemia, septic shock, and wound infections. Yet there are E. coli strains that colonize our intestine without ever causing disease. Among E. coli strains that cause diarrhea, there are several distinct groups, and they differ from those that cause UTI, meningitis, septicemia, and other diseases. Thus, being able to distinguish E. coli strains associated with a particular type of disease, and to separate them from non-pathogenic variants of E. coli, is an important activity in epidemiologic investigation. This way of classifying strains is called virulence or pathogen typing. E. coli strains that cause disease are the best examples of human pathotypes or pathovars.

While traditional methods are also available to separate E. coli into distinct virulence types, molecular biology techniques have greatly simplified this task. In particular, E. coli strains associated with diarrhea have come to be classified into distinct groups (enteropathogenic E. coli, enterotoxigenic E. coli, enteroinvasive E. coli, enterohemorrhagic E. coli, Shiga toxin-producing E. coli, enteroaggregative E. coli, diffuse-adherent E. coli), based on a unique set of factors that define each group. (79) However, it should be noted that the attribution of these virulence factors to these groups of E. coli was made possible because many of these groups have been recognized to cause outbreaks, either at the community level or in institutional settings.

In obvious outbreaks, it is not necessary to genotype the implicated pathogen to claim that there was an outbreak. In fact, it is the other way around: outbreaks are used to validate the genotyping test itself. If a particular genotyping method shows that a strain marker is indistinguishable in all the isolates from an outbreak, but that it differs from comparison isolates obtained from sources unrelated to that outbreak, that method is said to be validated for epidemiologic investigation of that organism. That is, the method can be used to study apparently sporadic disease caused by that organism.
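The validation rule stated above has two parts: the marker must be indistinguishable across all outbreak isolates, and it must differ in epidemiologically unrelated comparison isolates. A minimal sketch of that check follows, assuming each isolate has already been reduced to a single marker label by some typing method. The labels and the function name `is_validated` are hypothetical.

```python
def is_validated(outbreak_markers, comparison_markers):
    """Check a typing method against one known outbreak.

    Returns True only if every outbreak isolate carries the same marker
    and no unrelated comparison isolate carries that marker.
    """
    if not outbreak_markers or not comparison_markers:
        return False  # nothing to validate against
    outbreak_types = set(outbreak_markers)
    if len(outbreak_types) != 1:
        return False  # method splits a known outbreak
    (outbreak_type,) = outbreak_types
    return outbreak_type not in set(comparison_markers)


# Hypothetical marker labels for three candidate typing methods.
good = is_validated(["X1", "X1", "X1"], ["X2", "X7", "X9"])
too_coarse = is_validated(["X1", "X1", "X1"], ["X1", "X2"])
too_fine = is_validated(["X1", "X1a", "X1"], ["X2", "X7"])
```

The two failing cases correspond to the two ways a method can be unsuitable: it either fails to separate unrelated isolates from the outbreak strain or subdivides isolates that are known to belong to one outbreak.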

However, there is another group of E. coli that causes extraintestinal infections (sometimes referred to as ExPEC, or extraintestinal pathogenic E. coli), for which the reason they cause disease is less obvious. (80) That is, it is not clear whether these E. coli strains are members of the normal intestinal flora that cause disease only when they happen to enter a sterile niche, such as the blood stream, bladder, CSF, or deep tissue. Unlike the groups of E. coli that cause diarrhea, ExPECs do not carry any virulence factors that clearly distinguish them from those strains isolated from the intestine. Unlike diarrheagenic E. coli, ExPECs are not usually recognized to be associated with obvious outbreaks. This makes attribution of a clone of ExPEC to disease more difficult.

Thus, with ExPECs, the application of genotyping methods to address their epidemiology uses an approach distinct from that used to study diarrheagenic E. coli. A recent set of studies of community-acquired urinary tract infections (UTI) suggests that outbreaks of UTI do occur in community settings. (81) (82) (83) (84) This observation was used by investigators to postulate that some of the multidrug-resistant uropathogenic E. coli strains causing UTI in women may have food animals as a reservoir. (81, 85, 86, 87) In diseases such as UTI where an outbreak occurrence is not obvious, a genotyping method itself may provide a clue that an outbreak may have occurred in a community; this observation generates a new hypothesis about disease transmission. The hypothesis can then be tested by conducting a case-control or cross-sectional study. For example, with UTI, a study can be performed to see if there is any association between a predominant clonal group of uropathogenic E. coli identified by a strain typing method and a risk factor. If so, this would provide evidence that there was indeed an outbreak. Here, the ultimate confirmation of the hypothesis will be made by showing that the removal of that risk factor leads to amelioration of the recognized problem.

The same questions arise in distinguishing organisms that also happen to colonize non-sterile niches of the human host or occur as saprophytes in the environment. Examples include S. pneumoniae, Helicobacter pylori, S. aureus, and Gram-negative bacteria other than E. coli. Approaches similar to those used to study ExPECs can be applied to these organisms.

Hospital-associated or nosocomial infections are defined elsewhere in this book (see Chapter 26). In developed countries, organisms that are most frequently associated with nosocomial infections are usually members of the "normal" flora in the non-sterile niches of a human host, which include the skin, nasopharynx, and colon, or saprophytic organisms in the hospital environment. The most common such agents of hospital infections include coagulase-negative staphylococci, S. aureus, Enterococcus spp., Clostridium difficile, E. coli, Enterobacter spp., Klebsiella pneumoniae, Pseudomonas aeruginosa, Acinetobacter spp., Proteus spp., Stenotrophomonas maltophilia, Serratia spp., and Candida spp. Non-commensal or non-saprophytic pathogens that cause hospital infections include those transmitted by blood products (e.g., HIV, HBV, HCV, Yersinia enterocolitica) and respiratory pathogens (including severe acute respiratory syndrome [SARS]-associated coronavirus). In developing countries, infections caused by M. tuberculosis and Salmonella introduced into a hospital by an index case, staff, visitors, or contaminated food are still common.

Genotyping of bacterial organisms isolated in hospital or other institutional settings has become routine in many places, especially in developed countries. In fact, these techniques have become an indispensable part of hospital infection control activities. (88) While probe-based methods are most frequently used to detect drug-resistant organisms to augment clinical management of patients, many hospitals use genotyping tests to support nosocomial infection control practices. Teaching hospitals also use genotyping methods to study nosocomial infection epidemiology.

Many of the infectious disease epidemiologic concerns in institutional settings are distinct from those related to field or community settings. While obvious outbreaks do occur in hospitals, clusters of infectious diseases observed in hospitals are very often difficult to characterize as outbreaks. This is because patients in hospitals have host-related factors (immunosuppression, prolonged antimicrobial drug exposure, underlying illness) that predispose them to infection with "normal flora", saprophytes, or recognized pathogens. The distinction among colonization, contamination, and infection becomes blurred and is a frequent topic of discussion among hospital staff. One important question that arises in these settings, therefore, is, "do these organisms have biologic characteristics that enable them to cause disease or do they cause disease only because of host susceptibility factors?"

Another major area of concern in hospital infections, of course, is drug-resistant infections. Very often, when risk factor studies for drug-resistant infections in hospitals are carried out, they find host-related factors: antimicrobial drug exposure, old age, underlying medical conditions, prolonged hospital stay, etc. Conventional epidemiologic studies rarely identify factors that can be attributed directly to the organisms themselves, such as their sources or reservoirs, their transmissibility, and the biologic characteristics that determine disease production versus asymptomatic colonization. Thus, in hospital epidemiology, another important question is, "do antimicrobial drug-resistant microorganisms predominate in hospital settings because they are selected by the use of antimicrobial agents, or do they predominate because of distinct biologic characteristics possessed by a limited number of clonal groups that happen to be drug resistant?" Molecular epidemiologic methods are beginning to address these questions.

Given that normal flora and saprophytes found in hospital settings are composed of a large number of strain types, one would expect drug selection pressure to generate a large number of drug-resistant strains to cause hospital infections. One of the most intriguing observations made recently about drug-resistant bacterial hospital infections, however, is that, despite strong antimicrobial drug selection pressures, most of these infections are actually caused by a limited set of bacterial strains. Most studies have found that the drug-resistant isolates recovered from invasive disease in hospitals belong to a limited number of strains. In New York, a 2-year survey of vancomycin-resistant Enterococcus faecium (VRE) infections at a teaching hospital found 54 different PFGE subtypes among 182 isolates. One-third of these isolates were accounted for by just two PFGE subtypes. (89) In Tennessee, a similar survey in different hospitals over a 3-year period showed that among 34 different PFGE types, one was responsible for 61% of 106 VRE isolates obtained from seven different hospitals. (90) The Tennessee investigators found that the predominant PFGE types were introduced into different institutions by new patients transferred to these institutions. Thus, these studies suggest that a clonal strain carried by a transfer patient, once introduced into a hospital setting, can rapidly become predominant in that hospital, and that such a mode of transmission is more important than the selective pressures of antimicrobial drugs in establishing a large number of drug-resistant nosocomial infections. Without the genotype data, such a conclusion would be difficult to make. This type of observation also helps to streamline nosocomial infection control practices. In this case, the recommendation based on the observation is to screen transfer patients for VRE before they are sent to a particular ward in the recipient hospital.
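Statements such as "one-third of isolates were accounted for by two subtypes" can be computed directly from a list of subtype labels. The sketch below does so and also computes Simpson's index of diversity, a summary that is often used when comparing typing results; the chapter itself does not report this index, so its use here is an addition for illustration. The subtype labels and counts are hypothetical and are not the VRE survey data.

```python
from collections import Counter


def top_k_share(subtypes, k):
    """Fraction of isolates that belong to the k most common subtypes."""
    counts = Counter(subtypes)
    top = sum(n for _, n in counts.most_common(k))
    return top / len(subtypes)


def simpson_diversity(subtypes):
    """Simpson's index of diversity: the probability that two isolates
    drawn at random without replacement have different subtypes."""
    total = len(subtypes)
    if total < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(subtypes)
    same = sum(n * (n - 1) for n in counts.values())
    return 1 - same / (total * (total - 1))


# Hypothetical survey: 10 isolates, 4 subtypes, one clearly predominant.
SUBTYPES = ["A"] * 6 + ["B"] * 2 + ["C"] + ["D"]

share_top1 = top_k_share(SUBTYPES, 1)
share_top2 = top_k_share(SUBTYPES, 2)
diversity = simpson_diversity(SUBTYPES)
```

A high top-subtype share together with a low diversity index is the numerical signature of the clonal predominance described above, as opposed to the many unrelated resistant strains that drug selection pressure alone would be expected to produce.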

The observations described above with VRE apply to other hospital infections, including those caused by methicillin-sensitive (MSSA) or methicillin-resistant S. aureus (MRSA), E. coli, and other Gram-negative bacteria. It is beyond the scope of this chapter to discuss each of these agents of hospital infection, and the reader is encouraged to refer to reference (1) for further reading.

The next stage in the evolution of laboratory-based epidemiology is the methodology to identify the genetic basis for disease transmission. This, of course, applies to both the host and the pathogen. Host factors that enhance the likelihood of development of a clinical manifestation after a bacterial infection could also render such hosts possible new sources of transmission to other hosts. In communities with a high prevalence of persons with HIV infection or AIDS, transmission of tuberculosis is more likely simply because of the greater prevalence of cases of tuberculosis. Thus, host genetic factors that have a similar effect on infection could contribute to enhanced transmission of a disease. Studies that examine host genetic factors that affect disease transmission are just beginning. (91) Of course, in addition to host factors, the environment can greatly influence infectious disease distribution, through temperature, weather, season, urbanization, industrialization, and socioeconomic conditions. Such influences on disease transmission and the epidemiology of infectious diseases require a separate chapter of their own. In this chapter, only pathogen genetic factors that facilitate transmission and therefore disease distribution are dealt with.

As mentioned earlier, one of the common molecular epidemiologic exercises involves attempts to explain overrepresentation of clones, clonal groups, or clonal complex strains in community or institutional settings. Many of these can be explained on the basis of epidemiology, such as point-source contamination of a widely distributed food product (as in salmonellosis) or airborne transmission from a single index case (as in tuberculosis). However, we also know of many situations in which there is no obvious explanation for the predominance of one particular strain of a microorganism. Once an epidemiologic explanation for the predominance is ruled out, possible biologic bases for this observation need to be examined.

All pathogens have developed their own distinct strategies to transmit themselves to new hosts within a context of an environmental setting. Studies to date that have attempted to identify a pathogen's specific genetic determinants of transmission have used two different approaches: (1) those that attempt to identify a type of infection in a host that enhances opportunities for transmission to new hosts and (2) those that attempt to identify a unique niche or reservoir that becomes a major source of transmission to human or other animal hosts.

Tuberculosis can arise from an infection that occurred in the remote past (reactivation disease) or from a recent new infection (rapidly progressive disease). As mentioned earlier, over-representation of a particular strain of M. tuberculosis in a community could result from recent transmission from a highly infectious index case (e.g., someone with a cavitary lung lesion working in an enclosed space with an opportunity for contact with many others). In a study conducted in the early 1990s in New York City, one drug-susceptible strain of M. tuberculosis called "C strain" was found to be responsible for nearly 10% of all newly diagnosed cases of tuberculosis. (92) A series of case-control studies found C strain to be significantly associated with injection drug use. However, there was no obvious epidemiologic relationship among the injection drug users infected with this strain, nor was there any epidemiologic relationship among non-users infected with the same strain. This prompted investigators to look for a biologic basis for the predominance of C strain in New York City.

Ultimately, C strain but none of the other M. tuberculosis strains identified in the city was found to be resistant to reactive nitrogen intermediates (RNI) and hydrogen peroxide in vitro. (92) These are effector molecules that M. tuberculosis is likely to encounter inside macrophages in a host after infection. It was hypothesized that since injection drug users, who are normally a high-risk group for tuberculosis, regularly inject themselves with a variety of antigens that stimulate their macrophages to express RNI and hydrogen peroxide, such a practice will eventually select for an RNI- and hydrogen peroxide-resistant strain of M. tuberculosis to emerge in that risk group. If such a strain infects a non-user, it will cause a rapidly progressive disease because such a host does not carry macrophages making RNI and hydrogen peroxide, which would otherwise control the initial infection with M. tuberculosis. That is, the proportion of infections that progress to disease increases when one is infected with such a strain. Thus, more people will develop rapidly progressive disease in a community where such a strain is circulating. This would explain the predominance of C strain in New York City. Hence, in this case, a human behavior appears to have selected for a new strain of M. tuberculosis that eventually affected the epidemiology of tuberculosis in New York City.

C strain was further studied in the laboratory and ultimately, a gene called noxR1 was identified that, when transferred into RNI-susceptible E. coli or M. smegmatis strains, rendered these hosts resistant to RNI. (93) Subsequently, additional genes that conferred RNI and hydrogen peroxide resistance were identified, (94, 95) which opened a new area of tuberculosis pathogenesis research: the study of the mechanisms of resistance to natural antimicrobial effector molecules elaborated by the human host against M. tuberculosis. These genes may not completely explain the mode of transmission and the reason for the predominance of an M. tuberculosis strain in a community, but these are the earliest studies that applied molecular epidemiologic methods to identify a particular clone among clinical isolates and then study the biologic basis for its distribution and occurrence in a community. Here, the over-representation in New York City of an M. tuberculosis strain was attributed to this strain's enhanced ability to cause rapidly progressive disease after a new infection. A mechanism by which an M. tuberculosis strain can cause rapidly progressive disease was sought; the strain's resistance to macrophage effector molecules was found to be possibly related to this mechanism. Genes that mediate this phenotype were identified.

The second type of approach used to identify a genetic basis for disease transmission focused on salmonellosis caused by S. enterica serotype Enteritidis. S. Enteritidis is the most common cause of egg-associated salmonellosis in developed countries. In a study of henhouses in the United States, 18-25% of captured mice were shown to be infected with S. Enteritidis, while S. Typhimurium was cultured from only 1.5% of mice from the same henhouses. (96) In the same study, chicken eggs contaminated with S. Enteritidis were found in 9 (43%) of the 21 henhouses. (96) Unlike the other Salmonella serotypes that infect the surface of egg shells, S. Enteritidis infects eggs internally. (97) S. Enteritidis multiplies inside the egg if the egg is left at room temperature. Thus, eggs serve as an "amplifier" of transmission to human hosts, which explains the unique epidemiology of salmonellosis caused by this pathogen. This second approach to identifying a genetic basis for disease transmission therefore looked for a biologic mechanism for the establishment of a reservoir for a pathogen that then serves as a major source for human infection. The question, therefore, is, what enables S. Enteritidis and not the other Salmonella serotypes to survive in the egg?

This question was addressed by Lu et al., who examined multiple clinical isolates of S. Enteritidis for susceptibility to chicken egg albumen. They first found that most clinical isolates of S. Enteritidis were relatively more resistant to egg albumen than most clinical isolates of S. Typhimurium. (98) One S. Enteritidis strain that was highly resistant was analyzed further. Lu et al. eventually identified a gene called yafD that, when transferred into an albumen-susceptible strain of S. Typhimurium, rendered it resistant. (98) Disruption of yafD in the original resistant S. Enteritidis strain made it more susceptible to egg albumen. The mechanism of resistance to albumen associated with yafD is not yet known.

Thus, once again in this example, a molecular epidemiologic approach first identified a clinical strain of S. Enteritidis, which when tested further for a distinct phenotype led to the identification of a gene associated with that phenotype. These types of studies should yield new understanding of how infectious disease transmission occurs and should provide novel approaches to disease control and prevention. These are just beginning to be applied in this new phase of molecular epidemiology research.

Molecular epidemiology should not be regarded as just another technique or a tool, but as an established discipline in epidemiology that can advance our knowledge about infectious diseases in ways that were not previously possible. Molecular epidemiology has begun to challenge old notions and reveal new paradigms concerning the epidemiology of infectious diseases. This chapter introduced new concepts that emerged within this discipline and a sample of its applications. The literature is vast and molecular epidemiologic methods are now a major component of most laboratory-based infectious disease epidemiologic studies. However, it is a constantly evolving discipline and should continue to provide new insights into the epidemiology of infectious diseases in the future.

Molecular epidemiology of infectious diseases: Principles and practices

The introduction of agar-agar into bacteriology

Consensus guidelines for appropriate use and evaluation of microbial epidemiologic typing systems

Goering, and the Molecular Typing Working Group of the Society for Healthcare Epidemiology of America. How to select and interpret molecular strain typing methods for epidemiological studies of bacterial infections: a review for healthcare epidemiologists

Molecular epidemiology: the application of contemporary techniques to typing bacteria

Molecular tools for epidemiologic study of infectious diseases

Escherichia, Shigella, and Salmonella, in Manual of Clinical Microbiology

Restriction fragment polymorphism in mitochondrial DNA of Cryptococcus neoformans

Strains and clones of Trypanosoma cruzi can be characterized by pattern of restriction endonuclease products of kinetoplast DNA minicircles

Comparison of plasmid profile analysis, phage typing, and antimicrobial susceptibility testing in characterizing Salmonella typhimurium isolates from outbreaks

Bacterial strain identification by comparative analysis of chromosomal DNA restriction patterns

Pathovars of Xanthomonas campestris are distinguishable by restriction fragment length polymorphism

Analysis of restriction fragment patterns from complex deoxyribonucleic acid species

Strain identification of Mycobacterium tuberculosis by DNA fingerprinting: recommendations for a standardized methodology

Clonal relationship among bloodstream isolates of Escherichia coli

Separation of yeast chromosome-sized DNAs by pulsed field gradient gel electrophoresis

Tauxe and the CDC PulseNet Task Force. PulseNet: The molecular subtyping network for foodborne bacterial disease surveillance, United States

An outbreak of Escherichia coli O157:H7 infection from unpasteurized commercial apple juice

Multistate outbreak of Salmonella serotype Agona infections linked to toasted oats cereal-United States

Multistate outbreak of listeriosis-United States

Evolutionary genomics of pathogenic bacteria

A gene-expression signature as a predictor of survival in breast cancer

Characterization of a novel coronavirus associated with severe acute respiratory syndrome

Genome diversification in Staphylococcus aureus: Molecular evolution of a highly variable chromosomal region encoding the Staphylococcal exotoxin-like family of proteins

Modeling bacterial evolution with comparative-genome-based marker systems: application to Mycobacterium tuberculosis evolution and pathogenesis

Global phylogeny of Mycobacterium tuberculosis based on single nucleotide polymorphism (SNP) analysis: insights into tuberculosis evolution, phylogenetic accuracy of other DNA fingerprinting systems, and recommendations for a minimal standard SNP set

Light-directed, spatially addressable parallel chemical synthesis

High density synthetic oligonucleotide arrays

Genomics, gene expression, and DNA arrays

Comparative genomics of BCG vaccines by whole-genome DNA microarray

Comparing genomes within the species Mycobacterium tuberculosis

High-throughput method for detecting genomic-deletion polymorphisms

Widespread atypical cutaneous leishmaniasis caused by Leishmania chagasi in Nicaragua

Typing of dengue viruses in clinical specimens and mosquitoes by single-tube multiplex reverse transcriptase PCR

Leishmania chagasi: genotypically similar parasites from Honduras cause both visceral and cutaneous leishmaniasis in humans

Simultaneous approach for nonculture PCR-based identification and serogroup prediction of Neisseria meningitidis

Differentiation of pathogenic E. coli in Brazilian children by polymerase chain reaction

Fingerprinting microbial genomes using the RAPD and AP-PCR method

Genomic fingerprints produced by PCR with consensus tRNA gene primers

DNA polymorphisms amplified by arbitrary primers are useful as genetic-markers

ERIC sequences: a novel family of repetitive elements in the genomes of Escherichia coli, Salmonella typhimurium, and other enterobacteria

Distribution of repetitive DNA sequences in eubacteria and application to fingerprinting of bacterial genomes

Repetitive extragenic palindromic sequences: a major component of the bacterial genome

A highly conserved repeated DNA element located in the chromosome of Streptococcus pneumoniae

Clonally related penicillin-nonsusceptible Streptococcus pneumoniae serotype 14 from cases of meningitis in Salvador, Brazil

Epidemiological typing of Streptococcus pneumoniae from various sources in Sweden and India using Box A PCR fingerprinting

Colonization by Streptococcus pneumoniae among human immunodeficiency virus-infected adults: prevalence of antibiotic resistance, impact of immunization, and characterization by polymerase chain reaction with BOX primers of isolates from persistent S. pneumoniae carriers

Novel BOX repeat PCR assay for high-resolution typing of Streptococcus pneumoniae strains

Rapid, simple method for typing isolates of Mycobacterium tuberculosis by using the polymerase chain reaction

Double repetitive element PCR method for subtyping Mycobacterium tuberculosis clinical isolates

Molecular fingerprinting of Mycobacterium tuberculosis isolates obtained in Havana, Cuba, by IS6110 restriction fragment length polymorphism analysis and by the double repetitive element PCR method

Differentiation of strains in Mycobacterium tuberculosis complex by DNA sequence polymorphisms, including rapid identification of M. bovis BCG

Genetic diversity in the Mycobacterium tuberculosis complex based on variable numbers of tandem DNA repeats

PCR amplification of variable sequence upstream of katG gene to subdivide strains of Mycobacterium tuberculosis complex

Characterization of a major polymorphic tandem repeat in Mycobacterium tuberculosis and its potential use in the epidemiology of Mycobacterium kansasii and Mycobacterium gordonae

A novel repeated DNA sequence located in the intergenic regions of bacterial chromosomes

Characterization of a tandem repeat polymorphism in Legionella pneumophila and its use for genotyping

New method for typing Staphylococcus aureus strains: multiple-locus variable-number tandem repeat analysis of polymorphism and genetic relationships of clinical isolates

Multiple-locus variable-number tandem repeat analysis reveals genetic relationships within Bacillus anthracis

Identification of a new DNA region specific for members of Mycobacterium tuberculosis complex

Identification of novel intergenic repetitive units in a mycobacterial two component system operon

Mycobacterial interspersed repetitive unit typing of Mycobacterium tuberculosis compared to IS6110-based restriction fragment length polymorphism analysis for investigation of apparently clustered cases of tuberculosis

High-resolution minisatellite-based typing as a portable approach to global analysis of Mycobacterium tuberculosis molecular epidemiology

Automated high-throughput genotyping for study of global epidemiology of Mycobacterium tuberculosis based on mycobacterial interspersed repetitive units

Nature of DNA polymorphism in the direct repeat cluster of Mycobacterium tuberculosis: application for strain differentiation by a novel method

Simultaneous detection and strain differentiation of Mycobacterium tuberculosis for diagnosis and epidemiology

Spoligotyping followed by double-repetitive-element PCR as rapid alternative to IS6110 fingerprinting for epidemiologic studies of tuberculosis

Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic organisms

Multilocus sequence typing for characterization of methicillin-resistant and methicillin-susceptible clones of Staphylococcus aureus

Necrotizing fasciitis caused by community-associated methicillin-resistant Staphylococcus aureus in Los Angeles

Population dynamics of nasal strains of methicillin-resistant Staphylococcus aureus-and their relation to community-associated disease activity

Drug-resistant Salmonella infections in the United States: an epidemiologic perspective

Multistate outbreaks of salmonellosis associated with precooked roast beef

Evaluation of isolated cases of salmonellosis by plasmid profile analysis: introduction and transmission of a bacterial clone by precooked roast beef

The significance of hospitals as reservoirs for endemic multiresistant Salmonella typhimurium causing infection in urban Brazilian children

Ocorrencia de bacterias enteropathogenicas em São Paulo no Septenio 1970-1976. II. O surto epidemico de Salmonella typhimurium em São Paulo

Importance of host factors in human salmonellosis caused by multiresistant strains of Salmonella

Use of molecular subtyping in surveillance for Salmonella enterica serotype Typhimurium

Diarrheagenic Escherichia coli

Proposal for a new inclusive designation for extraintestinal pathogenic isolates of Escherichia coli: ExPEC

Widespread distribution of urinary tract infections caused by a multidrug resistant Escherichia coli clonal group

Escherichia coli serotype O15:K52:H1 as an uropathogenic clone

Escherichia coli bacteraemia: serotype O15:K52:H1 as a urinary pathogen

Cluster of multiresistant Escherichia coli O78:H10 in Greater Copenhagen

Possible animal origin of human-associated, multidrug-resistant, uropathogenic Escherichia coli

Antimicrobial-resistant and extraintestinal pathogenic Escherichia coli in retail foods

Contamination of retail foods, particularly turkey, from community markets (Minnesota, 1999-2000) with antimicrobial-resistant and extraintestinal pathogenic Escherichia coli

Medical and economic benefit of a comprehensive infection control program that includes routine determination of microbial clonality

Multiplicity of genetic backgrounds among vancomycin-resistant Enterococcus faecium isolates recovered from an outbreak in a

Clinical and molecular characterization of vancomycin-resistant Enterococcus faecium strains during establishment of endemicity

Infectious Diseases

Widespread dissemination of a drug-susceptible strain of Mycobacterium tuberculosis

A novel antioxidant gene from M. tuberculosis

noxR3, a novel gene from Mycobacterium tuberculosis, protects Salmonella Typhimurium from nitrosative and oxidative stress

Oxidative stress response genes in Mycobacterium tuberculosis: role of ahpC in resistance to peroxynitrite and stage-specific survival in macrophages

On-farm monitoring of mouse-invasive Salmonella enterica Serovar Enteritidis and a model for its association with the production of contaminated eggs

Production of Salmonella Enteritidis-contaminated eggs by experimentally infected hens

Association of Salmonella enterica serovar Enteritidis YafD with chicken egg albumen resistance

Molecular tools for epidemiologic study of infectious diseases

Epidemiologic typing systems

Overview and significance of molecular methods: what role for molecular epidemiology?

The use of molecular methods in infectious diseases

Role of genomic typing in taxonomy, evolutionary genetics, and microbial epidemiology

DNA fingerprinting techniques for microorganisms. A proposal for classification and nomenclature

PCR Primer: A laboratory manual

A low-cost approach to PCR

PCR Protocols: Current Methods and Applications