key: cord-322062-nnefbeo6 authors: Tam, Albert W.; Smith, Matthew M.; Guerra, Martha E.; Huang, Chiao-Chain; Bradley, Daniel W.; Fry, Kirk E.; Reyes, Gregory R. title: Hepatitis E virus (HEV): Molecular cloning and sequencing of the full-length viral genome date: 1991-11-30 journal: Virology DOI: 10.1016/0042-6822(91)90760-9 sha: doc_id: 322062 cord_uid: nnefbeo6

Abstract We have recently described the cloning of a portion of the hepatitis E virus (HEV) and confirmed its etiologic association with enterically transmitted (waterborne, epidemic) non-A, non-B hepatitis. The virus consists of a single-stranded, positive-sense RNA genome of approximately 7.5 kb, with a polyadenylated 3' end. We now report on the cloning and nucleotide sequencing of an overlapping, contiguous set of cDNA clones representing the entire genome of the HEV Burma strain [HEV(B)]. The largest open reading frame extends approximately 5 kb from the 5' end and contains the RNA-directed RNA polymerase and nucleoside triphosphate binding motifs. The second major open reading frame (ORF2) begins 37 bp downstream of the first and extends approximately 2 kb to the termination codon present 65 bp from the 3' terminal stretch of poly(A) residues. ORF2 contains a consensus signal peptide sequence at its amino terminus and a capsid-like region with a high content of basic amino acids similar to that seen with other virus capsid proteins. A third open reading frame partially overlaps the first and second and encompasses only 369 bp. In addition to the 7.5-kb full-length genomic transcript, two subgenomic polyadenylated messages of approximately 3.7 and 2.0 kb were detected in infected liver using a probe from the 3' third of the genome. The genomic organization of the virus is consistent with the 5' end encoding nonstructural and the 3' end encoding the viral structural gene(s).
The expression strategy of the virus involves the use of three different open reading frames and at least three different transcripts. HEV was previously determined to be a nonenveloped particle with a diameter of 27–34 nm. These findings on the genetic organization and expression strategy of HEV suggest that it is the prototype human pathogen for a new class of RNA virus or perhaps a separate genus within the Caliciviridae family.

Viral hepatitis results from infection with one of at least four very different viral agents. Available serological tests allow the diagnosis of acute hepatitis due to infection with hepatitis A virus (HAV) and hepatitis B virus (HBV). HBV is required for propagation of the delta agent, or hepatitis D virus (HDV); this co-infection results in a high proportion of cases progressing to chronic active hepatitis. The clinical and diagnostic exclusion of HAV and HBV led to the recognition of other viral hepatitides that were formerly grouped together as non-A, non-B hepatitis (NANBH) (Prince et al., 1974; Feinstone et al., 1975; Tabor, 1985). NANBH is caused by more than one viral agent and can be transmitted by either parenteral or fecal/oral routes (Bradley, 1990a; Reyes and Baroudy, 1991). The cloning of a blood-borne agent, termed hepatitis C virus (HCV), by us and others led to the development of a specific assay for circulating antibody to HCV (Choo et al., 1989; Kuo et al., 1989; Kubo et al., 1989; Maeno et al., 1990; Reyes et al., 1991d). This assay predominantly detects infections at the chronic stage, but has facilitated the identification of HCV as the cause of up to 90% of parenterally transmitted NANBH.
A second epidemiologically distinct form of NANBH was shown to occur in both epidemic and sporadic patterns in developing countries and is referred to as enterically transmitted non-A, non-B hepatitis (ET-NANBH) due to its water-borne mode of virus transmission and presumed enteric route of infection (Khuroo, 1980; Wong et al., 1980). ET-NANBH has been documented in India, Pakistan, Burma, USSR, Costa Rica, Mexico, and countries in Africa, where epidemic outbreaks can generally be traced to fecal contamination of drinking water (Bradley and Maynard, 1986; Bradley, 1990b). The causative viral agent was previously shown to passage successfully in cynomolgus macaques (cyno) and tamarins with typical liver enzyme elevations and recovery of morphologically similar 27- to 34-nm viruslike particles from the feces of clinical specimens and experimental animals (Balayan et al., 1983; Anjaparidze et al., 1986; Bradley et al., 1987; Arankalle et al., 1988). We recently reported the isolation of a partial cDNA clone from the virus responsible for ET-NANBH, and have termed the newly identified agent the hepatitis E virus (HEV) (Reyes et al., 1990). The clone was from a Burma isolate of HEV and hybridized with cDNA made from five other distinct geographic isolates. These molecular epidemiological findings are consistent with the available serologic data based on the use of immune electron microscopy and immunofluorescence blocking studies that indicate a single major agent is responsible for the majority of ET-NANBH seen worldwide (Purcell and Ticehurst, 1988; Bradley et al., 1988a; Krawczynski and Bradley, 1989).
We now report on the molecular cloning and sequencing of the complete HEV (Burma; B) viral genome together with the deduced amino acid sequences of viral-encoded proteins. General perspectives on the genetic organization of the virus, as deduced from sequence and open reading frame analyses, indicate that HEV bears some similarity to the Caliciviridae but may represent a new class of nonenveloped RNA virus.

RNA purification. Total cellular RNA was isolated from normal and HEV(B)-infected cyno livers by the guanidinium-LiCl precipitation method (Cathala et al., 1983) and poly(A)+ RNA was selected by one round of oligo(dT) cellulose chromatography (Aviv and Leder, 1972).

cDNA library construction and screening. Synthesis and screening of the infectious bile cDNA library have previously been described (Reyes et al., 1990). Oligo(dT)-, random hexamer-, and HEV sequence-specific oligomer-primed (primer A, see Fig. 1) cDNA were synthesized using a commercially available cDNA synthesis kit (Boehringer-Mannheim Biochemicals, Indianapolis, IN), ligated to EcoRI linker-adapters, and cloned into λgt10 (Stratagene, San Diego, CA). G-tailed cDNA was made essentially as described before (Tam et al., 1989). Briefly, first strand cDNA primed with HEV sequence-specific primer C (see Fig. 1) was tailed with dGTP using terminal deoxynucleotidyl transferase. The modified cDNA was then amplified in a polymerase chain reaction (PCR) (Saiki et al., 1985; Mullis and Faloona, 1987) employing the same synthetic HEV primer and an oligo(dC) primer, both of which contained an EcoRI cloning site at the 5' end. All four cDNA libraries were screened with appropriate synthetic oligomer probes (Applied Biosystems, Foster City, CA) described under Results. Hybridizations were generally performed in duplicate (using 32P kinased probes) at 42° in 30% formamide, 5× SSC, 5× Denhardt's (0.1% Ficoll, 0.1% polyvinylpyrrolidone, 0.1% BSA), 50 mM sodium phosphate, pH 7.0, and 50 μg/ml salmon sperm DNA.
After an overnight hybridization, filters were washed three times with 0.2× SSC and 0.1% SDS at 37–42° depending on the length of the oligomer probe.

Primer extension analysis. Primer extension studies were carried out using oligonucleotide primers kinased to a specific activity greater than 3 × 10' cpm/μg with [γ-32P]ATP (ICN Radiochemicals, Irvine, CA) essentially as described (McKnight et al., 1981). Extension products were separated on a 6% polyacrylamide-8 M urea sequencing gel that was subsequently dried and autoradiographed. The sequences for the HEV primers used in these studies are:
Primer A: 5'-CCCGATAAGCAGCCTCAAGCCTC-3'
Primer B: 5'-CCGCGTACACACTAACCCCCCGGCCAATAATTCACGCTGG-3'
Primer C: 5'-CAAGCTGGCGAGGTTGCATTAGG-3'
Primer D: 5'-ACAGCATICGCCAGGGCAGAGTT-3'

Northern blot analysis. Four micrograms of HEV(B)-infected cyno liver poly(A)+ RNA was electrophoresed on a 1.2% agarose gel containing 2.2 M formaldehyde and transferred onto a nitrocellulose filter. The filter was hybridized under high stringency conditions with a radiolabeled BETG-1 EcoRI fragment insert (5 × 10* cpm/μg).

DNA nucleotide sequencing. DNA sequencing was performed by the dideoxynucleotide method (Sanger et al., 1977) using 7-deaza-dGTP (Pharmacia, Piscataway, NJ). All sequencing reactions were carried out on both strands using Bluescript plasmid (Stratagene, San Diego, CA) subclones obtained from HEV λgt10 phage clones. Appropriate overlapping subfragments were exploited wherever possible, or adjoining dissimilar-end subclones were employed for unambiguous orientation. Sequencing primers were commercially available or synthesized based on derived HEV sequences. 7-deaza-dGTP eliminated areas of compression due to the high G + C content of the viral genome (see Results).

Computer analyses of nucleotide and amino acid sequences. Computer programs for manipulation of nucleic acid and protein sequences were obtained from IntelliGenetics (Mountain View, CA).
FIG. 1. HEV cDNA clones were identified from libraries made from randomly primed cyno bile (solid square), or from cyno liver after priming by oligo-dT (solid circle), random sequence hexamers (open circle), and HEV-sequence specific oligonucleotides (open square). The designations given to the various clones are indicated together with their sizes and relative position and overlap along the ~7.5 kb genome. A and B represent synthetic oligonucleotides used for the generation and screening, respectively, of specifically primed cDNA libraries. The anchor PCR strategy using G-tailing and PCR (primer C) was used in the synthesis of primer extension libraries for the extreme 5' end. The procedure yielded numerous clones by hybridization with primer D, of which BET-EXPCR2 is a representative example. Primer extension studies confirmed the 5' extent of the viral genome (see Fig. 2). The BET1 clone contained a long stretch of poly(A) residues at its 3' end, indicating its position at the 3' terminus of the viral genome.

A partial HEV cDNA clone, ET1.1, was isolated by differential screening of a cDNA library constructed from infectious bile collected from a third-passage cyno inoculated with subpassaged fecal suspensions originally derived from Burma patients with well-defined ET-NANBH (Reyes et al., 1990). Bile was chosen as the RNA source for cDNA synthesis because it contained relatively large numbers of virus particles when compared with fecal preparations. It was also expected that the lower sequence complexity would enhance the sensitivity of the differential (plus/minus) screening protocol used for clone identification. ET1.1 contained a 1.3-kb EcoRI fragment that was exogenous to both human and cyno genomic DNA and specifically hybridized to cDNA derived only from infected sources (Reyes et al., 1990). Oligonucleotides based on the end sequences of ET1.1 were used as hybridization probes to rescreen the original bile-derived cDNA library.
The largest identified clone, BETG-1, contained a 2.6-kb EcoRI insert. Restriction mapping revealed that the original ET1.1 clone was contained within the larger BETG-1 (Fig. 1; Fry et al., 1991). The same end-probe strategy was used with oligonucleotides derived from BETG-1 to screen oligo(dT)-primed and random hexamer-primed HEV(B)-infected cyno liver cDNA libraries. A collection of overlapping clones was identified from both libraries (Fig. 1). One of the oligo(dT)-primed clones, BET1, contained two EcoRI fragments that comprised 2.4 kb in total length. The authenticity of the EcoRI site was strengthened by its presence in another clone, BET4, isolated from the random-primed cDNA library. A long poly(A) stretch of ~150–200 adenosine residues was located at the 3' end of BET1, confirming the original observation that genomic RNA could be selected on oligo-dT cellulose. This result indicated that the 3' end of the viral genome was present in the BET1 clone. The 5' end of the viral genome was isolated from a cDNA library made by primer extension using a synthetic 23-bp oligonucleotide complementary to the 5' end of clone BET8 (primer A, see Fig. 1). One of two positive clones identified by an oligonucleotide probe (primer B), located 5' to the specific primer, was clone BET-SP1. This clone contained a single large insert of 2.6 kb. With the acquisition of BET-SP1, the composite cDNA map (omitting overlaps) spanned approximately 7.4 kb from the 5' end of BET-SP1 to the polyadenylated 3' end of clone BET1, in good agreement with the maximum length of HEV RNA as detected on Northern blots. The 5' end of BET-SP1 was therefore believed to be in close proximity to the putative 5' end of the viral genome. Primer extension studies using poly(A)-selected RNA from infected cyno liver were performed in order to firmly establish the distance from the existing 5' end of BET-SP1 to the end of the genome (Fig. 2). Two specific oligonucleotide primers (primers C and D, see Fig.
1) were synthesized 143 and 72 bp from the 5' end of BET-SP1 and used to prime cDNA synthesis after 32P labeling their 5' ends with polynucleotide kinase. The resulting extension products for each synthesis reaction were, respectively, 50 and 51 bp longer than the expected product, thereby suggesting that the 5' end of BET-SP1 was about 50 nucleotides from the 5' end of the virus (Fig. 2). After several failed attempts at cloning the remaining 5' end sequences by oligonucleotide hybridization of specifically primed cDNA libraries, an alternative expansion/enrichment procedure of PCR amplification of specifically primed G-tailed cDNA was applied (Tam et al., 1989). An aliquot of the amplified material was EcoRI digested, electrophoresed, blotted, and probed with a 5' internal HEV oligomer (primer D). This hybridization study confirmed the amplification of the desired HEV extension products (data not shown). This same DNA, after preparative gel electrophoresis, was recovered and ligated into λgt10. The specific priming procedure (followed by PCR amplification) resulted in a high percentage (over 10%) of HEV-positive recombinants in the enriched library. BET-EXPCR2 is a representative clone from over 50 analyzed; all of these were 50 bp in length and therefore in agreement with the primer extension experiment. The isolation of BET-EXPCR2 completed the HEV genomic cDNA cloning.

The entire nucleotide and deduced amino acid sequence of HEV are presented in Fig. 3. The nucleotide composition of the HEV genomic RNA is 17% A, 32% C, 26% G, and 25% U, conferring an overall G + C content of 58%. Sequence homology to any nucleotide sequences contained in the GenBank database could not be detected when the HEV sequence was searched in either the forward or reverse orientation. Only two regions were identified that had homology with previously described nonstructural gene elements present in other positive strand RNA viruses (see below).
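The G + C figure quoted above follows directly from the reported base composition. As a minimal sketch (the function names and the toy input are illustrative assumptions, not part of the original analysis), the composition of any sequence and its G + C content could be tallied like this:

```python
from collections import Counter


def base_composition(seq: str) -> dict[str, float]:
    """Return the percentage of A, C, G and U in a sequence.

    cDNA input is accepted: T is counted as U. Characters other than
    A/C/G/U (e.g. N or gaps) are ignored.
    """
    seq = seq.upper().replace("T", "U")
    counts = Counter(base for base in seq if base in "ACGU")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/U bases")
    return {base: 100.0 * counts[base] / total for base in "ACGU"}


def gc_content(composition: dict[str, float]) -> float:
    """G + C percentage from a composition dictionary."""
    return composition["G"] + composition["C"]


# Percentages reported in the text for HEV(B) genomic RNA.
reported = {"A": 17.0, "C": 32.0, "G": 26.0, "U": 25.0}
print(gc_content(reported))  # 58.0, matching the stated G + C content

# Hypothetical toy sequence, only to show the function on raw input.
print(base_composition("GGCCAT"))
```

Applied to the full sequence of Fig. 3, the same tally would be expected to reproduce the percentages given in the text.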
Using the COD RNY sequence analysis program, the ~7.2-kb of HEV sequence, exclusive of the 3' poly(A) tract, was analyzed for the presence of open reading frames (ORF) in the six possible translation frames (Fig. 4). The identification of the RNA-dependent RNA polymerase in the original ET1.1 clone (Reyes et al., 1990) and strand-specific probe hybridization (Reyes et al., 1991b) established the positive-sense orientation of the HEV genome. A representation of the potential ORFs and stop codons in the three positive-polarity frames is presented in Fig. 4a. Two large potential ORFs were found in the first and second reading frames. ORF1 begins at the 5' end of the viral genome after 27 bp of apparent noncoding sequence, and then extends 5079 bp before termination at nucleotide position 5107. The second major ORF (ORF2) begins at nucleotide position 5147 and extends 1980 bp before terminating 65 bp upstream of the poly(A) tail. The termination of ORF1 and the transition into ORF2 was confirmed by sequencing the region in question five times using two different HEV sequence-specific primers. The sequence of a second clone in this region yielded the same results. Furthermore, cDNA clones isolated directly from infected human (Huang et al., 1991, data not shown). A third positive-polarity reading frame of 369 bp (ORF3) overlaps both ORF1 and ORF2 and was found by independent experiments to encode an immunoreactive epitope recognized by sera from HEV-infected humans and animals (Yarbough et al., 1991; Reyes et al., 1991b). No ORFs greater than 590 bp were identified by computer search of the negative-polarity RNA strand (Fig. 4b). The nucleotide frequencies at each codon position were also analyzed and a comparison was made with two other hepatotropic positive-strand RNA viruses (Table 1). The overall frequencies for HEV and HCV are similar (~58% G + C), but differ markedly from that of HAV (37% G + C).
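The ORF coordinates quoted above can be cross-checked with simple arithmetic. The sketch below assumes 1-based nucleotide numbering and that the stated ORF lengths (5079 and 1980 bp) exclude the 3-nt stop codon; both are assumptions made for this illustration. The 7194-base genome length is the figure given later in the text.

```python
# Values taken from the text.
GENOME_LENGTH = 7194      # bases, excluding the poly(A) tail
NONCODING_5P = 27         # apparent 5' noncoding sequence
ORF1_CODING_LEN = 5079
ORF2_START = 5147
ORF2_CODING_LEN = 1980
STOP_LEN = 3              # assumption: stop codon not counted in ORF length


def last_coding_nt(start: int, length: int) -> int:
    """1-based position of the last coding nucleotide of an ORF."""
    return start + length - 1


orf1_start = NONCODING_5P + 1                              # 28
orf1_last = last_coding_nt(orf1_start, ORF1_CODING_LEN)    # 5106
orf1_stop_first = orf1_last + 1                            # 5107
orf1_stop_last = orf1_last + STOP_LEN                      # 5109

# Nucleotides lying between the ORF1 stop codon and the start of ORF2.
inter_orf_gap = ORF2_START - orf1_stop_last - 1            # 37

orf2_stop_last = last_coding_nt(ORF2_START, ORF2_CODING_LEN) + STOP_LEN  # 7129
noncoding_3p = GENOME_LENGTH - orf2_stop_last              # 65

print(orf1_stop_first, inter_orf_gap, noncoding_3p)  # 5107 37 65
print(ORF1_CODING_LEN // 3, ORF2_CODING_LEN // 3)    # 1693 660 codons
```

Under these assumptions the numbers are mutually consistent: ORF1 terminates at position 5107, ORF2 starts 37 nt downstream of it, and 65 nt separate the ORF2 stop codon from the poly(A) tail.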
FIG. 4. Computer generated open reading frame analysis of the entire HEV nucleotide sequence is presented in both the forward (a) and reverse directions (b). The positions of all termination codons are depicted by arrows. The three forward ORFs are numbered 1, 2, and 3 and those on the opposite strand similarly labeled. The forward (positive-sense) orientation was defined by strand-specific hybridization of genomic RNA (Reyes et al., 1991b) and the identification of consensus sequence motifs related to nonstructural gene products in ORF1 (Kamer and Argos, 1984). The horizontal line running through the various ORFs indicates the ORF with the highest probability of encoding protein as predicted by the algorithm devised by Shepherd based on the RNY codon analysis in all three ORFs (Shepherd, 1981). The hydrophilicity plot of ORF2 is presented in (c). The dotted line plotted at the -5 value on the y-axis represents the midpoint, with hydrophobic domains above and hydrophilic domains below the dotted line. Note the large hydrophobic region at the beginning of the sequence that marks the putative signal sequence highlighted in Fig. 3.

The relatively high G + C content results in a higher overall frequency of codons containing G + C throughout the HEV coding sequence. This contrasts with the CG dinucleotide discrimination in the second and third position that has been noted in human coding sequences (Nussinov, 1981). There appears to be a slight selection for codons ending in C, which is also seen with HCV; however, the discrimination against codons ending in A is far more apparent (~9%) and is a unique feature of HEV when compared to HAV and HCV. The third position discrimination against A is shared by the structural ORF region of another positive-strand RNA virus, rubella virus, where the frequency of A is only 7% (Frey and Marr, 1988).
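The per-position comparison described above (Table 1) amounts to counting bases separately at the first, second, and third position of each codon. A minimal sketch follows; the function name and the 12-nt toy coding sequence are hypothetical and are not HEV data.

```python
from collections import Counter


def codon_position_frequencies(cds: str) -> list[dict[str, float]]:
    """Percent A/C/G/U at codon positions 1, 2 and 3 of an in-frame sequence.

    Trailing bases that do not complete a codon are ignored; T is read as U.
    """
    cds = cds.upper().replace("T", "U")
    usable = len(cds) - len(cds) % 3
    if usable == 0:
        raise ValueError("sequence is shorter than one codon")
    frequencies = []
    for position in range(3):
        counts = Counter(cds[position:usable:3])
        total = sum(counts[base] for base in "ACGU")
        if total == 0:
            raise ValueError("no A/C/G/U bases at this codon position")
        frequencies.append(
            {base: 100.0 * counts[base] / total for base in "ACGU"}
        )
    return frequencies


# Hypothetical toy coding sequence: AUG GCC GCA UAA
toy = codon_position_frequencies("AUGGCCGCAUAA")
for number, freq in enumerate(toy, start=1):
    print(f"position {number}: {freq}")
```

A bias such as the low third-position A frequency reported for HEV would appear as a small `"A"` value in the third dictionary when the function is run on the real coding sequence.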
HEV and HCV are also similar in their apparent preference for G in the first coding position (35 and 33%, respectively). The computer translation of the partial nucleotide sequence from clone ET1.1 led to the detection of a conserved amino acid motif recognized in all positive-strand RNA viruses (Reyes et al., 1990; Fry et al., 1991). The canonical Gly-Asp-Asp (GDD) tripeptide (amino acids 1550–1552, identified by asterisks in Fig. 3) is believed to encode a portion of the RNA-dependent RNA polymerase (RDRP) gene critical to viral replication (Kamer and Argos, 1984). Translation of the complete ORF1 revealed a second region 5' to the RDRP gene bearing similarity to another nonstructural gene product (Fry et al., 1991). Two well-conserved sequence motifs have been found in association with purine nucleoside triphosphate (NTP)-binding activity (Gentry, 1985; Strauss and Strauss, 1988). The first, site A (G/AXXXXGKS/T), is represented in the HEV sequence by GVPGSGKS at amino acid position 975–982 (underlined in Fig. 3). A version of the second NTP-binding motif, site B (DEAD), occurs approximately 46 amino acids downstream (3') from site A and is represented in HEV by the partially conserved amino acid sequence DEAP at position 1029–1032 (underlined in Fig. 3). The latter site is believed to interact with the Mg2+ cation of the Mg-NTP complex for RNA- or DNA-dependent NTPase activity. A superfamily of helicases involved in replication, recombination, and DNA repair has been described with consensus features similar to those described here for NTP-binding (Gorbalenya et al., 1989). These nonstructural gene similarities are seen in other geographically distinct isolates of HEV and may indicate a putative helicase function for this region. The localization of an NTP-binding domain and the RDRP gene to ORF1 is consistent with a genomic organization where the nonstructural genes are expressed from the 5' end of the viral genome.
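The motif matches described above can be expressed as a short pattern search. The sketch below is illustrative only: the regular expression is one way of writing the site A consensus (G/A)XXXXGK(S/T), the helper names are hypothetical, and the flanking residues in the usage example are invented.

```python
import re

# Site A consensus from the text: (G/A)XXXXGK(S/T), X = any residue.
SITE_A = re.compile(r"[GA].{4}GK[ST]")
SITE_B_CONSENSUS = "DEAD"


def find_site_a(protein: str) -> list[tuple[int, int, str]]:
    """Return (start, end, match) for each site A hit, 1-based inclusive."""
    return [
        (m.start() + 1, m.end(), m.group())
        for m in SITE_A.finditer(protein.upper())
    ]


def mismatches(observed: str, consensus: str) -> int:
    """Number of positions at which two equal-length motifs differ."""
    if len(observed) != len(consensus):
        raise ValueError("motifs must have the same length")
    return sum(a != b for a, b in zip(observed, consensus))


# The HEV site A sequence satisfies the consensus.
assert SITE_A.fullmatch("GVPGSGKS") is not None

# HEV carries DEAP where the consensus is DEAD: one substitution.
print(mismatches("DEAP", SITE_B_CONSENSUS))  # 1

# Spacing, using the residue numbers quoted in the text.
SITE_A_END, SITE_B_START, GDD_START = 982, 1029, 1550
print(SITE_B_START - SITE_A_END - 1)  # 46 residues between site A and site B
print(GDD_START - SITE_A_END)         # 568, from the end of site A to GDD

# Hypothetical peptide: site A embedded in invented flanking residues.
print(find_site_a("MMGVPGSGKSLL"))    # [(3, 10, 'GVPGSGKS')]
```

The 46-residue spacing agrees with the "approximately 46 amino acids" stated above, and the 568-residue offset is the site A to GDD distance that the text later compares with feline calicivirus.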
Translation of the second major open reading frame, ORF2, indicated a novel polypeptide not present in the PIR protein database. The hydropathicity plot of the sequence indicated a large hydrophobic domain at the amino terminus of ORF2 followed by a hydrophilic electropositive peak (Fig. 4c). The hydrophobic region marks a typical signal sequence (amino acids 5 to 22) and contains a potential cleavage site (PAIPPP) as predicted by the IntelliGenetics eukaryotic secretory signal sequence program. In ORF2, between residues 22 and 322, nearly 10% of the amino acids are arginine, conferring a high isoelectric point (pI = 10.35) to the first half of the ORF2 polypeptide. The basic charge of capsid proteins is believed to indicate their involvement in the encapsidation of the genomic transcript by effectively neutralizing the electronegatively charged RNA (Dalgarno et al., 1983; Rice et al., 1985). The mechanism of capsid assembly in HEV, and the exact nature of the membrane targeting (if any) of the ORF2 polypeptide, will require further study. Such studies will be facilitated by the availability of an appropriate in vitro propagation system for HEV and immunospecific anti-HEV reagents. The utilization of ORF2 was substantiated by the independent isolation of a cDNA clone by immunoscreening of a λgt11 cDNA expression library made from the HEV (Mexico) isolate (Yarbough et al., 1991). That λgt11 clone mapped to the 3' end of ORF2. These same experiments identified a second cDNA epitope clone that was localized to ORF3, the third positive-polarity open reading frame that overlaps both ORF1 and ORF2. The fact that sera from acutely infected humans and animals detected HEV antigens encoded by ORF2 and ORF3 confirmed their expression and established that the virus utilized all three positive-polarity reading frames.
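The hydropathy profile and arginine content discussed above can be approximated with a sliding-window average. The text does not state which hydropathy scale or window the original program used, so the sketch below substitutes the Kyte-Doolittle scale and a 9-residue default window as assumptions; the function names and the toy peptide are likewise hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
# Assumption: the scale used for the published plot is not specified.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list[float]:
    """Mean hydropathy of every window of the given width along the protein."""
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the protein length")
    scores = [KYTE_DOOLITTLE[residue] for residue in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def residue_fraction(protein: str, residue: str = "R") -> float:
    """Fraction of positions occupied by one residue (arginine by default)."""
    if not protein:
        raise ValueError("empty sequence")
    return protein.upper().count(residue) / len(protein)


# Hypothetical peptide: a hydrophobic stretch followed by a basic stretch,
# loosely mimicking a signal sequence followed by an arginine-rich region.
toy = "LLLLLRRRRR"
profile = hydropathy_profile(toy, window=5)
print([round(value, 2) for value in profile])
print(residue_fraction(toy))
```

On the real ORF2 translation, a profile of this kind would be expected to start high (the putative signal sequence) and then drop over the arginine-rich region, and `residue_fraction` over residues 22 to 322 should come out near the 10% quoted in the text.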
The presence of a consensus signal sequence motif in ORF2, together with the immunodominant seroreactivity of an identified epitope (Yarbough et al., 1991), suggested that the viral structural protein(s) were encoded by this region of the genome. The mechanism by which ORF2 and ORF3 are expressed was suggested by a Northern blot hybridization using the BETG-1 clone as probe (Fig. 5). In addition to the previously identified poly(A) transcript of ~7.5 kb, the probe also hybridized to subgenomic messages of 2.0 and 3.7 kb present in the infected cyno liver. It is of note that ET1.1 did not originally identify these subgenomic messages (Reyes et al., 1990) and other Northern blot studies using probes located 5' to ET1.1 also did not hybridize to these viral-specific transcripts (data not shown). These same subgenomic messages were identified in poly(A)-selected RNA from HEV(M)-infected cyno liver when the epitope-encoding clones were used as probes (Yarbough et al., 1991). The ORF2 epitope is located at the extreme 3' end of that reading frame (Yarbough et al., 1991), therefore indicating that these messages may be co-terminal with the 3' end of the viral genomic transcript. It is possible that these polyadenylated subgenomic messages are used in the expression of ORF2 and ORF3.

FIG. 5. Northern blot analysis of HEV (Burma)-infected cyno liver RNA. Three HEV transcripts were detected using the 2.6-kb EcoRI insert from BETG-1 as probe. Numbers to the left represent the sizes of the three hybridizing RNA species as determined relative to RNA size markers. HEV cDNA probes were negative against similarly prepared RNA from uninfected liver (data not shown).

ET-NANBH has been well documented in both sporadic and epidemic outbreaks throughout the developing world.
Hepatitis E virus has been established as the major causative agent of ET-NANBH by the association of HEV-specific sequences with human specimens derived from six geographically diverse epidemics and also through the detection of these same sequences in various specimens derived from experimentally infected animals (Reyes et al., 1990). HEV viral particles recovered from infected patients are similar to those recovered from infected primates. The virus contains a single-strand, positive-sense RNA genome of approximately 7.5 kb. The nucleotide sequence described here comprises 7194 bases excluding the poly(A) tail. If the 3' stretch of adenosine residues (at least 150–200 nucleotides) is included, the determined sequence agrees well with the genome size originally estimated by Northern hybridization studies (Reyes et al., 1990). Open reading frame analysis of the nucleotide sequence revealed two major positive-polarity ORFs. A portion of ORF1 appears to encode the RDRP gene of the virus. The highly conserved amino acid residues, including the invariant GDD tripeptide found in all positive-strand animal and plant RNA viruses, can be located in the deduced amino acid sequence (Reyes et al., 1990). Additional evidence for the encoded polyprotein having a function in viral replication is provided by the presence of conserved motifs involved in purine NTPase activity found in a variety of cellular and viral helicases (Geider and Hoffman-Berling, 1981). These helicases promote the unwinding of DNA, RNA, or DNA-RNA duplexes required for genome replication, recombination, repair, and transcription. The deduced amino acid sequence of ORF2 suggests that it encodes a capsid-like peptide following the canonical signal sequence at its 5' end. ORF2 would appear to be the major ORF encoding the viral structural protein(s).
An identified immunoreactive epitope in ORF3 indicates that the virus utilizes all three positive-polarity frames for encoding viral proteins (Yarbough et al., 1991; Reyes et al., 1991c). This pattern of gene expression employed by HEV has not been described in the various families of single-stranded positive-sense, nonenveloped RNA viruses affecting humans or animals. Among the enveloped RNA viruses, the structural proteins of rubella virus and certain alphaviruses are found in a different reading frame from those encoding the nonstructural proteins and are also expressed from a subgenomic 3' end transcript (Ou et al., 1982). The presence of HEV-specific subgenomic RNAs localized to the 3' one-third of the genome suggests that these may be the transcripts from which these 3' end ORFs are expressed and is indicative of a unique expression strategy among nonenveloped positive-sense RNA viruses infecting humans. The mechanism by which these subgenomic transcripts are generated is unknown. The differential abundance of the various messages (i.e., 7.5 kb > 2 kb > 3.7 kb; see Fig. 5) does, however, suggest active transcriptional regulation rather than genomic RNA fragmentation as the means by which these subgenomic messages are generated. This would in turn imply the existence of an internal RNA initiation sequence and expression from the anti-genomic strand. Experiments are currently in progress to map the 5' ends of these subgenomic transcripts. At this time, however, we cannot exclude other mechanisms of expression for ORF2 and ORF3, including frameshifting or internal translation initiation, although there is little evidence for the latter among other positive-sense RNA viruses (March and Haenni, 1987). It is also possible that complex RNA splicing could account for these subgenomic messages, although Northern hybridization probes from the extreme 5' end failed to detect the 3.7- and 2.0-kb messages (A. W. Tam, unreported).
FIG. 6. HEV genomic organization: The proposed organization of the HEV genome is presented with the nonstructural genes encoded by the 5' ORF1 and the structural genes located at the 3' end of the genome (ORF2 and possibly ORF3). The genomic organization, nature of the virus particle (enveloped or nonenveloped), presence of subgenomic messages, and the presence of a 3' terminal poly(A) addition are compared for the various positive-sense, single-stranded RNA virus families. The relative locations of the various virus sequence motifs are also indicated: HEL, putative helicase motif or NTP binding domain; POL, RNA-directed RNA polymerase; SS, signal sequence; IRE, immunoreactive epitope; S, structural gene coding region; NS, nonstructural gene coding region.

It is postulated from the proposed genomic organization of HEV, as presented in Fig. 6, that the nonstructural viral proteins are translated from the full-length genomic RNA. The 5' nonstructural/3' structural genomic organization of HEV is similar to that found in the alphavirus, rubivirus, and coronavirus families (see Fig. 6). There is an absence of any significant homology with these enveloped viruses at both the nucleotide and amino acid levels (excluding the canonical amino acid residues noted above for the nonstructural gene products). Immune electron microscopy has clearly established that the virions of HEV are 27- to 34-nm nonenveloped viral particles and are therefore clearly distinguished from these enveloped viruses. Picornaviruses are small nonenveloped, single-stranded, positive-sense, polyadenylated RNA viruses. The various members of the Picornaviridae, however, exhibit vastly different genomic organization (Siddell, 1987). HEV has been shown to be unrelated antigenically and biophysically to picornaviruses (Arankalle, 1988).
It was previously hypothesized that HEV is calicivirus-like based on the biophysical characterization of viral particles (Bradley and Balayan, 1988; Bradley et al., 1988a,b). Recently the nucleotide sequence of a large portion of the nonstructural gene region of feline calicivirus (FCV) has become available for comparison to HEV (Neill, 1990). Although FCV has an overall genomic organization similar to that of HEV, with 5'-nonstructural and 3'-structural genes, it is clear that FCV shares a higher degree of similarity with picornaviruses in the recognized nonstructural gene motifs. The proposed gene order for the nonstructural polypeptides in FCV is 2C (NTP-binding), 3C (cysteine protease), followed by the 3D gene (RDRP). The distance between the NTP-binding site motif A and the GDD triplet of the RDRP is 1100 amino acids in FCV compared to the 568 amino acids in HEV. In addition, there is no evidence in the HEV sequence for an intervening cysteine protease-like region as recognized in FCV. These findings would further suggest that HEV represents either the prototype member of an as yet unclassified novel virus family or perhaps a separate genus within the Caliciviridae. It is too early, however, to propose a definitive classification of HEV beyond the hypothesis presented here based on the proposed genetic organization and expression strategy of the virus.

fecal-oral transmitted non-A, non-B hepatitis induced in monkeys
Aetiological association of a virus-like particle with enterically transmitted non-A, non-B hepatitis
Purification of biologically active globin messenger RNA by chromatography on oligothymidylic acid-cellulose
Evidence for a virus in non-A, non-B hepatitis transmitted via the fecal-oral route
Hepatitis non-A, non-B viruses become identified as hepatitis C and E viruses
Enterically-transmitted non-A, non-B hepatitis
Virus of enterically transmitted non-A, non-B hepatitis
Etiology and natural history of post-transfusion and enterically-transmitted non-A, non-B hepatitis
Enterically transmitted non-A, non-B hepatitis: serial passage of disease in cynomolgus macaques and tamarins and recovery of disease-associated 27- to 34-nm viruslike particles
Enterically transmitted non-A, non-B hepatitis: Etiology of disease and laboratory studies in nonhuman primates
Aetiological agent of enterically transmitted non-A, non-B hepatitis
A method for isolation of intact translationally active ribonucleic acid
Isolation of cDNA clone derived from a blood-borne non-A, non-B viral hepatitis genome
Ross River virus 26S RNA: Complete nucleotide sequence and deduced sequence of the encoded structural proteins
Transfusion-associated hepatitis not due to viral hepatitis type A or B
Sequence of the region coding for virion proteins C and E2 and the carboxy terminus of the nonstructural proteins of rubella virus: Comparison with alphaviruses
Hepatitis E virus (HEV): Strain variation in the nonstructural gene region encoding consensus motifs for an RNA-dependent RNA polymerase and an ATP/GTP binding site
Locating a nucleotide binding site in the thymidine kinase of vaccinia virus and of herpes simplex virus by scoring triply aligned protein sequences
Two related superfamilies of putative helicases involved in replication, recombination, repair and expression of DNA and RNA genomes
Hepatitis E virus: Cloning of a Mexican strain and comparison to the Burmese strain
Primary structural comparison of RNA-dependent polymerases from plant, animal, and bacterial viruses
Study of an epidemic of non-A, non-B hepatitis: Possibility of another human hepatitis virus distinct from posttransfusion non-A, non-B type
The detection and classification of membrane-spanning proteins
Enterically transmitted non-A, non-B hepatitis: Identification of virus-associated antigen in experimentally infected cynomolgus macaques
A cDNA fragment of hepatitis C virus isolated from an implicated donor of post-transfusion non-A, non-B hepatitis in Japan
An assay for circulating antibodies to a major etiologic virus of human non-A, non-B hepatitis
A cDNA clone closely associated with non-A, non-B hepatitis
Analysis of transcriptional regulatory signals of the HSV thymidine kinase gene: Identification of an upstream control region
Organization of plant virus genomes that comprise a single RNA molecule
Specific synthesis of DNA in vitro via a polymerase catalysed chain reaction
Nucleotide sequence of a region of the feline calicivirus genome which encodes picornavirus-like RNA-dependent RNA polymerase, cysteine protease and 2C polypeptides
Eukaryotic dinucleotide preference rules and their implications for degenerate codon usage
Sequence studies of several alphavirus genomic RNAs in the region containing the start of the subgenomic RNA
Long-incubation post-transfusion hepatitis without serological evidence of exposure to hepatitis B virus
Enterically transmitted non-A, non-B hepatitis: epidemiology and clinical characteristics
Molecular biology of non-A, non-B hepatitis agents: the hepatitis C and hepatitis E viruses
Molecular cloning of the hepatitis E virus.
/n "Viral Hepatitis and Liver Disease Hepatitis E virus (HEV): Epitope mapping and detection of strain variation Hepatitis E Virus (HEV): The novel agent responsible for enterically transmitted non-A, non-B hepatitis Isolation of a cDNA from the virus responsible for enterically transmitted non-A, non-B hepatitis Nucleotide sequence of yellow fever virus: Implications for flavivirus gene expression and evolution Enzymatic amplification of b-globin genomic sequences and restriction site analysis for diagnosis of sickle cell anemia DNA sequencing with chain terminating inhibitors. froc Method to determine the reading frame of a protein from the purinelpyrimidine genome sequence and its possible evolutionary justification. hoc The organization and expression of coronavirus genomes Evolution of RNA viruses The three viruses of non-A, non-B hepatitis Construction of cDNA libraries from small numbers of cells using sequence independent primers. Nucleic Acids ffes Epidemic and endemic hepatitis in India: evidence for a non-A, non-B hepatitis virus aetiology Hepatitis E virus: Identification of type common epitopes, J Viral We thank Dr. Michael Lovett for his review and comments on this manuscript and appreciate the expert assistance of R. Cuevas, J. Fernandez, and the Genelabs Visual Arts Department.