key: cord-0752815-5rovh1xf authors: Godeny, E. K.; Speicher, D. W.; Brinton, M. A. title: Map location of lactate dehydrogenase-elevating virus (LDV) capsid protein (Vp1) gene date: 1990-08-31 journal: Virology DOI: 10.1016/0042-6822(90)90546-4 sha: 75e66da0fccbae4e094e2ff72e3fff10dc9c3be9 doc_id: 752815 cord_uid: 5rovh1xf

Abstract

Lactate dehydrogenase-elevating virus (LDV) is currently classified within the Togaviridae family. In an effort to obtain further information on the characteristics of this virus, we have begun to sequence the viral RNA genome and to map the virion structural protein genes. A sequence of 1064 nucleotides, which represents the 3′ terminal end of the genome, was obtained from LDV cDNA clones. A 3′ noncoding region of 80 nucleotides and two complete open reading frames (ORFs) were found within this sequence. The two ORFs were in different reading frames and overlapped each other by 11 nucleotides. One ORF encoded a protein of 170 amino acids and the other ORF, located adjacent to the 3′ noncoding region of the viral genome, encoded a 114 amino acid protein. Thirty-three N-terminal residues were sequenced directly from purified LDV capsid protein, Vp1, and this amino acid sequence mapped to the ORF adjacent to the 3′ noncoding region. The presence of overlapping ORFs and the 3′ terminal map position of Vp1 indicate that LDV differs significantly from the prototype alpha togaviruses.

The Togaviridae family consists of small, spherical, enveloped viruses with icosahedral nucleocapsids and single-stranded RNA genomes of positive polarity. Recent sequence analyses indicate that many of the viruses initially classified in this family differ from the prototype alpha togaviruses in their genome organizations and replication strategies (1-4). Lactate dehydrogenase-elevating virus (LDV) is currently classified as a togavirus (1).
The LDV particle has a diameter of 50-55 nm and the diameter of the nucleocapsid has been estimated to be 30-35 nm (5, 6). The genome of LDV is a single-stranded RNA molecule of positive polarity (5) which contains a poly(A) tract at its 3' terminus (7, 8). The estimated molecular weight of the genome is 5 × 10^6 Da (5, 9). LDV particles are composed of at least three structural proteins: the capsid protein, Vp1, with a molecular weight of 15,000 Da; a nonglycosylated envelope protein, Vp2, with a molecular weight of 18,000 Da; and an envelope glycoprotein, Vp3, which exhibits a heterogeneous migration pattern on SDS-PAGE with an estimated molecular weight range between 24,000 and 44,000 Da (5, 10). It is not known whether the Vp3 region on the gel contains more than one protein or various differentially glycosylated forms of the same protein.

Although in vivo replication of LDV is highly efficient, it is difficult to produce virus in tissue culture; no cell line has yet been found which can efficiently support LDV replication. LDV replicates in primary murine cell cultures containing macrophages (11), but only a small subpopulation (6-20%) of cells in these macrophage cultures is permissive for LDV replication (12). Since such a small proportion of cells are infected in the macrophage cultures, it has not been possible to detect intracellular viral components in cell culture extracts. Because of the technical difficulties inherent in studying LDV replication in tissue culture, neither the gene order nor the replication strategy of this virus has yet been delineated.

In order to obtain the information needed to definitively classify LDV, we have begun to map the structural proteins of the neurotropic isolate of LDV (LDV-C; 13) on the viral genome. LDV-C (approximately 2 × 10" ID50) was purified from blood plasma taken from 50 CD-1 mice (Charles River Breeding Laboratories, Boston, MA) as previously described (8).
The viral structural proteins were separated by SDS-PAGE and transferred electrophoretically to a polyvinylidene difluoride (PVDF) membrane (Millipore Corp., Bedford, MA) using a modified Towbin Tris-glycine buffer (12.5 mM Tris, 96 mM glycine, pH 8.3) containing no methanol (14). After staining with Coomassie blue, the Vp1 protein band, which migrated with an apparent molecular weight of 14,000 Da (data not shown), was excised from the PVDF membrane and analyzed in an Applied Biosystems Model 475A protein sequencer. Automated protein sequence analysis was performed in the gas phase mode with on-line PTH analysis using a Model 120A analyzer as previously described (15). A 33-residue N-terminal amino acid sequence was obtained for Vp1.

Copyright © 1990 by Academic Press, Inc. All rights of reproduction in any form reserved.

No homologous sequence was found in a search of the National Biomedical Research Foundation Protein Sequence data base.

We previously reported an LDV-C cDNA clone, dt4, which was synthesized by oligo-deoxythymidine priming of the viral RNA and represents the 3' terminus of the LDV-C genomic RNA (8). The genomic RNA of LDV-C was recloned as previously described (8) using calf thymus (ct) pentameric DNA for priming. Both strands of the double-stranded cDNA clones were sequenced by the dideoxy chain termination method as previously described (8) until complementary overlapping sequences were obtained within each clone. As shown in Fig. 1, four ct-primed clones (b24, b63, b104, and ~44) were found to contain long poly(A) tracts at one end. These four clones completely overlap the unique sequence of the dt4 clone. The longest poly(A) tract found among these clones was 52 nucleotides in length, which is very close to the length of the 3' terminal poly(A) tract (approximately 50 nucleotides) previously estimated directly from the LDV genomic RNA (7). Two additional clones, b90 and a16, further extended the 5' end of this sequence (Fig. 1).
The sequence obtained from the seven DNA clones extends 1064 nucleotides beyond the 3' terminal poly(A) tract of the LDV-C genome. This sequence, which has been converted to the viral RNA sequence, is shown in Fig. 2. Because of the high mutation rate characteristic of RNA virus replication, multiple clones were sequenced in order to obtain the majority nucleotide for any given position. With the exception of the reported change in the dt4 clone (8) at position 976 of this sequence, identical nucleotide sequences were found in all clones at all positions represented by three or more clones. Although the regions between nucleotides 89 and 207, 292 and 421, and 565 and 657 were each sequenced from only two overlapping clones, the sequences obtained were identical except at position 311. At position 311, clone b63 contained uridine, whereas clone b90 contained cytosine (Fig. 2). The regions between nucleotides 1 through 88, 208 through 291, and 422 through 564 have not yet been confirmed in overlapping clones.

When this 1064 nucleotide sequence was translated using the Sequence Analysis Package of the Wisconsin Genetics Computer Group (16), two complete ORFs were found in different reading frames (Fig. 2). No ORF of significant size was found in the third reading frame. One ORF begins with a start codon (AUG) at nucleotide 135 and ends with an ochre termination codon (UAA) at nucleotide 648. This ORF encodes a protein of 170 amino acids (denoted as VpX, Fig. 2). The nucleotide ambiguity at position 311 does not change the encoded amino acid. Although the identity of the encoded protein is not yet known, its amino acid sequence does not contain potential N-linked glycosylation sites, and the ORF is not long enough to encode the envelope glycoprotein Vp3. We have not yet obtained sufficient data to determine whether this ORF encodes the Vp2 protein.
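The three-frame translation step described above can be illustrated with a minimal sketch. This is not the Wisconsin Genetics Computer Group package used in the study; the function name `find_orfs`, the `min_codons` threshold, and the toy sequence are all assumptions introduced here for illustration, and the toy sequence is not LDV sequence.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_orfs(rna, min_codons=2):
    """Scan the three forward reading frames for AUG-initiated ORFs.

    Returns a sorted list of tuples (start, stop_start, frame, n_sense_codons):
      start          1-based position of the A of the initiating AUG
      stop_start     1-based position of the first nucleotide of the stop codon
      frame          0, 1, or 2 (offset of the frame from nucleotide 1)
      n_sense_codons number of codons from the AUG up to, not including, the stop
    Only the first AUG after the previous in-frame stop is reported.
    """
    rna = rna.upper().replace("T", "U")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if start is None and codon == "AUG":
                start = i  # remember the first in-frame start codon
            elif start is not None and codon in STOP_CODONS:
                n_sense = (i - start) // 3
                if n_sense >= min_codons:
                    orfs.append((start + 1, i + 1, frame, n_sense))
                start = None  # resume the search after this stop codon
    return sorted(orfs)


# Hypothetical toy sequence with two overlapping ORFs in different frames.
TOY_RNA = "GGAUGAAACCAUGGUAACUUAGCCAAAA"

for orf in find_orfs(TOY_RNA):
    print(orf)
```

In the toy sequence the second ORF's AUG lies inside the 3' end of the first ORF and in a different frame, which is the same kind of arrangement reported for the two complete LDV ORFs.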
The 5' end of the second ORF overlaps the 3' end of the ORF described above by 11 nucleotides and is in a different reading frame (Fig. 2). The sequence of the overlap region between these two ORFs was confirmed in four clones. The second ORF begins with a start codon at position 637 and ends with a single termination codon (UAG) at nucleotide 982. The 3' noncoding region of the LDV genomic RNA is 80 nucleotides in length (Fig. 2). The N-terminal amino acid sequence obtained directly from the LDV-C Vp1 protein was found to map to the 5' terminus of the second ORF (Fig. 2). This ORF encodes a 114 amino acid protein which would have a molecular weight of approximately 12,200 Da. The estimated molecular weight of Vp1 is 15,000 Da (5, 10). The amino acid sequence of Vp1 indicates that, like other RNA virus capsid proteins, the LDV capsid protein is a basic protein: 16% of the residues in this sequence are basic amino acids (lysine or arginine) at pH 7, while only 3% of the residues are acidic (aspartic acid or glutamic acid). Consistent with its amino acid composition, Vp1 migrated to the upper pH range (pH ≥ 8; data not shown) when electrophoresed on a two-dimensional gel (17).

Another partial ORF is located at the 5' end of the nucleotide sequence. This ORF is at least 144 nucleotides in length and is in a different frame from the adjacent ORF, but in the same frame as the capsid protein ORF. The 3' terminus of this ORF overlaps the 5' end of the adjacent ORF by 10 nucleotides (Fig. 2). Because the sequence of the 5' end of this ORF is incomplete, we do not yet know the length or identity of the protein encoded by this ORF. There is one potential N-linked glycosylation site near the 5' end of this partial ORF.

Although LDV is morphologically similar to the togaviruses and flaviviruses, the presence of multiple ORFs in the LDV genome and the 3' terminal location of the capsid protein gene suggest that LDV is neither a togavirus nor a flavivirus.
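The ORF coordinates reported above can be cross-checked with simple arithmetic. The sketch below is illustrative only and rests on one assumption that the text does not state: each reported termination position (648 and 982) is taken to be the first nucleotide of the stop codon. All function and variable names are introduced here for illustration.

```python
# Values reported in the text (1-based nucleotide positions).
SEQUENCE_LENGTH = 1064          # nucleotides preceding the poly(A) tract
VPX_START, VPX_STOP = 135, 648  # first complete ORF (VpX)
VP1_START, VP1_STOP = 637, 982  # second complete ORF (capsid protein Vp1)
PARTIAL_ORF_OVERLAP = 10        # overlap of the 5' partial ORF with VpX

# ASSUMPTION (not stated in the text): VPX_STOP and VP1_STOP are the first
# nucleotide of their stop codons, so sense codons end one position earlier.


def reading_frame(position):
    """Frame (0, 1, or 2) of a codon whose first nucleotide is at `position`."""
    return (position - 1) % 3


def sense_codons(start, stop):
    """Codons from the AUG up to, but not including, the stop codon."""
    span = stop - start
    if span % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    return span // 3


def check_coordinates():
    partial_last_sense_nt = VPX_START + PARTIAL_ORF_OVERLAP - 1
    return {
        "vpx_frame": reading_frame(VPX_START),
        "vp1_frame": reading_frame(VP1_START),
        "partial_orf_frame": reading_frame(partial_last_sense_nt - 2),
        "partial_orf_last_sense_nt": partial_last_sense_nt,
        "overlap_nt": VPX_STOP - VP1_START,
        "noncoding_3prime_nt": SEQUENCE_LENGTH - (VP1_STOP + 2),
        "vpx_sense_codons": sense_codons(VPX_START, VPX_STOP),
        "vp1_sense_codons": sense_codons(VP1_START, VP1_STOP),
    }


for name, value in check_coordinates().items():
    print(f"{name}: {value}")
```

Under this reading, the 11-nucleotide overlap, the 80-nucleotide 3' noncoding region, and the frame relationships (VpX in a different frame from Vp1, the partial ORF in the same frame as Vp1 and ending at nucleotide 144) all agree with the text. The sense-codon counts come out as 171 and 115, one more than the reported 170 and 114 residues; the text does not say whether the reported lengths exclude the initiating methionine, so that explanation remains a guess.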
The genomes of the viruses within these two families contain one or two long ORFs which encode polyproteins, and the capsid protein genes of these viruses are located at the 5' ends of their respective ORFs. The genome structure of LDV resembles that of equine arteritis virus (EAV), which is the only member of the arterivirus genus within the Togaviridae family (1). Although this virus is still classified as a togavirus, it differs significantly from the alpha and rubi togaviruses: (a) the EAV proteins are encoded by multiple ORFs; (b) the capsid protein gene maps to the 3' terminus of the coding region on the genome (4); and (c) the EAV proteins are translated from six subgenomic mRNAs (19). The presence of overlapping reading frames, each beginning with a start codon, suggests that LDV proteins are also translated from subgenomic mRNAs. However, due to the difficulty of detecting intracellular viral components in LDV-infected tissue culture extracts, no LDV subgenomic mRNAs have yet been observed. EAV and LDV are also morphologically similar in both virion size and nucleocapsid structure. Although the LDV structural proteins are similar in size to those of EAV (20), no serologic cross-reactivity has been found between these two viruses (21).

The LDV genome structure is also similar to that of the coronaviruses (18). However, LDV has an icosahedral nucleocapsid, whereas the coronaviruses have helical nucleocapsids. Nonetheless, the properties of the LDV genome suggest that LDV belongs to a recently proposed virus superfamily (4) consisting of the coronaviruses, the toroviruses, and the arteriviruses. Further characterization of LDV is necessary to facilitate the classification of this virus and to determine the degree of similarity it shares with EAV and the coronaviruses.

This work was supported by Public Health Service Grant NS 19013 from NINCDS. We thank Michelle Gonda and Janice Dispoto for technical assistance.
We also thank Kaye Speicher, Kevin Beam, and Clement Purcell in the Protein Microchemistry Facility at the Wistar Institute for protein sequence analysis and technical assistance.

New Aspects of Positive Strand RNA Viruses
Proc. Natl. Acad. Sci. USA 76
Techniques in Protein Chemistry
The Togaviruses: Biology, Structure, Replication