key: cord-0005216-n901y2d4
authors: ZHANG, Feiyun; TORIYAMA, Shigemitsu; TAKAHASHI, Mami
title: Complete Nucleotide Sequence of Ryegrass Mottle Virus: A New Species of the Genus Sobemovirus
date: 2001
journal: J
DOI: 10.1007/pl00012989
sha: 327dba50d84f585f0473a4f31ef7932906cba7bf
doc_id: 5216
cord_uid: n901y2d4

The genome of Ryegrass mottle virus (RGMoV) comprises 4210 nucleotides. The genomic RNA contains four open reading frames (ORFs). The largest, ORF 2, encodes a polyprotein of 947 amino acids (103.6 kDa) that contains a serine protease and an RNA-dependent RNA polymerase. The viral coat protein is encoded by ORF 4, located in the 3′-proximal region. ORFs 1 and 3 encode predicted proteins of 14.6 kDa and 19.8 kDa, respectively, of unknown function. The consensus signal for frameshifting, the heptanucleotide UUUAAAC followed immediately by a stem-loop structure, lies just upstream of the AUG codon of ORF 3. Analysis of the in vitro translation products of RGMoV RNA suggests that the 68 kDa protein may represent an ORF 2-ORF 3 fusion protein produced by frameshifting. The protease region of the polyprotein and the coat protein have low similarity to those of the sobemoviruses (approximately 25% amino acid identity), while the RNA-dependent RNA polymerase region has particularly strong similarity (54 to 60% identity over more than 350 amino acid residues). The sequence similarities of RGMoV to the sobemoviruses, together with the characteristic genome organization, indicate that RGMoV is a new species of the genus Sobemovirus.

Ryegrass mottle virus (RGMoV) was first isolated from stunted Italian ryegrass (Lolium multiflorum) and cocksfoot (Dactylis glomerata) showing mottling and necrotic symptoms on leaves20). The isometric particle, 28 nm in diameter, contains a single species of single-stranded RNA with a molecular weight of 1.5 × 10⁶. The physical properties and some of the biological properties of the virus are similar to those of Cocksfoot mottle virus (CfMV), which is prevalent in cocksfoot pastures in Japan19).
However, RGMoV is serologically distinct from CfMV, Cocksfoot mild mosaic virus, Cynosurus mottle virus and Phleum mottle virus, which occur in European countries20). Last year, an isometric virus isolated from Italian ryegrass in Germany was found to be serologically related to RGMoV; in agar gel double diffusion tests, a spur formed between RGMoV and the German isolate (Frank Rabenstein, Germany; personal communication). In spite of the serological differences between RGMoV and the sobemoviruses20), the general properties of RGMoV are similar to those of the grass viruses that belong to the sobemoviruses4). Among the sobemoviruses, the genome sequence has been determined for Southern bean mosaic virus (SBMV)12,24), CfMV8,15), Rice yellow mottle virus (RYMV)11) and Lucerne transient streak virus (LTSV; accession number U31286). The genomic RNA of sobemoviruses is a single-stranded molecule of approximately 4100 to 4500 nucleotides (nt). The 5′ terminus carries a genome-linked viral protein (VPg) and the 3′ end does not have a poly(A) tail. The genome contains four ORFs; the largest encodes a polyprotein of approximately 100 kDa that contains protease and RNA polymerase motifs. Only in CfMV is the polyprotein encoded by two smaller overlapping ORFs and expressed by -1 ribosomal frameshifting7). Recently, we determined the complete nucleotide sequence of the Japanese isolate of CfMV (CfMV/JP) (Zhang and Toriyama, unpublished data; accession number AB040447). The nucleotide sequence is 96.8% identical to that of the Norwegian isolate of CfMV (CfMV/NO)8) and 95.8% identical to that of the Russian isolate15). Its genome organization is identical to that of CfMV. So far, the genome sequences of RGMoV and the German isolate have not been determined, so their genus assignment is still unknown. In this paper, we report the complete nucleotide sequence of RGMoV and compare it to those of the sobemoviruses.

Ryegrass mottle virus (RGMoV) was propagated in barley plants (cv. Shunsei) and purified as described previously20).
A purified preparation of CfMV/JP was stored at -80°C19) and used for the in vitro translation experiment.

Terminal sequence of viral RNA
In a preliminary experiment, we found that RGMoV RNA does not have a poly(A) tail at the 3′ terminus. Thus, we determined the 3′-terminal sequence by two-dimensional mobility shift analysis as described previously17). In this experiment, the homomix (alkaline-digested yeast RNA mixture) was prepared using RNA from Torula utilis, a product of Fluka (Riedel-de Haen; Seelze, Germany). The 5′-terminal sequence of RGMoV RNA was identified by sequencing PCR clones amplified with the 5′ RACE abridged anchor primer system (Gibco BRL, Gaithersburg, USA).

Cloning and DNA sequence
cDNA synthesis was done as described previously21) using M-MLV reverse transcriptase (Gibco BRL), a random hexanucleotide primer and a synthetic oligonucleotide primer (P1), 5′-ACTAGTCGACACGAAAACCCC-3′; the sequence at its 3′ end was determined by the two-dimensional sequence analysis. The synthesized second-strand cDNA was blunt-ended with T4 DNA polymerase and ligated into SmaI-digested pUC18. Recombinant plasmids were transformed into competent Escherichia coli DH5α (TOYOBO, Osaka, Japan). The cDNA clones shown in Fig. 1 were made by primer extension and PCR amplification and used for sequencing of RGMoV RNA. Ambiguous nucleotide sequences were confirmed using independently prepared PCR clones (not shown in Fig. 1). Nucleotide sequences were determined using the Pharmacia DNA sequencing kit and an ALFred DNA sequencer (Pharmacia, Uppsala, Sweden). The sequence data were assembled and analyzed using the DNASIS (Macintosh) program (Hitachi Software Engineering Co., Yokohama, Japan). The GenBank/EMBL, NBRF and PIR databases were searched for nucleic acid and amino acid sequence identity.

Cell-free translation
In vitro translation using wheat germ extract (Promega, Madison, USA) was performed as described in the manufacturer's manual in a final volume of 50 µl in the presence of Redivue L-[35S]methionine (Amersham Pharmacia Biotech, Buckinghamshire, UK) for 1 hr at 25°C. Translation products were separated by SDS-PAGE (10% polyacrylamide) and detected using a Molecular Imager System (BioRad, Richmond, USA). A set of prestained SDS-PAGE standards (BioRad) was used as protein size markers. Purified RGMoV was electrophoresed on 10% polyacrylamide-SDS gels and electro-blotted onto a PVDF membrane (Immobilon-PSQ; Millipore, Middlesex, UK). The portion of the PVDF membrane corresponding to the coat protein was excised, and the N-terminal sequence of the coat protein was analyzed using a gas-phase protein sequencer (model 477A/120A, Applied Biosystems, Foster City, USA).

Notes to Table 1: a) References of sequence data: SBMV (M23021), LTSV (U31286), RYMV (L20893) and CfMV (Z48630). b) The percentage values indicate the identity over the stretch of amino acid residues indicated in parentheses. c) This similarity was found between the N-terminal region of the 56.3 K ORF of CfMV (refer to Fig. 2).

Nucleotide sequence and genome organization
The complete nucleotide sequence of RGMoV comprises 4210 nt with a base composition of 24.3% A, 22.2% U, 25.2% C and 28.3% G. The G + C content is 53.5%. The sequence contains four major ORFs flanked by 5′- and 3′-untranslated sequences of 99 and 198 nt, respectively. Database searches indicated that the genome sequence of RGMoV is significantly similar to those of the sobemoviruses, whose genome organization is summarized in Fig. 2. As shown in Fig. 2, the largest ORF, ORF 2, extends from nucleotides 643 to 3486. The predicted 103.6 kDa protein consists of 947 amino acids.
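The coordinates and composition figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, not part of the original analysis: the function and variable names are illustrative, and it assumes that the end coordinate of ORF 2 (nt 3486) includes the stop codon. It uses only values stated in the text.

```python
# Cross-check of figures quoted in the text (names are illustrative, not from the paper).

def orf_codons(start: int, end: int) -> int:
    """Number of codons in an ORF given 1-based, inclusive nucleotide coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length // 3

# ORF 2 spans nt 643-3486. Assumption: the final codon is the stop codon,
# so the protein has one residue fewer than the codon count.
codons = orf_codons(643, 3486)
amino_acids = codons - 1

# Base composition in percent, as reported for the 4210-nt genome.
composition = {"A": 24.3, "U": 22.2, "C": 25.2, "G": 28.3}
gc_content = round(composition["G"] + composition["C"], 1)
total = round(sum(composition.values()), 1)

# Average residue mass (in Da) implied by the predicted 103.6 kDa polyprotein.
mean_residue_mass_da = round(103.6 * 1000 / amino_acids, 1)

print(codons, amino_acids)
print(gc_content, total)
print(mean_residue_mass_da)
```

Under that assumption the span gives 948 codons, that is 947 amino acids plus a stop codon, and the four base percentages sum to 100 with G + C at 53.5%, both consistent with the reported values. The implied mean residue mass of about 109 Da is in the range typical of proteins.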
Database searches revealed a significant similarity to the polyproteins of the sobemoviruses: SBMV (accession number M23021)24), RYMV (L20893)11), CfMV (Z48630)8) and LTSV (U31286). The polyprotein of RGMoV contains serine and P3C protease motifs3) and an RNA-dependent RNA polymerase13) (Fig. 3). A conserved sequence, GxPxFDPxYG8), is found in the N-terminal region (amino acid residues 70 to 90) of the 103.6 kDa polyprotein. The protease motifs appear immediately downstream of the conserved sequence: the serine protease motif at amino acids 148 to 220 from the N-terminus and the P3C protease motif at amino acids 272 to 300 (Fig. 3). The serine protease motif is well conserved among RGMoV, the sobemoviruses and poliovirus3). In addition, the P3C protease motif, ...xGxS*/C*GxxxxxxxxGxxxxGxH*... (the catalytic amino acid residue is marked with asterisks), is present just downstream. However, instead of serine (S*) or cysteine (C*), alanine is found at this position in RGMoV. Thus, it is uncertain whether the P3C protease domain is catalytically active in RGMoV. The RNA-dependent RNA polymerase is encoded near the C-terminal region of the polyprotein. This region showed very strong similarity, 54 to 60% identity over a 350 amino acid stretch (Table 1). The RNA polymerase motifs13) are distributed between amino acids 680 and 810. This domain is particularly well conserved, with approximately 75% identity between RGMoV and the sobemoviruses. Database searches also showed that the RGMoV polymerase is highly similar to the RNA polymerases of Beet mild yellowing virus (S65829), Cucurbit aphid-borne yellowing virus (X76931), Potato leaf roll virus (X74789) and Barley yellow dwarf virus (L25299) of the family Luteoviridae. The similarity is approximately 50% identity over a 240 amino acid stretch, suggesting a close evolutionary relationship among RGMoV, the sobemoviruses and the luteoviruses (subgroup II)10).
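Identity values such as "54 to 60% over a 350 amino acid stretch" express the fraction of aligned positions that carry the same residue. A minimal sketch of that calculation follows, in Python. The two short aligned strings are made-up placeholders, not RGMoV or sobemovirus sequences, and excluding gap columns from the compared stretch is an assumption, since the text does not state how gaps were treated.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gap columns are excluded from the stretch
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / compared

# Hypothetical toy alignment (not real viral sequence data).
toy_a = "MSGDD-KTAYLG"
toy_b = "MTGDDSKSAY-G"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity")
```

In this toy case 8 of the 10 gap-free columns match. Other conventions, such as dividing by the length of the shorter sequence or by the full alignment length, give different percentages, so reported values are only comparable when the same convention is used.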

Van der Wilk et al.22) found that the VPg of SBMV is encoded by ORF 2, downstream of the protease domain and upstream of the RNA polymerase. We compared the amino acid sequence of the VPg region of SBMV ORF 2 with that of the corresponding region of RGMoV ORF 2. The search revealed no significant similarity. Sequence diversity in the VPg region22) is also seen among SBMV, CfMV and RYMV. However, the conserved motif, WAG followed by an E/D-rich sequence, is detected in this region, and putative E/S cleavage sites are present on both sides of it; proteolytic cleavage would result in a protein of 9 kDa. Possibly, the VPg of RGMoV is located between the protease and the RNA-dependent RNA polymerase domains, in the same order as reported for ORF 2 of SBMV22) (Fig. 3). RGMoV ORF 3 lies entirely within ORF 2. The predicted 19.8 kDa protein shows clear similarity (40% identity) to the products of the corresponding ORFs of SBMV and LTSV. However, it is unknown whether the 19.8 kDa protein is translated independently in vivo, because ORF 3 may be expressed as a fusion protein, as discussed below. ORF 4 encodes a coat protein of 198 amino acids (25.6 kDa). The sequence of the 16 N-terminal amino acids of the viral coat protein was identical to that deduced from the ORF 4 nucleotide sequence (data not shown). Sequence similarity searches indicated that the RGMoV coat protein has a weak but significant similarity, 24 to 27% identity, with those of SBMV, LTSV and RYMV, but only 15% identity with that of CfMV (Table 1). In the wheat germ extract system, RGMoV RNA directed the synthesis of two products of 103 kDa and 68 kDa; no other distinct product was detected. In contrast, the translation products synthesized in vitro from CfMV/JP RNA were four major proteins with sizes almost identical to those previously reported for CfMV/NO18) (Fig. 4). The translational activity of CfMV/JP RNA was low in our present system, as reported for other sobemoviruses16).
RGMoV RNA was a poorer messenger in our wheat germ extract system. The largest product of RGMoV RNA, 103 kDa, appears to be derived from the largest ORF, ORF 2, which encodes the polyprotein. No ORF in the RGMoV RNA sequence corresponds to the second largest product of 68 kDa. The putative replicase of CfMV is translated as part of a single polyprotein by -1 ribosomal frameshifting between two overlapping ORFs with coding capacities of 60.9 kDa and 56.3 kDa7,18). Translational frameshifts are known in coronavirus IBV2), in the polymerase genes of retroviruses5,6) and in plant viruses7). Jacks et al.5) proposed, as consensus signals for frameshifting, a heptanucleotide sequence (e.g., UUUAAAC) and a stem-loop structure immediately downstream. As in CfMV, SBMV and RYMV7), identical signals are found in RGMoV RNA just preceding the initiation codon of ORF 3 (Fig. 5). Tamm et al.18) proposed that the 70 kDa in vitro translation product of SBMV and RYMV RNAs may represent an ORF 2-ORF 3 transframe fusion protein. Thus, the 68 kDa translation product of RGMoV RNA is probably derived from -1 ribosomal frameshifting (Fig. 3), not from proteolytic cleavage of the polyprotein. In this experiment, we tried to detect the RGMoV coat protein among the in vitro translation products by immunoprecipitation. However, we could not detect any signal for the coat protein. The coat protein of SBMV is translated only from a smaller, subgenomic RNA, which is detected in virus-infected tissues as well as in virus particles14). As smaller RNAs were not detectable in our RGMoV RNA preparation, the amount of subgenomic RNA, if any, may have been insufficient for detection of in vitro translated coat protein.

We conclude that RGMoV is a member of the genus Sobemovirus based on sequence similarities.
The levels of nucleic acid (approximately 50% identity) and protein (Table 1) similarity are low enough to demarcate RGMoV from every other sobemovirus species, whereas the genome organization of RGMoV closely resembles that of the sobemoviruses. The biological and serological properties of RGMoV are distinct from those of other characterized grass viruses20). Thus, RGMoV is a unique species of the genus Sobemovirus4,23). The polyprotein gene organization of RGMoV is the same as that of SBMV, RYMV and LTSV, but different from that of CfMV, in which the polyprotein is produced as a single fusion protein by frameshifting between two ORFs7).

References
Expression of rice yellow mottle virus P1 protein in vitro and in vivo and its involvement in virus spread
An efficient ribosomal frame-shifting signal in the polymerase-encoding region of the coronavirus IBV
Sobemovirus genome appears to encode a serine protease related to cysteine proteases of picornaviruses
Genus Sobemovirus
Signals for ribosomal frameshifting in the Rous sarcoma virus gag-pol region
Characterization of ribosomal frameshift in HIV-1 gag-pol expression
The putative replicase of the cocksfoot mottle sobemovirus is translated as a part of the polyprotein by -1 ribosomal frameshift
Sequence and organization of barley yellow dwarf virus genomic RNA
Luteovirus gene expression
Genome characterization of rice yellow mottle virus RNA
Nucleotide sequence of the bean strain of southern bean mosaic virus
Identification of four conserved motifs among the RNA-dependent polymerases encoding elements
Messenger RNA for the coat protein of southern bean mosaic virus
Nucleotide sequence of RNA from the sobemovirus found in infected cocksfoot shows a luteovirus-like arrangement of the putative replicase and protease genes
Translation of southern bean mosaic virus RNA in wheat embryo and rabbit reticulocyte extracts
Complementarity between the 5′- and 3′-terminal sequences of rice stripe virus RNAs
Identification of genes encoding for the cocksfoot mottle virus proteins
Cocksfoot mottle virus in Japan
Ryegrass mottle virus, a new virus from Lolium multiflorum in Japan
Nucleotide sequence of RNA 1, the largest genomic segment of rice stripe virus, the prototype of the tenuivirus
The genome-linked protein (VPg) of southern bean mosaic virus is encoded by the ORF2
Guidelines to the demarcation of virus species
Sequence and organization of southern bean mosaic virus genomic RNA
Evolution of RNA viruses

The nucleotide sequence data reported in this paper have been submitted to DDBJ, EMBL and GenBank under the accession number AB040446.

National Institute of Agro-Environmental Sciences, Tsukuba 305-8604, Japan
Present address: Tokyo University of Agriculture and Technology, United Graduate School of Agriculture, Fuchu 183-8509, Japan
Present address: Tokyo University of Agriculture, Sakuragaoka 1, Setagaya-ku, Tokyo 156-8502, Japan

We wish to thank the late Professor Dr. D. Hosokawa, Tokyo University of Agriculture and Technology, for his encouragement and Dr. T. Teraoka for his help with the amino acid sequence analysis.