key: cord-0004894-vnz15x84
authors: Chen, H. H.; Stark, C. J.; Atreya, C. D.
title: The rubella virus nonstructural protease recognizes itself via an internal sequence present upstream of the cleavage site for trans-activity
date: 2006-03-27
journal: Arch Virol
DOI: 10.1007/s00705-006-0744-9
sha: 7b242f0ac81dd1f79d0c471df13f90426b0383bd
doc_id: 4894
cord_uid: vnz15x84
The substrate requirement for rubella virus protease trans-activity is unknown. Here, we analyzed the cleavability of RV P200-derived substrates varying in their N-terminal lengths (72–475 amino acids) from the cleavage site by the RV protease trans-activity. Only substrates with at least 309 amino acid residues N-terminal to the cleavage site were able to undergo cleavage. Further, rubella sequence was found to be necessary in the N-terminal region of the substrate, whereas a heterologous sequence C-terminal to the cleavage site was tolerated. These results demonstrated a requirement for residues located between amino acids 994–1102 of the RV P200 polyprotein, besides its cleavage site for RV protease trans-activity. This region overlaps with the starting site of the essential cis-protease activity of RV P200 polyprotein. This is a novel observation for a viral protease of the family Togaviridae.
A number of positive-stranded RNA viruses employ viral nonstructural polyprotein processing as a strategy for genome expression by encoding their own (viral) proteases [5, 7, 14]. Thus, in order to achieve the expression of multiple proteins from a single message that are essential for regulation of viral replication and biogenesis, proteolysis of the viral polyprotein precursor is an essential event in most of the positive-strand RNA viruses [14, 16]. This is in contrast to the mRNAs of their eukaryotic host cells, which mostly code for single proteins [16].
In alphaviruses and rubella virus, members of the family Togaviridae, two polyproteins are expressed: the nonstructural polyprotein, directly expressed from the genomic RNA, and another polyprotein, expressed from a sub-genomic mRNA synthesized during the viral infection [16]. Rubella virus is an enveloped virus of the genus Rubivirus in the family Togaviridae [2]. The virion has a 40S (9762-nucleotide-long) single-stranded, positive-sense, polyadenylated RNA genome, which serves as the mRNA following a series of events that lead to the uncoating of the viral particle upon entry into cells [1, 2, 4]. In an infected cell, RV mRNA is first translated into a P200 nonstructural polypeptide [2]. Newly synthesized P200 often undergoes cis-cleavage at NH2-SRGG/G-COOH (between G1301 and G1302) into two mature nonstructural proteins, P150 and P90, by a protease domain contained within the C-terminal region of P150 [3, 9, 11, 18]. The viral P200 nonstructural polyprotein contains four conserved functional domains that are involved in viral RNA synthesis and replication. Based on bioinformatic analysis, they are located sequentially from the N terminus to the C terminus as methyltransferase, protease, helicase, and RNA-dependent RNA polymerase domains [4, 6]. The first two motifs are located on the N and C termini of P150, and the latter two are on the N and C termini of P90 [5, 13]. The rubella virus P150 protease was identified as a papain-like cysteine protease [5]. It utilizes divalent cations and also demonstrates trans-cleavage activity on homologous substrates in vitro [9] as well as in vivo [17]. Further characterization of RV protease revealed zinc-binding activity that is integral to its protease activity, suggesting that it is a novel viral metalloprotease [9, 10]. Within the C-terminal half of P150, Cys-1152 and His-1273 are the catalytic sites for this protease [3, 11]. Domains of RV protease required for cis- and trans-activity have been mapped [8, 11].
These studies have thus far focused on defining the catalytic sites, active domains, and cis- and trans-activity requirements of RV protease [8-11, 17]. Experimental evidence suggests that P200 is actively involved in the synthesis of viral negative-strand RNA, and its cleavage into P150 and P90 has been suggested to switch the complex to initiate positive-strand RNA synthesis [8]. This leads to the logical hypothesis that, in the replication complex, those P200 molecules engaged in negative-strand RNA synthesis must remain as P200 (i.e., lose their cis-protease activity) as long as they are required to participate in negative-strand RNA synthesis. Subsequently, the P150-protease trans-activity cleaves these P200 molecules to signal the replication complex to switch to positive-strand RNA synthesis mode. However, two questions remain open: besides the presence of the cleavage site, what internal sequences of P200 and its intermediates are required for RV protease trans-cleavage activity, and how the cis-activity of P200 is regulated to maintain optimal negative-strand RNA synthesis. In this report, we analyzed the substrate features required for RV protease trans-activity by using a truncated version of RV P200 (amino acid residues 827-1548) that was previously shown to function as a protease but had lost its cis-activity due to a substitution, G1301S, in the cleavage site, together with a series of P200-derived substrates that lack protease activity due to an amino acid substitution, C1152S, in the catalytic dyad [8]. Our analysis identified a region in the substrate corresponding to residues 994-1102 of the P200 polyprotein that is required for the protease trans-activity, and this region overlaps with the starting site of the essential cis-protease activity of RV P200 polyprotein. Potential implications of the requirement of the cis-activity region of P200 substrates for the RV protease trans-activity are discussed.
Plasmids capable of expressing functional RV protease and P200-derived substrates in mammalian cells were constructed as follows. Appropriate cDNA segments representing various RV genomic regions were PCR-amplified and inserted into a mammalian expression plasmid, pcDNA4-His/Max-C, driven by a CMV promoter. This plasmid also contains an in-frame Xpress epitope tag for easy identification of the expressed protein by anti-Xpress monoclonal antibody in immunoblotting analysis (Invitrogen Inc, CA). Two independent DNA preparations for each plasmid were sequenced to ensure authenticity of the plasmids used in the study. RV trans-cleavage active protease-encoding cDNA was amplified by PCR with a pair of primers (Table 1), using a previously well-established infectious RV cDNA template, pBRM33-G1301S, which contains an active protease domain (catalytic site, C1152) but lacks the cis-cleavage capability due to a substitution at residue G1301 within the cleavage site (NH2-SRGG/G-COOH) to S1301 (NH2-SRGS/G-COOH) [8]. The amplified DNA product was cloned at the BamHI and EcoRI sites in the expression plasmid described above. In this construct, a green fluorescent protein (GFP) ORF was also inserted in-frame at the C-terminus of the protease sequence, so that the expressed protease would have a higher molecular weight to distinguish it from its substrate in immunoblot analysis. This expression plasmid was designated pRVP (Fig. 1A). To evaluate the trans-cleavage activity of RV protease on P200-derived substrates, a series of RV P200-related polypeptide expression plasmids that express proteins with different N-terminal lengths from the cleavage site (72, 199, 309, and 475 residues) were constructed. This was achieved by using a previously well-established RV cDNA template, pBRM33-C1152S.
The template contains RV cDNA with the cleavage site (NH2-SRGG/G-COOH) unmodified but lacks protease activity due to a substitution at the catalytic residue (C1152S) of the enzyme [8]. Desired DNA segments of the pBRM33-C1152S template with the same C-terminus (amino acid position 1548) but varying in length at the N-terminal end were amplified using appropriate sets of primers (Table 1) and cloned at the BamHI and EcoRI sites of the plasmid vector described above. The plasmids were designated pRVS-827-1548, pRVS-994-1548, pRVS-1102-1548, and pRVS-1228-1548, producing proteins with 475-, 309-, 199-, and 72-residue N-terminal lengths from the cleavage site, respectively (Fig. 1B). To test whether replacement of P200-related sequences C-terminal to the cleavage site by non-rubella sequences affects RV protease trans-activity, the plasmid pRVS-GFP was created (Fig. 1B); it contains PCR-amplified RV protease substrate sequences representing polypeptide residues 827-1306 (residues 1302-1306 represent the C-terminal side of the cleavage site), fused in-frame at the C-terminus to the GFP ORF. Similarly, to test whether a heterologous sequence on the N-terminal side of the cleavage site affects substrate recognition by the protease trans-activity, the GFP ORF was fused in-frame at the N-terminus of the rubella sequence in the plasmid pRVS-1102-1548 to create pRVS-GFP-1102-1548 (Fig. 1B). The template for the PCR in both cases was pBRM33-C1152S.
Fig. 1. (A) Shows pBRM33-G1301S-derived P200 that was used as a PCR template to clone the trans-active protease region. The P150 and P90 domains of P200 are indicated. The protease catalytic residue, Cys at amino acid 1152 (C1152), and a cleavage site substitution mutation (Gly to Ser) at the P150 C-terminal residue, 1301 (G1301S, represented by 'X' in the P150 box and by a black circle on the top), are as shown. The mutation eliminates the cis-activity of the protease, leaving its trans-activity intact [8]. Numbers (1-2115) on the side represent amino acid residues of P200. The trans-active protease domain spans from amino acid position 827 to 1301 [8]. Expression plasmid pRVP contains this protease domain in-frame, flanked by an N-terminal Xpress epitope tag (hatched box) and a C-terminal GFP ORF. (B) Shows pBRM33-C1152S-derived P200 used as a PCR template to produce a series of deletion substrates. The P150 and P90 domains on P200 are as shown. To serve as homologous substrates, the protease activity was eliminated by a substitution at the catalytic residue, Cys-1152 to Ser [8] (C1152S, represented by 'X' and by an open circle on the top), but with an intact cleavage site at the P150 C-terminal residue, 1301 (G1301), as shown. Expression plasmids containing RV P200-related cDNA, pRVS-827-1548, pRVS-994-1548, pRVS-1102-1548, and pRVS-1228-1548 (numbers denote the positions of the amino acid residues within the P200 polyprotein representing the N and C termini of the substrate), used in the study are as illustrated. These plasmids express proteins with varying N-terminal lengths from the cleavage site (475, 309, 199, and 72 aa, respectively, as shown in parentheses under each construct box). Cleavage in the substrate occurs at amino acid residue G1301. In pRVS-GFP, the RV cDNA insert was amplified from position 827-1306 (extending 5 residues downstream of the cleavage site) and a GFP ORF was fused at its C-terminus, whereas in pRVS-GFP-1102-1548, the GFP ORF was fused at the N-terminus of the 199-amino-acid-residue substrate. All plasmids shown here contain an in-frame N-terminal Xpress epitope tag (hatched box) for immunoblot identification of the proteins.
RV protease and substrate plasmids were expressed in human embryonic kidney (HEK) 293T cells (ATCC, VA) by transient transfection using Lipofectamine-2000 reagent as per the supplier's protocol (Invitrogen, CA).
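As a rough aid for reading the immunoblots, expected band sizes can be estimated from residue counts. The sketch below is not part of the study's methods. It assumes an average residue mass of about 110 Da (a common rule of thumb) and a placeholder mass of 3 kDa for the N-terminal tag; the names `estimated_kda`, `band_estimates`, `AVG_RESIDUE_KDA`, and `TAG_KDA` are hypothetical. Construct boundaries and N-terminal lengths are those given in the text.

```python
# Rough band-size estimates for the P200-derived substrates.
# Assumptions (not from the paper): ~110 Da average residue mass and a
# placeholder 3 kDa for the N-terminal Xpress tag leader.
AVG_RESIDUE_KDA = 0.110
TAG_KDA = 3.0

# name: (first residue, last residue, reported N-terminal length from the cleavage site)
CONSTRUCTS = {
    "pRVS-827-1548": (827, 1548, 475),
    "pRVS-994-1548": (994, 1548, 309),
    "pRVS-1102-1548": (1102, 1548, 199),
    "pRVS-1228-1548": (1228, 1548, 72),
}


def estimated_kda(n_residues: int, tag_kda: float = TAG_KDA) -> float:
    """Approximate mass in kDa of a tagged polypeptide with n_residues residues."""
    return n_residues * AVG_RESIDUE_KDA + tag_kda


def band_estimates(constructs=CONSTRUCTS):
    """Map each construct to (uncleaved substrate kDa, N-terminal product kDa)."""
    estimates = {}
    for name, (first, last, n_term_len) in constructs.items():
        full_length = last - first + 1  # residues present in the uncleaved substrate
        estimates[name] = (estimated_kda(full_length), estimated_kda(n_term_len))
    return estimates


if __name__ == "__main__":
    for name, (substrate, product) in band_estimates().items():
        print(f"{name}: substrate ~{substrate:.0f} kDa, N-terminal product ~{product:.0f} kDa")
```

Under these assumptions the estimates fall within a few kDa of the band sizes reported for Fig. 2; exact values depend on the real tag length and on sequence composition.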
Following a 24 h incubation of cells at 37 °C in a 5% CO2 incubator, cell lysates were prepared and subjected to polyacrylamide gel electrophoresis (either 10% or 4-20% gradient) with 0.1% sodium dodecyl sulphate, followed by immunoblot analysis as described [12]. Substrates as well as the protease all had an Xpress epitope tag at their N-terminus to detect the N-terminal side of the cleavage products.
We evaluated the protease trans-cleavage activity on a series of RV P200-derived polypeptides in which the protease catalytic site at amino acid position 1152 was mutated (C1152S) to serve as substrate (i.e., to lack protease activity but retain the cleavage site). HEK 293T cells were co-transfected with pRVP (protease, Fig. 1A) and each of the substrate expression plasmids, pRVS-827-1548, pRVS-994-1548, pRVS-1102-1548, and pRVS-1228-1548. Each substrate plasmid expressed RV P200-derived polypeptides of different N-terminal lengths (72-475 residues) from the cleavage site (Fig. 1B). Following transfection, cell lysates were prepared, and the proteins were subjected to immunoblot analysis. The substrates as well as the protease all had an Xpress epitope tag at their N-terminus. Our analysis of RV protease substrates revealed that homologous polypeptides of N-terminal 475- and 309-residue length from the cleavage site, expressed from pRVS-827-1548 and pRVS-994-1548 (Fig. 2, lanes 2 and 4; 80 and 67 kDa, respectively), were cleaved when co-expressed with RV protease (Fig. 2, lanes 1, 3, and 5; 110 kDa), yielding N-terminal cleavage products of the expected sizes, 55 and 37 kDa (including the additional Xpress epitope residues), respectively (Fig. 2, lanes 3 and 5). However, substrates of N-terminal 199- and 72-residue length from the cleavage site (plus the Xpress epitope residues), expressed from pRVS-1102-1548 and pRVS-1228-1548, respectively (Fig. 2, lanes 6 and 8; 55 and 37 kDa), did not undergo cleavage when co-expressed with the protease (Fig. 2, lanes 7 and 9), as evidenced by the lack of Xpress-tagged N-terminal cleavage products in the immunoblot analysis. (If the substrates were cleaved, the expected sizes of the N-terminal products would be 25 kDa and 14 kDa, respectively.) This suggests that amino acid residues between positions 200 and 308 in the substrate are required for the substrate to be targeted by the trans-activity of the protease.
[Fig. 2, caption fragment] ... shown are longer exposures of the same blot to film. Note that substrates generated from pRVS-827-1548 and pRVS-994-1548 were cleaved by the protease (">" identifies the cleaved products of 475 and 309 residues, respectively, in lanes 3 and 5), whereas the substrates of 199 and 72 residues (from the cleavage site) expressed from pRVS-1102-1548 and pRVS-1228-1548 were not cleaved by the protease (lanes 7 and 9). Also note that the protease was able to cleave the RVS-GFP substrate, with GFP on the C-terminal side of the cleavage site, to a lesser extent (<50%), as a significant portion of the substrate remained uncleaved.
We extended our analysis to identify any homologous sequence requirement of the substrate on the C-terminal side of the cleavage site. To address this, we utilized the plasmid pRVS-GFP, which expresses an RV protease homologous substrate representing amino acids 827-1306 (residues 1302-1306 represent the C-terminal side of the cleavage site), fused in-frame at its C-terminus to a GFP ORF (Fig. 1B). When this plasmid was co-transfected along with the RV protease plasmid (pRVP) into HEK 293T cells, we observed that the substrate (Fig. 2, lane 10; 80-kDa protein) was cleaved by the protease at the cleavage site, based on the expected molecular weight of the cleaved N-terminal product of 55 kDa (Fig.
2, lane 11). This suggests that the protease was able to cleave the RVS-GFP substrate carrying GFP on the C-terminal side of the cleavage site, but to a lesser extent (<50%), as a significant portion of the substrate was reproducibly observed to be uncleaved.
The fact that the 309-substrate was cleaved and the 199-substrate was not suggested that trans-protease processing either requires a minimum N-terminal length from the cleavage site in the substrate, or recognizes an internal amino acid sequence within the 309-substrate that is lacking in the 199-substrate. If an internal domain within the 309-substrate is indeed being recognized by the protease, then this sequence should be present in the N-terminal region of the 309-substrate. To distinguish between these possibilities, we co-expressed the protease plasmid, pRVP, with plasmid pRVS-GFP-1102-1548 (Fig. 1), which expresses a chimeric substrate in which the 199-substrate was extended at its N-terminus by GFP to compensate for the length (increased from 199 to 479 amino acids). Immunoblot analysis (Fig. 3) demonstrated that when the protease (lane 1) and a positive control 475-substrate (lane 2) were coexpressed, the substrate did undergo cleavage (lane 3). Although the N-terminal length was increased in the chimeric GFP-199-substrate (lane 4), it still failed to undergo cleavage when coexpressed with the protease (lane 5), suggesting that it is not the N-terminal length from the cleavage site that is required of the substrate, but rather an internal sequence present within the substrate that is recognized by the protease for trans-processing.
[Fig. 4, caption fragment] ... [8]. In the bottom panel, the protease substrate P200 is depicted with the location of the newly identified domain required for protease trans-activity (this report), which overlaps with the protease cis-activity starting site of P200.
The internal recognition domain identified in this report corresponds to amino acid positions 994-1102 of RV P200.
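The reasoning behind this assignment can be sketched in a few lines of Python. This is only an illustration of the deletion-mapping logic, not code from the study: the class `Substrate` and the functions `required_region` and `length_alone_explains` are hypothetical names, and the observations are transcribed from the results described for Figs. 2 and 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Substrate:
    name: str
    rv_start: int    # first RV P200 residue present in the construct
    n_term_len: int  # N-terminal length from the cleavage site, as reported in the text
    cleaved: bool    # whether trans-cleavage was observed on the immunoblot


# Observations transcribed from the text (Figs. 2 and 3).
OBSERVED = [
    Substrate("pRVS-827-1548", 827, 475, True),
    Substrate("pRVS-994-1548", 994, 309, True),
    Substrate("pRVS-1102-1548", 1102, 199, False),
    Substrate("pRVS-1228-1548", 1228, 72, False),
    Substrate("pRVS-GFP-1102-1548", 1102, 479, False),  # GFP-extended chimera
]


def required_region(substrates):
    """Bracket the required sequence between the shortest cleaved construct
    and the longest uncleaved construct (in terms of RV sequence)."""
    lo = max(s.rv_start for s in substrates if s.cleaved)
    hi = min(s.rv_start for s in substrates if not s.cleaved)
    if lo >= hi:
        raise ValueError("observations are not consistent with a single boundary")
    return lo, hi


def length_alone_explains(substrates):
    """Return False if an uncleaved construct is at least as long as the
    shortest cleaved one, which rules out a pure length requirement."""
    shortest_cleaved = min(s.n_term_len for s in substrates if s.cleaved)
    return not any(
        (not s.cleaved) and s.n_term_len >= shortest_cleaved for s in substrates
    )


if __name__ == "__main__":
    print(required_region(OBSERVED))
    print(length_alone_explains(OBSERVED))
```

With the observations listed, `required_region` brackets the requirement between P200 residues 994 and 1102, and `length_alone_explains` is false because the GFP-extended chimera is longer than the 309-substrate yet remains uncleaved. The sequence requirement therefore maps to residues 994-1102 of P200.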
This region overlaps with the N-terminal site of the essential cis-protease activity region on P200 described by Liang et al. [8]. The location of the substrate recognition region (this report) relative to the essential cis-activity domain is illustrated in Fig. 4. At this time we could not refine the recognition region further, as plasmids with deletions in this region of RV cDNA could not be rescued in bacteria.
In this report, we defined the minimal substrate (template) for proteolytic processing in trans by the rubella virus nonstructural protease using an epitope-tagged protein expression system. Results presented with RV protease (which has been functionally demonstrated to be similar to P150 activity [8]) demonstrated that the protease trans-activity recognizes a region N-terminal to the cleavage site in the P200-derived substrates. Thus, the trans-protease activity requires not only specific amino acids present at or proximal to the cleavage site, but also sequences upstream at a distance from the cleavage site. We also demonstrated that, on the C-terminal side of the cleavage site, heterologous residues unrelated to RV are somewhat tolerated by the protease, but we routinely observed (in three separate experiments) that the efficiency of the protease trans-activity on this type of substrate was less than 50% (illustrated in Fig. 2, lane 11). This suggests that the P90 domain residing on the C-terminal side of the cleavage site in P200 does influence the trans-activity of the protease. In another positive-stranded RNA virus, mouse hepatitis virus (MHV), one of the two virus-encoded proteases, PLP-1 (PCP), demonstrates a homologous substrate length requirement on the C-terminal side of the cleavage site, and the cleavage efficiency increases with increasing substrate and enzyme polypeptide length, although in this case the protease recognition sequence on the substrate was not identified [15].
The RV protease trans-activity-associated P200 internal sequence requirement, as identified here, is unique, and this is the first such report for viruses that belong to the family Togaviridae. In this study, we mapped the regions of the protease substrate required for trans-activity, which is reciprocal to the study of Liang et al., who mapped the essential cis- and trans-activity regions of the protease itself and also demonstrated that the X domain (proline-rich, conserved in all M-group PCPs) present in the RV protease is important for the protease trans-activity [8]. They further speculated that this proline-rich X domain could serve as a protein-protein interaction domain that enhances the opportunity to meet its trans-cleavage substrate [8]. However, in this report, we demonstrate by deletion analysis that the X domain in the substrate is dispensable for recognition by RV protease trans-activity and that the actual recognition region is located downstream of the X domain, overlapping with the N-terminal starting-site (a term coined by Liang et al. [8]) of the essential cis-activity domain of the substrate. Our results show that RV P150-associated protease trans-activity requires a specific region within the P200 that represents P150 itself (illustrated in Fig. 4). Taken together, our studies and those of Liang et al. [8] advance understanding of the molecular determinants that define the rubivirus protease trans-activity requirements that are essential for RV replication. In this report, we have shown that RV protease trans-activity demonstrates substrate specificity by requiring an internal sequence within the region that is N-terminal to the cleavage site.
Identification of a region in the P200-related sequence-containing substrate (we utilized polypeptides from amino acid positions 827 to 1548 of P200 or shorter) that is important for RV protease trans-activity suggests that this region may offer a specific fold or conformation to the P200 substrate to facilitate cleavage by the protease. Since most protease-substrate interactions involve transient binding of the protease to its cognate substrate, it is conceivable that RV protease trans-activity on homologous substrates also involves transient binding of protease to the substrate, and such binding could perhaps occur within the sequence that is required on the substrate for protease trans-activity. We attempted to perform coimmunoprecipitation experiments to establish the protease-substrate binding following cotransfection of cells with both plasmid constructs, but failed to obtain reproducible results, perhaps due to the transient nature of the interaction. However, if this binding truly occurs in the viral infection cycle, then, as RV protease is recognizing itself as substrate in the essential cis-activity region for the trans-activity (illustrated in Fig. 4), it is tempting to speculate that the trans-activity may be regulating the cis-activity of P200. This process could explain how P200 remains as P200 in the replication complex to initiate viral negative-strand RNA synthesis. As discussed in the introduction, previous experimental evidence suggested that P200 initiates the synthesis of viral negative-strand RNA, and its cleavage into P150 and P90 plays a critical role in switching the replication complex to the positive-strand RNA synthesis mode [8]. In this context, our results lead us to speculate that perhaps the P150-protease trans-activity requirement of the P200 sequence within its essential cis-activity region (illustrated in Fig.
4) could transiently stall the P200 cis-protease activity, provided that P150 binds to P200 in this region, to allow the negative-strand RNA synthesis to occur. This is a testable direction for future experiments that could further enhance our understanding of the rubivirus proteases.
1. Role of calreticulin in rubella virus replication
2. Rubella virus
3. Characterization of the rubella virus nonstructural protease domain and its cleavage site
4. Molecular biology of rubella virus
5. Putative papain-related thiol proteases of positive-strand RNA viruses. Identification of rubi- and aphthovirus proteases and delineation of a novel conserved domain associated with proteases of rubi-, alpha- and coronaviruses
6. Evolution and taxonomy of positive-strand RNA viruses: implications of comparative analysis of amino acid sequences
7. Viral proteinases
8. Rubella virus nonstructural protein protease domains involved in trans- and cis-cleavage activities
9. The rubella virus nonstructural protease requires divalent cations for activity and functions in trans
10. Characterization of the zinc binding activity of the rubella virus nonstructural protease
11. Expression of the rubella virus nonstructural protein ORF and demonstration of proteolytic processing
12. The N-terminal conserved domain of rubella virus capsid interacts with the C-terminal region of cellular p32 and overexpression of p32 enhances the viral infectivity
13. Conservation of the putative methyltransferase domain: a hallmark of the 'Sindbis-like' supergroup of positive-strand RNA viruses
14. Virus-encoded proteinases of the Togaviridae
15. Expression of murine coronavirus recombinant papain-like proteinase: efficient cleavage is dependent on the lengths of both the substrate and the proteinase polypeptides
16. Site-specific protease activity of the carboxyl-terminal domain of Semliki Forest virus replicase protein nsP2
17. Rescue of rubella virus replication-defective mutants using vaccinia virus recombinant expressing rubella virus nonstructural proteins
18. Proteolytic processing of rubella virus nonstructural proteins
We thank Dr. Shirley Gillam for providing M33 plasmid constructs pBRM33-C1152S and pBRM33-G1301S that were used in the study as PCR templates to generate the protease and substrate clones. We thank Drs. KVK Mohan and Ye, CBER for helpful critiques. HHC and CJS are supported by ORISE postdoctoral fellowships.