key: cord-0005810-n9u2em10 authors: Campbell, David A.; Thornton, Deborah A.; Boothroyd, John C. title: Apparent discontinuous transcription of Trypanosoma brucei variant surface antigen genes date: 1984 journal: Nature DOI: 10.1038/311350a0 sha: ece96b3f8f9b886c276a04b5fb12e67dc5fa6019 doc_id: 5810 cord_uid: n9u2em10
The repeated mini-exon sequence that encodes the first 35 base pairs of all variant surface antigen mRNAs of Trypanosoma brucei directs the synthesis of a discrete 137-nucleotide transcript. It thus seems that variant surface antigen mRNAs are transcribed discontinuously, and we present two alternative models for how this might occur.
The ability of African trypanosomes to establish chronic infections in their mammalian hosts depends on a highly developed system of antigenic variation, whereby individual members of the parasite population change the composition of their surface coat 1-3. This antigenic variation is controlled at the level of gene expression 4,5, each trypanosome possessing a large repertoire of genes (estimated at over 100; refs 6, 7) coding for the antigenically distinct variant surface glycoproteins (VSGs) which comprise the coat.
At any one time and on any one trypanosome, only one species of VSG can be detected 8, suggesting that mutual exclusion operates between the VSG genes. Activation of VSG genes occurs by a two-step process requiring first the duplication and transposition of a silent basic copy (BC) of the gene into one of a few telomeric expression sites by a process equivalent to gene conversion, then selection of the expression site over others for transcription 9-15. The extra copy of the gene thus produced is called the expression-linked copy (ELC). The so-called 'non-duplication-activated' VSG genes described by others probably represent genes which had already undergone the first step in activation (that is, gene conversion into an expression site) before the variants were isolated for study. Although the structure of expression sites and the sequences involved in the gene conversion event have been identified for some variants 16-19, little is known about the second step, particularly how activation occurs and how the mutual exclusion operates between the different expression sites.
Recently, a major clue to this problem has come from the finding that the 5' 35 nucleotides of VSG mRNAs are not contiguously encoded with the protein-coding portion of the gene 13, and that this spliced leader segment is identical for different VSG mRNAs regardless of which telomeric expression site they occupy 14. The genomic location of the 35-base pair (bp) mini-exon coding for the spliced leader has recently been reported to be a 1.35-kilobase pair (kb) segment tandemly repeated 100-200 times (as one or more clusters) at an unidentified locus in the genome but at least 30 kb upstream of the active VSG gene 20,21.
Two other important observations concerning the spliced leader are that it is detected on many RNA molecules of varying size in blot analyses using RNA from [...]
Several models have been proposed to explain how a contiguous mRNA is produced from such an unusual arrangement of exons 20-23 and how the activity of different expression sites is regulated. These have included chromosome end exchange with splicing of very long transcripts, and discontinuous transcription whereby the mini-exon and ELC are flanked by their own initiation and termination sites. To determine how VSG mRNAs are produced and processed, we have studied the 1.35-kb repeats and identified their transcriptional products.
To enable nucleotide sequence analysis of the mini-exon repeat, recombinant plasmids containing individual repeats were generated. This was initially done using a PvuII digest of the genomic DNA of T. brucei, as this enzyme has been reported to release the mini-exon-containing portion of the repeat as a discrete band at ~720 bp 21. Such a band was excised from an agarose gel, and the DNA was purified by electroelution 24 and ligated into the PvuII site of the plasmid vector pBR322 (ref. 25). A sample of this same excised material was radiolabelled by nick-translation 26 and used as a probe in colony hybridization 27, thereby identifying recombinants containing repetitive DNA. Four plasmids thus identified were further characterized and three were found to contain indistinguishable inserts with at least one RsaI site. As the mini-exon should include an RsaI site, nucleotide sequence analysis in the vicinity of this site was performed on one of the plasmids (pMEP.1). The strategy and sequence obtained are shown in Figs 1 and 2, respectively. This demonstrated unambiguously that pMEP.1 contains a portion of the mini-exon repeat.
To obtain a recombinant plasmid containing an insert representative of the complete 1.35-kb repeat, a ligation was performed using the 1.35-[...] PvuII insert of pMEP.1 and, after preliminary restriction mapping, one (pMES.1) was characterized further and found to contain an apparently complete 1.35-kb mini-exon repeat. Using the recombinant plasmid pMES.1 and the strategy shown in Fig. 1, the complete nucleotide sequence of one repeat unit containing the mini-exon was determined. This is presented in Fig. 2, with position +1 being the first nucleotide of the mini-exon. The region from -270 to +61 was also sequenced from the independently generated plasmid pMEP.1 (see Fig. 1). The results from the two plasmids were identical in this region.
Although Southern blot analysis suggested that the repeat unit was defined by a single Sau3A site (refs 20, 21 and D.A.T. and J.C.B., unpublished results), the possibility remained that two closely adjacent Sau3A sites were contained within each repeat and that pMES.1, therefore, lacked a small portion of the repeat. To test this, a further recombinant plasmid containing a mini-exon repeat was generated using an SphI digest of genomic DNA (as cloned in a bacteriophage λ recombinant containing several 1.35-kb repeats; D.A.C., unpublished results) and ligated to SphI-digested pAT153. SphI was known to cut the repeat once (from sequence analysis of pMES.1 and Southern blot analyses; D.A.C., unpublished results) and thus should give a fragment spanning the Sau3A site in question. Mini-exon-containing plasmids were identified as above and pMEH.1 was obtained, mapped and sequenced (see Fig. 1). From this plasmid, a contiguous sequence spanning the Sau3A site used to construct pMES.1 was obtained (data not shown), demonstrating that the sequence presented in Fig. 2 represents a complete mini-exon repeat.
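The reasoning that an enzyme cutting once per unit releases unit-length fragments from a tandem array, while an enzyme cutting twice releases two fragment classes, can be sketched in a few lines of Python. This is only an illustration: the function name and the cut positions used below are hypothetical, and only the 1.35-kb unit size and the ~720-bp PvuII band come from the text.

```python
def tandem_repeat_fragments(unit_length, cut_sites):
    """Fragment sizes (bp) from complete digestion of a long tandem array.

    unit_length: size of one repeat unit in bp.
    cut_sites:   cut positions within one unit (hypothetical values here).
    Fragments at the two ends of the array are ignored.
    """
    sites = sorted(set(s % unit_length for s in cut_sites))
    if not sites:
        return []  # the enzyme does not cut within the repeat
    fragments = []
    for i, site in enumerate(sites):
        next_site = sites[(i + 1) % len(sites)]
        # Distance to the next cut, wrapping into the following unit.
        size = (next_site - site) % unit_length
        fragments.append(size or unit_length)  # one site gives a full unit
    return sorted(fragments)


# A single-cutter (as reported for Sau3A and SphI) yields unit-length fragments.
print(tandem_repeat_fragments(1350, [100]))
# Two hypothetical sites 720 bp apart yield a 720-bp and a 630-bp fragment.
print(tandem_repeat_fragments(1350, [100, 820]))
```

Comparing such predicted sizes with the bands seen on a Southern blot is one simple way to check whether a cloned unit is representative of the genomic repeats.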
To ensure that this sequence was not only complete but also representative, we compared its restriction map with the fragment sizes observed on Southern blots of genomic DNA cut with several enzymes, using the PvuII insert of pMEP.1 as a radiolabelled probe (data not shown). The predominant bands obtained in each digest agreed in all cases with the fragment sizes predicted from the sequence; this demonstrated that the mini-exon repeats are highly conserved and that pMES.1 contains a typical member.
A testable prediction of the model proposing discontinuous transcription is that there should be detectable levels of a small RNA (that is, < 1.35 kb) derived from each mini-exon repeat. One of the most sensitive methods for detecting and characterizing particular transcripts is S1-protection 29, whereby a radiolabelled DNA probe from the region of interest is denatured and renatured in the presence of RNA; the resulting material is then digested with the single-strand-specific S1 nuclease and the length of any protected fragments measured by gel electrophoresis and autoradiography. The DNA substrate used here was from pMES.1, extending from the [3'-32P]-labelled PvuII site at position +58, downstream through the Sau3A site (position +183), to the EcoRI site in the vector (see Fig. 1). This fragment was mixed with total trypanosome bloodstream-form RNA (or sucrose-gradient fractions thereof), precipitated, redissolved, denatured and renatured in conditions favouring RNA/DNA hybridization 30. This mixture was then cooled, diluted and digested with S1 nuclease, before electrophoresis on a 7 M urea polyacrylamide gel. The resulting autoradiogram (Fig. 3a) shows a single major end point of protection by total RNA which corresponds to nucleotide +137 in the sequence presented in Fig. 2.
Minor bands immediately flanking this protected region may be indicative of slight heterogeneity at the 3' end of the RNA or nonspecific digestion by S1 nuclease at the end of the RNA/DNA duplex. This result could be interpreted as protection by a short RNA of 137 nucleotides derived from the mini-exon repeats and/or by a longer transcript with a discontinuity at this position in the RNA/DNA duplex. To discriminate between these two alternatives, two procedures were used.
The first examined the S1-protection by different size fractions of RNA produced by sucrose-gradient centrifugation (Fig. 3b). Maximal protection was found in fractions 19-21, which possessed RNA with a peak size slightly larger than 4S RNA (that is, ~100-200 nucleotides), with diminishing but detectable protection in the remaining fractions (Fig. 3a). This strongly suggested that the protection observed with total RNA was due, at least in part, to the presence of a small RNA. The protection observed with other fractions, particularly those containing larger RNA, may be due to contamination with the small RNA. Alternatively, it may be a result of protection by hybrid transcripts comprising the small RNA at their 5' ends linked to RNA from, for example, the ELC region, by the mechanisms discussed below.
The second method to detect and size transcripts from the mini-exon repeat used Northern blot analysis 31. For this procedure, total trypanosome RNA was resolved by polyacrylamide gel electrophoresis (PAGE) in denaturing conditions, transferred to GeneScreen membrane 32 and hybridized with a [5'-32P]-labelled, synthetic oligonucleotide complementary to positions +117 to +133. This 17-mer was chosen because it excluded the mini-exon itself, which was known to give intense hybridization throughout RNA blots when used as a probe 21, and because the S1-protection experiments indicated that it should hybridize to mini-exon-derived transcripts.
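The coordinate arithmetic behind the probe design and the S1-protection readout can be made explicit with a short Python sketch. It is an illustration only: the helper names and the toy sequence in the test are hypothetical, and the medRNA sequence itself (Fig. 2) is not reproduced here.

```python
def span_length(start, end):
    """Nucleotides from position start to position end, inclusive.

    Both positions are assumed to lie downstream of +1 (positive numbering).
    """
    return end - start + 1


COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_probe(sense, start, end):
    """Probe (written 5'->3') complementary to sense positions start..end.

    `sense` is a DNA-alphabet string whose first character is position +1.
    """
    target = sense[start - 1:end]
    return target.translate(COMPLEMENT)[::-1]


# An oligonucleotide covering +117 to +133 is a 17-mer, as stated in the text.
print(span_length(117, 133))
# Length of a fragment labelled at the PvuII site and protected up to +137,
# computed for both label positions quoted in the article (+58 and +59).
print(span_length(58, 137), span_length(59, 137))
```

Applying `antisense_probe` to the sense-strand sequence of Fig. 2 with `start=117` and `end=133` would give the sequence of a probe of the kind described, under the assumption that the numbering is continuous from +1.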
Figure 4a shows the ethidium bromide stain of a polyacrylamide/7 M urea gel in which the major small RNA species is readily apparent. Figure 4b presents the autoradiogram produced after hybridizing a GeneScreen transfer of a duplicate half of this gel with the 17-mer; only one band is detected in the track containing trypanosome RNA and no signal is observed in the track containing RNA from the unrelated parasitic protozoan, Toxoplasma gondii. The size of the band is ~140 nucleotides based on published values for the small ribosomal RNAs of trypanosomes 33, thus confirming that the mini-exon repeats are transcribed to yield a short, discrete RNA which we have termed mini-exon-derived RNA (medRNA). With regard to its precise size, this result is, within experimental error, consistent with medRNA being 137 nucleotides long. We cannot, however, exclude the possibility that medRNA is as many as 4-5 nucleotides longer and that S1 nuclease nonspecifically digested the 3' end of the RNA/DNA hybrid. Note, however, that the nucleotides at positions 138 and 140 are cytosines which, as CG pairs, would be less susceptible to such action. A definite answer awaits 3'-sequence analysis of medRNA.
Methods: a, A [3'-32P]-end-labelled PvuII digest of pMES.1 was re-cut with EcoRI to generate a uniquely end-labelled fragment corresponding to positions +59 to +183 plus 375 bp of the vector (see Fig. 1). S1 protection was done essentially as described elsewhere 30. Briefly, the probe was mixed with different fractions of bloodstream-form RNA, precipitated, redissolved and denatured in 80% formamide at 80 °C for 10 min. Renaturation of RNA/DNA hybrids was promoted by incubating at 45 °C for 3 h in the same buffer, before diluting with 10 vol of aqueous buffer, cooling to 30 °C and adding 35 units of S1 nuclease.
Digestion was for 30 min, after which the reactions were phenol-extracted, isopropanol-precipitated, resuspended in formamide loading buffer 43 and resolved by PAGE in a 6% polyacrylamide gel with 7 M urea 44. b, A 15-30% linear sucrose gradient was prepared in 0.1 M NaCl, 1 mM EDTA, 20 mM Tris-HCl (pH 7.5), 0.1% SDS. About 1 mg of total trypanosome RNA was overlaid with the same buffer and the samples spun in an SW-41 rotor (Beckman) at 40,000 r.p.m. at 20 °C for 9.5 h. Fractions were collected from the bottom of the tube and the A260 measured. For the S1-protection experiments, 5% of each sample was removed and pooled in groups of three (numbered by the middle fraction).
These results also suggest that the 5'-terminus of medRNA is at about position +1 in the mini-exon repeat. To confirm this, we used the synthetic oligonucleotide from the RNA blot analyses as a primer for reverse transcriptase extensions of medRNA. The result (Fig. 5) shows a major band co-migrating with a heterologous DNA marker of 135 nucleotides. Using a different heterologous marker (a HindIII digest of bacteriophage λ DNA), the size of this band was estimated as 133 nucleotides. This slight discrepancy is probably due to differences in base composition. Allowing for this variation, the extension has an approximate length of 134 ± 1 nucleotides. Given that the 5'-end of the primer represents position +133 of the mini-exon repeat, this result confirms that the 5'-end of medRNA corresponds to position +1, the same as that found for the spliced leader of mature VSG mRNAs 13,14. The minor band in Fig. 5 at 138 ± 1 nucleotides may be an artefact of reverse transcriptase (for example, loop-back synthesis) or it may be indicative of a minor RNA 4 nucleotides longer than the major species. Such minor heterogeneity might be due to variation in the mini-exon repeat sequences and/or in the site of transcriptional initiation.
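The inference from primer-extension length to the 5'-end of medRNA is simple arithmetic, sketched below in Python. The function names are hypothetical; the numbers (+133 for the primer 5'-end, 134 ± 1 and 138 ± 1 for the observed bands) are those quoted in the text.

```python
def extension_product_length(primer_5prime_pos, rna_5prime_pos=1):
    """Expected length of a primer-extension product.

    The product runs from the primer's 5'-end (positive numbering on the RNA)
    back to the RNA's 5'-end, both positions included.
    """
    return primer_5prime_pos - rna_5prime_pos + 1


def within_error(observed, tolerance, expected):
    """True if the expected length falls inside observed +/- tolerance."""
    return abs(observed - expected) <= tolerance


expected = extension_product_length(133)  # primer 5'-end at +133, RNA start at +1
print(expected)
# The major band (134 +/- 1 nucleotides) is compatible with a start at +1.
print(within_error(134, 1, expected))
# The minor band (138 +/- 1 nucleotides) is not, which fits the suggestion of
# a minor species about 4 nucleotides longer or a reverse transcriptase artefact.
print(within_error(138, 1, expected))
```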
No other major bands were observed in other loadings of this sample run for longer or shorter periods (data not shown).
We have cloned and sequenced a complete copy of the 1.35-kb mini-exon repeat of T. brucei and identified a short RNA of 137 nucleotides as its transcriptional product. The data indicate that this repeat is a typical member of a highly conserved family; this is demonstrated further by the virtually complete agreement of the sequence presented in Fig. 2 with the partial sequence for the region from -130 to +61 recently published elsewhere 21. The four differences (three single base changes, one insertion) are all localized to between -115 and -125. As the sequences in both cases were determined from two plasmids, the differences may be indicative of minor, but real, variation in the repeats.
In addition to the region coding for medRNA, each 1.35-kb mini-exon repeat contains about 1,200 nucleotides whose function is not clear; part of this region is presumably involved in the regulation of medRNA expression, but the remainder could be involved in some other aspect of trypanosome gene expression. Of particular note is the repetitive nature of the sequence around the Sau3A site, consisting of 1-, 2-, 4- and 6-nucleotide repeats (T, AT/AC, ATTT/GTTT and ACACTC, respectively). It will be interesting to determine whether these regions of the repeat are as exactly conserved as that coding for the medRNA.
The predicted sequence of medRNA can be drawn with several alternative regions of intramolecular base-pairing (data not shown). This suggests that medRNA might adopt a stable conformation with substantial secondary structure, but, as we have no data on RNase sensitivity, we cannot present a reliable structure.
The detection of medRNA raises questions about the signals directing its transcription. As noted by De Lange et al. 21, [...] Further studies on relative drug sensitivities and 5'-cap structures are needed to confirm medRNA as an RNA polymerase II transcript. The termination site cannot be compared with the general case as too few sequences of eukaryotic transcriptional terminators have been reported to enable consensus sequences to be made.
ARTICLES, NATURE VOL. 311, 27 SEPTEMBER 1984
Recently, the complete sequence of a small nuclear RNA (snR3) and its gene have been reported in yeast [...],35. This transcription unit shows remarkable similarity to the mini-exon repeat at four critical positions (Fig. 6), the remainder of the two sequences sharing no significant homologies. First, the snR3 gene also has the octanucleotide TATTTTTG centred at position -27 relative to the start site for transcription. Second, the initiation site for transcription lies within a heptanucleotide sequence which is very similar in the two genes (six out of seven bases identical). Third, both sequences, although of different overall length (137 versus 194 nucleotides for medRNA and snR3, respectively), end near the beginning of an extremely T-rich region. Finally, the region containing the putative donor splice site of the medRNA also shares a 5-out-of-6-residue homology with the yeast snR3, the potential significance of which is discussed below.
The finding of medRNA fulfils one of the predictions of a model for VSG gene expression involving discontinuous transcription. However, it is not definitive because of the multiplicity of the mini-exon repeats: it could be that, although the majority of the repeats are indeed transcribed to give the short medRNA, one repeat is used in the initiation of a very long transcript including the VSG coding exon.
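Comparisons of the kind summarized for Fig. 6 (a shared octanucleotide, and short aligned stretches scored as "six out of seven" identical) can be sketched with two small Python helpers. The sequences used below are hypothetical placeholders chosen only to exercise the code; apart from the octanucleotide TATTTTTG quoted in the text, they are not the mini-exon or snR3 sequences.

```python
def count_identities(a, b):
    """Number of identical positions between two aligned, equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b))


def find_motif(sequence, motif):
    """0-based start indices of every exact occurrence of motif in sequence."""
    width = len(motif)
    return [i for i in range(len(sequence) - width + 1)
            if sequence[i:i + width] == motif]


# Hypothetical heptanucleotides differing at one position: 6 of 7 identical.
print(count_identities("AACTAAC", "AACTTAC"))
# Locating the octanucleotide TATTTTTG in a hypothetical upstream region.
print(find_motif("GGTATTTTTGCC", "TATTTTTG"))
```

Exact matching is a deliberate simplification; a fuller comparison would use a proper alignment that allows gaps.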
This is a difficult possibility to exclude because pulse-chase experiments, which have been traditionally used to demonstrate precursor/product relationships, suffer from the same shortcoming: it is impossible to demonstrate that a given mini-exon repeat is used for one purpose compared with another, indistinguishable repeat. Nevertheless, together with published observations regarding the abundance of transcripts containing mini-exon-derived sequences and the inability to detect linkage between a mini-exon sequence and the VSG coding exon, the identification of a short, discrete medRNA strongly supports a model of discontinuous transcription. In this model, the mini-exon-derived leader segment of VSG mRNAs is transcribed as a discrete product of the mini-exon repeats and, by one of several possible mechanisms, is ultimately found attached to the 5'-end of the VSG precursor RNA. The chimaeric intervening sequence is then removed by splicing, bringing together the 35-nucleotide leader with the coding portion to give the final mRNA of ~1.7 kb.
There are two general schemes by which this can most easily be imagined to occur. The first model, presented in Fig. 7a, predicts that RNA polymerase transcribes the medRNA and then dissociates from the mini-exon repeat, possibly as a medRNA/polymerase complex; transcription then reinitiates just upstream of the ELC using medRNA as a primer. The second model (Fig. 7b) proposes that medRNA and the coding-region RNA are independently produced as discrete transcripts and then post-transcriptionally ligated to each other. In both cases, splicing ultimately removes the intervening sequence to generate the mature mRNA.
The critical difference between the two alternatives is that post-transcriptional ligation requires a functional promoter (in the sense of a site for de novo initiation of transcription) upstream of the ELC, whereas the reinitiation model proposes a site in this same position where RNA polymerase could only act in the presence of a primer (that is, medRNA). We are now attempting to test for this distinction in vitro.
We have previously reported the sequence upstream of an expressed copy of a VSG gene and compared it with a silent copy of that gene 19. We found that the upstream region of the expressed copy consisted of multiple tandem repeats of ~76 bp flanking an unusual sequence of (TAA)90. Although three 76-bp repeats are found upstream of the silent copy, (TAA)90 is unique to the expressed copy of this gene and is, therefore, a candidate for the (re)initiation site upstream of expressed VSG genes. If this sequence is unique and mobile, it could be the basis for the mutual exclusion observed between the different expression sites. Equally, another, as yet undetected, mobile control element might have this role.
A critical question raised by the results reported here is whether discontinuous transcription involving a small RNA is a unique property of T. brucei. Is it, for example, a special adaptation which has co-evolved with antigenic variation? This is argued against by the finding that the same or similar mini-exon sequences are found in many RNAs (of unknown coding function), not only from bloodstream forms of the parasite, but also from the procyclic insect forms where VSG genes are not expressed. In addition, the mini-exon sequence (or a conserved homologue) has been shown to be present in the genomes of several related species and genera which lack antigenic variation of the type observed for T. brucei 22. Could discontinuous transcription be operating in these other cases?
This question cannot, as yet, be answered, but some precedent does exist in systems which are otherwise totally unrelated. Transcription of the influenza virus is known to require a 5'-capped primer (10-15 nucleotides long) derived from cleavage of host mRNAs 36,37. Coronavirus transcripts, on the other hand, are known to have a common 5'-leader sequence which is seemingly not joined to the coding portion of the RNA by conventional splicing, again strongly suggesting discontinuous transcription 38,39. The fact that, in both these cases, transcription occurs by an RNA-dependent viral transcriptase may be an important distinction, but they do provide a mechanistic precedent for discontinuity in primary transcripts.
Such activity could further provide a molecular basis for the long-standing observation that heterogeneous nuclear RNAs often possess sequences at their 5'-ends which are derived from middle repetitive DNA 40-42 and that such sequences might be important in the control of developmentally regulated gene expression.
[...] Stuart and their colleagues for exchange of information before publication; [...] Fedor for producing the synthetic oligonucleotide; and Ms M. A. Siri for secretarial help. This work was supported in part by a grant from the [...]