key: cord- -mywhe w
authors: clausen, thomas mandel; sandoval, daniel r.; spliid, charlotte b.; pihl, jessica; perrett, hailee r.; painter, chelsea d.; narayanan, anoop; majowicz, sydney a.; kwong, elizabeth m.; mcvicar, rachael n.; thacker, bryan e.; glass, charles a.; yang, zhang; torres, jonathan l.; golden, gregory j.; bartels, phillip l.; porell, ryan; garretson, aaron f.; laubach, logan; feldman, jared; yin, xin; pu, yuan; hauser, blake; caradonna, timothy m.; kellman, benjamin p.; martino, cameron; gordts, philip l.s.m.; chanda, sumit k.; schmidt, aaron g.; godula, kamil; leibel, sandra l.; jose, joyce; corbett, kevin d.; ward, andrew b.; carlin, aaron f.; esko, jeffrey d.
title: sars-cov- infection depends on cellular heparan sulfate and ace
date: - -
journal: cell
doi: . /j.cell. . .
sha:
doc_id:
cord_uid: mywhe w

we show that sars-cov- spike protein interacts with both cellular heparan sulfate and angiotensin converting enzyme (ace ) through its receptor binding domain (rbd). docking studies suggest a heparin/heparan sulfate-binding site adjacent to the ace binding site. both ace and heparin can bind independently to spike protein in vitro and a ternary complex can be generated using heparin as a scaffold. electron micrographs of spike protein suggest that heparin enhances the open conformation of the rbd that binds ace . on cells, spike protein binding depends on both heparan sulfate and ace . unfractionated heparin, non-anticoagulant heparin, heparin lyases, and lung heparan sulfate potently block spike protein binding and/or infection by pseudotyped virus and authentic sars-cov- virus. we suggest a model in which viral attachment and infection involve heparan sulfate-dependent enhancement of binding to ace . manipulation of heparan sulfate or inhibition of viral adhesion by exogenous heparin presents new therapeutic opportunities.
the covid- pandemic, caused by the novel respiratory coronavirus (sars-cov- ), has swept across the world, resulting in serious clinical morbidities and mortality, as well as widespread disruption to all aspects of society. as of september , , the virus has spread to countries, causing more than . million confirmed infections and at least , deaths (world health organization). current isolation/social distancing strategies seek to flatten the infection curve to avoid overwhelming hospitals and to give the medical establishment and pharmaceutical companies time to develop and test antiviral drugs and vaccines. currently, only one antiviral agent, remdesivir, has been approved for adult covid- patients (beigel et al., ) and vaccines may be - months away. understanding the mechanism of sars-cov- infection could reveal other targets to interfere with viral infection and spread.

the glycocalyx is a complex mixture of glycans and glycoconjugates surrounding all cells. given its location, viruses and other infectious organisms must pass through the glycocalyx to engage receptors thought to mediate viral entry into host cells. many viral pathogens, including influenza virus, herpes simplex virus, human immunodeficiency virus, and different coronaviruses (sars-cov- and mers-cov), have evolved to utilize glycans as attachment factors, which facilitate the initial interaction with host cells (cagno et al., ; koehler et al., ; stencel-baerenwald et al., ). several viruses interact with sialic acids, which are located on the ends of glycans found in glycolipids and glycoproteins. other viruses interact with heparan sulfate (hs) (milewska et al., ), a highly negatively charged linear polysaccharide that is attached to a small set of membrane or extracellular matrix proteoglycans (lindahl et al., ). in general, glycan-binding domains on membrane proteins of the virion envelope mediate initial attachment of virions to glycan receptors.
attachment in this way can lead to the engagement of protein receptors on the host plasma membrane that facilitate membrane fusion or engulfment and internalization of the virion.

like other macromolecules, hs can be divided into subunits, which are operationally defined as disaccharides based on the ability of bacterial enzymes or nitrous acid to cleave the chain into disaccharide units (esko and selleck, ). the basic disaccharide subunit consists of β - linked d-glucuronic acid (glca) and α - linked n-acetyl-d-glucosamine (glcnac), which undergo various modifications by sulfation and epimerization as the copolymer assembles on a limited number of membrane and extracellular matrix proteins (only heparan sulfate proteoglycans are known) (lindahl et al., ). the variable length of the modified domains and their pattern of sulfation create unique motifs with which hs-binding proteins interact (xu and esko, ). different tissues and cell types vary in the structure of hs, and hs structure can vary between individuals and with age (de agostini et al., ; feyzi et al., ; han et al., ; ledin et al., ; vongchan et al., ; warda et al., ; wei et al., ). these differences in hs composition may contribute to the tissue tropism and/or host susceptibility to infection by viruses and other pathogens.

in this report, we show that the ectodomain of the sars-cov- spike (s) protein interacts with cell surface hs through the receptor binding domain (rbd) in the s subunit. binding of heparin to sars-cov- s protein shifts the structure to favor the rbd open conformation that binds ace . spike binding to cells requires engagement of both cellular hs and ace , suggesting that hs acts as a coreceptor priming the spike for ace interaction. therapeutic unfractionated heparin (ufh), non-anticoagulant heparin and hs derived from human lung and other tissues block binding.
ufh and heparin lyases also block infection of cells by s protein pseudotyped virus and authentic sars-cov- . these findings identify cellular hs as a necessary co-factor for sars-cov- infection and emphasize the potential for targeting s protein-hs interactions to attenuate virus infection.

the trimeric s proteins from sars-cov- and sars-cov- viruses are thought to engage human ace with one or more rbd in an "open" active conformation (fig. a) (kirchdoerfer et al., ; walls et al., ; wrapp et al., ). adjacent to the ace binding site and exposed in the rbd lies a group of positively charged amino acid residues that represents a potential site that could interact with heparin or heparan sulfate (fig. a and suppl. fig. s ). we calculated an electrostatic potential map of the rbd (from pdb id m (yan et al., )), which revealed an extended electropositive surface with dimensions and turns/loops consistent with a heparin-binding site (fig. b) (xu and esko, ). docking studies using a tetrasaccharide (dp ) fragment derived from heparin demonstrated preferred interactions with this electropositive surface, which based on its dimensions could accommodate a chain of up to monosaccharides (fig. b and c). evaluation of heparin-protein contacts and energy contributions using the molecular operating environment (moe) software suggested strong interactions with the positively charged amino acids r , r , k , r and possibly r (figs. a, d, and e). other amino acids, notably f , s , n , g , y , and y , could coordinate the oligosaccharide through hydrogen bonds and hydrophobic interactions. notably, the putative binding surface for oligosaccharides is adjacent to, but separate from, the ace binding site, suggesting that a single rbd could simultaneously bind both cell surface hs and the ace protein receptor. the putative hs binding site is partially obstructed in the "closed" inactive rbd conformation, while fully exposed in the open state (suppl. fig. s ).
the amino acid sequence of the rbd of sars-cov- s is % identical to that of sars-cov- s (fig. f), and these domains are highly similar in structure with an overall cα r.m.s.d. of . Å (fig. g). however, an electrostatic potential map of the sars-cov- s rbd does not show an electropositive surface like that observed in sars-cov- (fig. h). most of the positively charged residues comprising this surface are conserved between the two proteins, with the exception of sars-cov- k which is a threonine in sars-cov- (fig. f). additionally, the other amino acid residues predicted to coordinate with the oligosaccharide are conserved with the exception of asn in sars-cov- , which is a negatively charged glutamate residue in sars-cov- . sars-cov- has been shown to interact with cellular hs in addition to its entry receptors ace and transmembrane protease, serine (tmprss ) (lang et al., ). our analysis suggests that the putative heparin-binding site in sars-cov- s may mediate an enhanced interaction with heparin or hs compared to sars-cov- , and that this change evolved through as few as two amino acid substitutions, thr lys and glu asn .

to test experimentally if the sars-cov- s protein interacts with heparin/hs, recombinant ectodomain and rbd proteins were prepared and characterized. initial studies encountered difficulty in stabilizing the s ectodomain protein, a problem that was resolved by raising the concentration of nacl to . m in hepes buffer. under these conditions, the protein could be stored at room temperature, °c or at - °c for at least two weeks. sds-page showed that each protein was ~ % pure. recombinant s ectodomain and rbd proteins were applied to a column of heparin-sepharose. elution with a gradient of sodium chloride showed that the rbd eluted at ~ . m nacl, with a shoulder that eluted with higher salt (fig. a).
recombinant s ectodomain also bound to heparin-sepharose, but it eluted across a broader range of nacl concentrations. the elution profiles suggest that the preparations contained a population of molecules that bind to heparin, but that some heterogeneity in affinity for heparin occurs, which may reflect differences in glycosylation, oligomerization or the number of binding sites in the open conformation. the rbd protein from sars-cov- also bound in a saturable manner to heparin-bsa immobilized on a plate (fig. b). the rbd from sars-cov- showed significantly reduced binding to heparin-bsa and a higher k d value ( nm [ % c.i. - nm] for sars-cov- rbd vs. nm [ % c.i. - nm] for sars-cov- rbd), in accordance with the difference in electropositive potential in the proposed hs binding regions (fig. h). a monomeric form of sars-cov- s ectodomain protein also bound in a saturable manner to heparin immobilized on a plate (suppl. fig. s a). the trimeric protein bound to heparin-bsa with an apparent k d value of . nm [ % c.i. . - . nm] (fig. c). binding of recombinant s ectodomain, mutated either to lock the rbds into a closed conformation (mut ) or to favor an open conformation (mut ), showed that the heparin binding site in the rbd is accessible in both conformations (fig. d). however, the k d value for mut is lower ( . nm [ % c.i. . - . nm] vs. . nm [ % c.i. . - . nm] for mut ), which is in line with the partial obstruction of the site in the closed conformation (suppl. fig. s ). as expected, only trimer with an open rbd conformation bound to ace (fig. e). in contrast to spike protein, ace did not bind to heparin-bsa (fig. c). ace also had no effect on binding of s protein to heparin-bsa at any of the concentrations tested (fig. c, inset). biotinylated ace bound to immobilized s protein (suppl. fig. s b) and a ternary complex of heparin, ace and s protein could be demonstrated by titration of s protein bound to immobilized heparin-bsa with ace (fig. f).
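as an illustration of how an apparent k d can be read from a saturable plate-binding curve like those described above, the sketch below fits a one-site model, signal = bmax × [protein] / (k d + [protein]), by least squares. it is a minimal sketch under stated assumptions: the function names (`one_site`, `fit_one_site`), the search range for k d and the titration values are hypothetical and synthetic, and the text does not say which fitting software or model was actually used.

```python
import math

def one_site(conc, bmax, kd):
    """Signal expected for one class of independent binding sites."""
    return bmax * conc / (kd + conc)

def _best_bmax(concs, signals, kd):
    # For a fixed Kd the model is linear in Bmax, so least squares is closed-form.
    x = [c / (kd + c) for c in concs]
    return sum(a * b for a, b in zip(x, signals)) / sum(a * a for a in x)

def _sse(concs, signals, kd):
    # Sum of squared residuals at this Kd, using the best Bmax for it.
    bmax = _best_bmax(concs, signals, kd)
    return sum((y - one_site(c, bmax, kd)) ** 2 for c, y in zip(concs, signals))

def fit_one_site(concs, signals, log_lo=-3.0, log_hi=4.0, grid=141):
    """Return (bmax, kd) minimizing the sum of squared residuals."""
    # Coarse scan over log10(Kd) to bracket the minimum.
    step = (log_hi - log_lo) / (grid - 1)
    logs = [log_lo + i * step for i in range(grid)]
    best = min(logs, key=lambda lk: _sse(concs, signals, 10 ** lk))
    a, b = max(log_lo, best - step), min(log_hi, best + step)
    # Golden-section refinement inside the bracket.
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(60):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if _sse(concs, signals, 10 ** c) < _sse(concs, signals, 10 ** d):
            b = d
        else:
            a = c
    kd = 10 ** ((a + b) / 2)
    return _best_bmax(concs, signals, kd), kd

# Synthetic, noise-free titration (protein in nM, signal in arbitrary units).
# These numbers are placeholders, not measurements from this study.
concs = [1, 3, 10, 30, 100, 300, 1000]
signals = [one_site(c, 2.0, 25.0) for c in concs]
bmax_fit, kd_fit = fit_one_site(concs, signals)
```

with real plate data the fit would also need replicate handling and a confidence interval on k d, which this sketch leaves out.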
binding of ace under these conditions increased in proportion to the amount of s protein bound to the heparin-bsa. collectively, these findings show that (i) spike protein can engage both heparin and ace simultaneously and (ii) the heparin binding site is somewhat occluded in the closed conformation, but can still bind heparin, albeit with reduced affinity.

the simultaneous binding of ace to spike protein and heparin suggested the possibility that heparin binding might affect the conformation of the rbd, possibly increasing the open conformation that can bind ace . to explore this possibility, spike protein was mixed with ace ( -fold molar ratio) with or without dp oligosaccharides derived from heparin ( -fold molar ratio). the samples were then stained and analyzed by transmission electron microscopy, and the images were deconvoluted and sorted into d reconstructions to determine the number of trimers with , , , or bound ace (fig. g-h and suppl. fig. s c-d). the different populations were counted and the percentage of particles belonging to each d class was calculated. two time points were evaluated after mixing ace and trimeric s: at min , and , particles were analyzed in the absence or presence of dp oligosaccharides, respectively; at min, , and , particles were analyzed in the absence or presence of dp oligosaccharides, respectively. at both time points, the presence of dp increased the total amount of ace protein bound to spike. after minutes in the absence of dp very few of the trimers had conformations with or bound ace ( % each), whereas the inclusion of dp oligosaccharides greatly increased the proportion of trimers bearing one ( %) or two ( %) ace , with a proportional drop in the unbound conformers from % in the absence of heparin to % in its presence (fig. g). extending the incubation to minutes resulted in a mixture of trimers containing ( %), ( %) and ace ( %) in the absence of heparin.
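the class counting described above reduces to simple arithmetic, sketched here. this is only an illustration: the function names and the particle counts are hypothetical placeholders, not the numbers from the micrographs, and the dictionary keys stand for the number of ace molecules bound per trimer.

```python
def class_percentages(counts):
    """Percentage of particles in each class (key = ace copies bound per trimer)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no particles counted")
    return {bound: 100.0 * n / total for bound, n in counts.items()}

def mean_occupancy(counts):
    """Average number of ace molecules bound per trimer."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no particles counted")
    return sum(bound * n for bound, n in counts.items()) / total

# Hypothetical particle counts per class, without and with oligosaccharide.
counts_without = {0: 700, 1: 200, 2: 80, 3: 20}
counts_with = {0: 400, 1: 350, 2: 200, 3: 50}

pct_without = class_percentages(counts_without)
pct_with = class_percentages(counts_with)
occ_without = mean_occupancy(counts_without)
occ_with = mean_occupancy(counts_with)
```

reporting the mean occupancy alongside the per-class percentages separates two effects: more trimers bound at all, and more ace bound per trimer.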
inclusion of dp further increased the proportion of bound spike trimers bearing ( %), and ( %) ace (fig. h). the imaging studies suggest that, under these experimental conditions, heparin may stabilize the ace interaction, increasing the proportion of spike bound to ace as well as the occupancy of individual spikes.

the sars-cov- spike protein depends on cellular heparan sulfate for cell binding.

to extend these studies to hs on the surface of cells, s ectodomain protein was added to human h cells, an adenocarcinoma cell line derived from type alveolar cells (fig. a). spike ectodomains bound to h cells, with half-maximal binding achieved at ~ nm. treatment of the cells with a mixture of heparin lyases (hsase), which degrades cell surface hs, dramatically reduced binding (fig. a). the s ectodomain also bound to human a cells, another type alveolar adenocarcinoma line, as well as human hepatoma hep b cells (fig. b). removal of hs by enzymatic treatment dramatically reduced binding in both of these cell lines as well (fig. b). recombinant rbd protein also bound to all three cell lines in an hs-dependent manner (fig. c). a melanoma cell line, a , was tested independently and also showed hs-dependent binding (fig. d). the extent of binding across the four cell lines varied ~ -fold. this variation was not due to differences in hs expression as illustrated by staining of cell surface hs with mab e , which recognizes a common epitope in hs.

we also measured binding of the s ectodomain and rbd proteins to a library of mutant hep b cells, carrying crispr/cas induced mutations in biosynthetic enzymes essential for synthesizing hs (anower et al., ). inactivation of ext , a subunit of the copolymerase required for synthesis of the backbone of hs, abolished binding to a greater extent than enzymatic removal of the chains with hsases (fig. f and suppl. fig. s ), suggesting that the hsase treatment may underestimate the dependence on hs.
targeting ndst , a glcnac n-deacetylase-n-sulfotransferase that n-deacetylates and n-sulfates n-acetylglucosamine residues, and hs st and hs st , which introduce sulfate groups in the c position of glucosamine residues, significantly reduced binding (fig. f and suppl. fig. s ). although experiments with other sulfotransferases have not yet been done, the data suggest that the pattern of sulfation of hs affects binding to s and rbd.

to further examine how variation in hs structure affects binding, we isolated hs from human kidney, liver, lung and tonsil. the samples were depolymerized into disaccharides by treatment with hsases, and the disaccharides were then analyzed by lc-ms (experimental methods). the disaccharide analysis showed that lung hs has a larger proportion of n-deacetylated and n-sulfated glucosamine residues (grey bars) and more -o-sulfated uronic acids (green bars) than hs preparations from the other tissues (fig. a). the different hs preparations also varied in their ability to block binding of rbd to h cells (fig. b). interestingly, hs isolated from lung was more potent than kidney and liver hs, consistent with the greater degree of sulfation of hs from this organ (suppl. table ). hs from tonsil was as potent as hs from lung, but the overall extent of sulfation was not as great, supporting the notion that the patterning of the sulfated domains in the chains may affect binding.

unfractionated heparin is derived from porcine mucosa and possesses potent anticoagulant activity due to the presence of a pentasaccharide sequence containing a crucial -o-sulfated n-sulfoglucosamine unit, which confers high affinity binding to antithrombin. heparin is also very highly sulfated compared to hs, with an average negative charge of - . per disaccharide (the overall negative charge density of typical hs is - . to - . per disaccharide).
mst cells, which were derived from a murine mastocytoma, make heparin-like hs that lacks the key -o-sulfate group and anticoagulant activity (gasimli et al., ; montgomery et al., ) . the anticoagulant properties of heparin can also be removed by periodate oxidation, which oxidizes the vicinal hydroxyl groups in the uronic acids, resulting in what is called "split-glycol" heparin (casu et al., ) . all of these agents significantly inhibited binding of the s protein to h and a cells ( fig. c and d ) yielding ic values in the range of . - . µg/ml (suppl . table ). interestingly, the lack of -o-sulfation, crucial for the anticoagulant activity of heparin, had little effect on its inhibition of s binding. in contrast, cho cell hs (containing . sulfates per disaccharide) only weakly inhibited binding (ic values of and µg/ml for a and h , respectively) (suppl. table ). these data suggest that inhibition by heparinoids is most likely charge dependent and independent of anticoagulant activity per se. the experiments shown in fig. g -h indicate that binding of heparin to spike protein can increase binding to ace . to explore if hs, ace and spike interact at the cell surface, we investigated the impact of ace expression on s protein cell binding. initial attempts were made to measure ace levels by western blotting or flow cytometry with different mabs and polyclonal antibodies, but a reliable signal was not obtained in any of the cell lines tested (a , a , h , and hep b). nevertheless, expression of ace mrna was observed by rt-qpcr (suppl. fig. a ). transfection of a cells with ace cdna resulted in robust expression of ace (fig. a) , resulting in an increase in s ectodomain protein binding by ~ fold (fig. b) . interestingly, the enhanced binding was hs-dependent, as illustrated by the loss of binding of s protein after hsase-treatment (fig. b ). crispr/cas mediated deletion of the b galt gene, which is required for glycosaminoglycan assembly (suppl. fig. 
s b ), also reduced binding of spike protein (fig. b) despite the overexpression of ace (fig. a). to explore the impact of diminished ace expression, we examined spike protein binding to a cells and to two crispr/cas gene targeted clones, c and c , bearing biallelic mutations in ace (suppl. fig. s c). binding of s ectodomain protein was greatly reduced in the ace -/- clones and the residual binding was sensitive to hsases (fig. c). these findings show that binding of spike protein on cells requires both hs and ace , consistent with the formation of a ternary complex (figs. f-h).

assays using purified components provide biochemical insights into binding, but they do not recapitulate the multivalent presentation of the s protein as it occurs on the virion membrane. thus, to extend these studies, pseudotyped vesicular stomatitis virus (vsv) was engineered to express the full-length sars-cov- s protein and gfp or luciferase to monitor infection. vero e cells are commonly used in the study of sars-cov- infection, due to their high susceptibility to infection. spike protein binding to vero cells also depends on cellular hs as binding was sensitive to hsases, heparin and split-glycol heparin (fig. a). interestingly, hsase treatment reduced binding to a lesser extent than the level of reduction observed in a , heparin very potently reduced infection more than ~ -fold at . µg/ml and higher concentrations (fig. g). in contrast, studies of sars-cov- s protein pseudotype virus showed that hsase-treatment actually increased sars-cov- infection by more than -fold, suggesting that hs might interfere with binding of sars-cov- in this cell line (fig. h). infection of h and a cells by sars-cov- s pseudotype virus was too low to obtain accurate measurements, but infection of hep b cells could be readily measured (fig. i). hsase and mutations in ext and ndst dramatically reduced infection -to -fold.
inactivation of the -o-sulfotransferases had only a mild effect unlike its strong effect on s protein binding (fig. f), possibly due to the high valency conferred by multiple copies of s protein on the pseudovirus envelope. hep b cells were not susceptible to infection by sars-cov- s protein pseudotyped virus, but were infected by mers-cov s protein pseudotyped virus, and infection was independent of hs (suppl. fig. s ).

studies of pseudovirus were then extended to authentic sars-cov- virus infection using strain usa-wa / . infection of vero e cells was monitored by double staining of the cells with antibodies against the sars-cov- nucleocapsid (n) and s proteins ( heparin inhibition (maroon and blue symbols). to rule out that the treatments caused a decrease in ace expression or a reduction in cell viability, vero cells were treated with heparin lyases and µg/ml ufh, and ace expression was measured by western blotting and cell viability by celltiter-blue® (suppl. fig. s a-b). no effect on ace expression or cell viability was observed. these findings further emphasize the potential for using unfractionated heparin or other non-anticoagulant heparinoids to prevent viral attachment.

these findings were then extended to hep b cells and mutants altered in hs biosynthesis using a viral plaque assay. virus was added to wildtype, ndst -/- and hs st / -/- cells for hr, the virus was removed, and after days incubation a serial dilution of the conditioned culture medium was added to monolayers of vero e cells. the number of plaques was then quantitated by staining and visualization. as a control, culture medium from infected vero e cells was tested, which showed robust viral titers. hep b cells also supported viral replication, but to a lesser extent than vero cells. inactivation of ndst in hep b cells abolished virus production, whereas inactivation of hs st / -/- reduced infection more mildly, ~ -fold (fig. d).
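the plaque assay readout above is converted to a titer by dividing the plaque count by the dilution and the volume plated, and treatments are then compared as fold reductions. the sketch below shows that arithmetic only; the function names, plaque counts, dilutions and plated volume are hypothetical and are not values from this study.

```python
def plaque_titer(plaques, dilution, volume_ml):
    """Titer in plaque-forming units per ml from one countable dilution.

    dilution is the fraction of the original sample (e.g. 1e-3 for 1:1000).
    """
    if dilution <= 0 or volume_ml <= 0:
        raise ValueError("dilution and volume must be positive")
    return plaques / (dilution * volume_ml)

def fold_reduction(control_titer, treated_titer):
    """How many times lower the treated titer is than the control titer."""
    if treated_titer <= 0:
        return float("inf")  # no plaques recovered from the treated sample
    return control_titer / treated_titer

# Hypothetical counts: wildtype cells vs. a mutant line, 0.2 ml plated each.
wildtype_titer = plaque_titer(42, 1e-3, 0.2)
mutant_titer = plaque_titer(21, 1e-2, 0.2)
reduction = fold_reduction(wildtype_titer, mutant_titer)
```

in practice only wells with a countable number of plaques are used, and titers from replicate wells are averaged before computing the fold change.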
hsase and ufh reduced infection more than -fold, but had no effect on cell viability (suppl.

in this report, we provide compelling evidence that hs is a necessary host attachment factor that promotes sars-cov- infection of various target cells. the receptor binding domain of the sars-cov- s protein binds to heparin/hs, most likely through a docking site composed of positively charged amino acid residues aligned in a subdomain of the rbd that is separate from the site involved in ace binding (fig. ). competition studies, enzymatic removal of hs, and genetic studies confirm that the s protein, whether presented as a recombinant protein (figs. - ), in a pseudovirus (fig. ), or in authentic sars-cov- virions (fig. ), binds to cell surface hs in a cooperative manner with ace receptors. mechanistically, binding of heparin/hs to spike trimers enhances binding to ace , likely increasing multivalent interactions with the target cell. these data provide crucial insights into the pathogenic mechanism of sars-cov- infection and suggest hs-spike protein complexes as a novel therapeutic target to prevent infection.

the glycocalyx is the first point of contact for all pathogens that infect animal cells, and thus it is not surprising that many viruses exploit glycans, such as hs, as attachment factors. for example, the initial interaction of herpes simplex virus with cells involves binding to hs chains on one or more hs proteoglycans (shieh et al., ; wudunn and spear, ) through interactions with the viral glycoproteins gb and gc. viral entry requires the interaction of a specific structure in hs with a third viral glycoprotein, gd (shukla et al., ), working in concert with membrane proteins related to tnf/ngf receptors (montgomery et al., ).
similarly, the human immunodeficiency virus binds to hs by way of the v loop of the viral glycoprotein gp (roderiquez et al., ), but infection requires the chemokine receptor ccr (deng et al., ; dragic et al., ). other coronaviruses also utilize hs; for example, nl (hcov-nl ) binds hs via the viral s protein in addition to ace (lang et al., ; milewska et al., ; milewska et al., ; naskalska et al., ). in these examples, initial tethering of virions to the host cell plasma membrane appears to be mediated by hs, but infection requires transfer to a proteinaceous receptor. the data presented here show that sars-cov- requires hs in addition to ace . we imagine a model in which cell surface hs acts as a "collector" of the virus and a mediator of the rbd-ace interaction, making viral infection more efficient. hs varies in structure across cell types and tissues, as well as with gender and age (de agostini et al., ; feyzi et al., ; ledin et al., ; vongchan et al., ; warda et al., ; wei et al., ). variation in competition by hs from different tissues supports this conclusion and raises the possibility that hs contributes to the tissue tropism and the susceptibility of different patient populations, in addition to levels of expression of ace .

coronaviruses can utilize a diverse set of glycoconjugates as attachment factors. human coronavirus oc (hcov-oc ) and bovine coronavirus (bcov) bind to -n-acetyl- -o-acetylneuraminic acid (hulswit et al., ; tortorici et al., ), middle east respiratory syndrome virus (mers-cov) binds -n-acetyl-neuraminic acid (park et al., ), and guinea fowl coronavirus binds biantennary di-n-acetyllactosamine or sialic acid capped glycans (bouwman et al., ). whether sars-cov- s protein binds to sialic acid remains unclear. mapping the binding site for sialic acids in other coronavirus s proteins has proved elusive, but modeling studies suggest a location distinct from the hs binding site shown in fig.
(park et al., ; tortorici et al., ). the s protein in murine coronavirus contains both a hemagglutinin domain for binding and an esterase domain that cleaves sialic acids, which aids in the liberation of bound virions (rinninger et al., ; smits et al., ). whether sars-cov- s protein, another viral envelope protein, or a host protein contributes hs-degrading activity to aid in the release of newly made virions is unknown.

the repertoire of proteins in organisms that bind to hs makes up the so-called "hs interactome" and consists of a variety of different hs-binding proteins (hsbps) (xu and esko, ). unlike lectins that have a common fold that helps define the glycan binding site, hsbps do not exhibit a conserved motif that allows accurate predictions of binding sites based on primary sequence. instead, the capacity to bind heparin appears to have emerged through convergent evolution by juxtaposition of several positively charged amino acid residues arranged to accommodate the negatively charged sulfate and carboxyl groups present in the polysaccharide, while hydrophobic and h-bonding interactions stabilize the association. the rbds from the sars-cov- and sars-cov- s proteins are highly similar in structure (fig. g), but the electropositive surface in sars-cov- s rbd is not as pronounced in sars-cov- s rbd (fig. h). in accordance with this observation, recombinant rbd protein from sars-cov- showed significantly higher binding to heparin-bsa, compared to rbd from sars-cov- (fig. b). a priori we predicted that the evolution of the hs binding site in the sars-cov- s protein might have occurred by the addition of arginine and lysine residues to its ancestor, sars-cov- . instead, we observed that four of the six predicted positively charged residues that make up the heparin-binding site are present in sars-cov- as well as most of the other amino acid residues predicted to interact with heparin (fig. ).
sars-cov- has been shown to interact with cellular hs in addition to its entry receptors ace and transmembrane protease, serine (tmprss ) (lang et al., ). our analysis suggests that the putative heparin-binding site in sars-cov- s may mediate an enhanced interaction with heparin compared to sars-cov- , and that this change evolved through as few as two amino acid substitutions, thr lys and glu asn. further studies are underway to define the amino acid residues in the combining site for heparin/hs to test this hypothesis.

the ability of heparin and hs to compete for binding of the sars-cov- s protein to cell surface hs and the inhibitory activity of heparin towards infection of pseudovirus and authentic sars-cov- illustrate the therapeutic potential of agents that target the virus-hs interaction to control infection and transmission of sars-cov- . there is precedent for targeting protein-glycan interactions as therapeutic agents. for example, tamiflu targets influenza neuraminidase, thus reducing viral transmission, and sialylated human milk oligosaccharides can block sialic acid-dependent rotavirus attachment and subsequent infection in infants (hester et al., ; von itzstein, ). covid- patients typically suffer from thrombotic complications ranging from vascular micro-thromboses to venous thromboembolic disease and stroke, and often receive unfractionated heparin or low molecular weight heparin (thachil, ). the findings presented here and elsewhere suggest that both of these agents can block viral infection (courtney mycroft-west, ; kim et al., ; liu et al., ; mycroft-west et al., ; tandon et al., ; wu et al., ). effective anticoagulation is achieved with plasma levels of heparin of . - . units/ml. this concentration is equivalent to . - µg/ml heparin (assuming that the activity of ufh is units/mg). although this is sufficient to block spike protein binding to cells (fig.
), it would not be expected to prevent viral infection, but it should attenuate infection depending on the viral load (fig. ). the anticoagulant activity of heparin, which is typically absent in hs, is not critical for its antiviral activity based on the observation that mst derived heparin and split-glycol heparin are nearly as potent as therapeutic heparin (figs. and ). additional studies are needed to address the potential overlap in the dose response profiles for heparin as an anticoagulant and antiviral agent and the utility of non-anticoagulant heparins. antibodies directed to heparan sulfate or the binding site in the rbd might also prove useful for attenuating infection.

in conclusion, this work revealed hs as a novel attachment factor for sars-cov- and suggests the possibility of using hs mimetics, hs degrading lyases, and metabolic inhibitors of hs biosynthesis for the development of therapy to combat covid- .

further information and requests for resources should be directed to the lead contact, thomas mandel clausen (tmandelclausen@health.ucsd.edu).

all sars-cov- expression plasmids developed in this study can be made available upon request to the lead contact.

this study did not generate any unique datasets or code.

cell lines nci-h , a , hep b, a and vero e cells were from the american type culture collection (atcc). nci-h and a cells were grown in rpmi medium, whereas the other lines were grown in dmem. hep b cells carrying mutations in hs biosynthetic enzymes were previously derived from the parent hep b line as described (anower et al., ). all cell media were supplemented with % (v/v) fbs, iu/ml of penicillin and µg/ml of streptomycin sulfate, and the cells were grown under an atmosphere of % co and % air. cells were passaged at ~ % confluence and seeded as explained for the individual assays.
protein was produced in expicho or hek - e cells that were acquired from thermo fisher and grown according to the manufacturer's specifications. human bronchial epithelial cells were acquired from lonza. they were cultured in pneumacult-ex plus medium or to pneumacult-ali medium according to the manufacturer's instructions (stemcell technologies). specific details on the culture methods are described in the methods section. the collection of human tissue in this study abided by the helsinki principles and the an electrostatic potential map of the sars-cov- spike protein rbd domain was generated from a crystal structure (pdb: m ) and visualized using pymol (version . . by schrödinger). a dp fully sulfated heparin fragment was docked to the sars-cov- spike protein rbd using the cluspro protein docking server (https://cluspro.org/login.php) (kozakov et al., ; kozakov et al., ; vajda et al., ) . heparin-protein contacts and energy contributions were evaluated using the molecular operating environment (moe) software (chemical computing group). recombinant sars-cov- spike protein, encoding residues - (wuhan-hu- ; genbank: mn . ) with proline substitutions at amino acids positions and , a "gsas" substitution at the furin cleavage site (amino acids - ), twinstreptag and his x , was produced in expicho cells by transfection of x cells/ml at ºc with . µg/ml of plasmid dna using the expicho expression system transfection kit in expicho expression medium (thermofisher). one day later the cells were refed, then incubated at ºc for days. the conditioned medium was mixed with complete edta-free protease inhibitor (roche). samples of the recombinant trimeric spike protein ectodomain were diluted to . mg/ml in x tbs ph . . carbon coated copper mesh grids were glow discharged and µl of the diluted sample was placed on a grid for sec then blotted off. uniform stain was achieved by depositing µl of uranyl formate ( %) on the grid for sec and then blotted off. 
grids were transferred to a thermo fisher morgagni operating at kv. images at , magnification were acquired using a megaview k camera via the radius software. a dataset of micrographs at , x magnification and - . µm defocus was collected on a fei tecnai spirit ( kev) with a fei eagle k by k ccd camera. the pixel size was . Å per pixel and the dose was e − /Å . the leginon (suloway et al., ) software was used to automate the data collection and the raw micrographs were stored in the appion (lander et al., ) database. particles on the micrographs were picked using dogpicker, stacked with a box size of pixels, and d classified with relion . (scheres, ).
secreted human ace was transiently produced in suspension hek - e cells. a plasmid encoding residues − of ace with a c-terminal hrv- c protease cleavage site, a twinstreptag and a his x tag was a gift from jason s. mclellan, university of texas at austin. briefly, ml of hek - e cells were seeded at a cell density of . × cells/ml hr before transfection with polyethyleneimine (pei). for transfection, µg of the ace plasmid and µg of pei ( : ratio) were incubated for min at room temperature. transfected cells were cultured for hr and fed with ml fresh media for an additional hr before harvest. secreted ace was purified from culture medium by ni-nta affinity chromatography (qiagen). filtered media was mixed : (v/v) in x binding buffer ( mm tris-hcl, ph , , , m nacl) and loaded onto a self-packed column, pre-equilibrated with washing buffer ( mm tris-hcl, ph , . m nacl, mm imidazole). bound protein was washed with buffer and eluted with . m imidazole in washing buffer. the protein-containing fractions were identified by sds-page.
sars-cov- spike protein in dpbs was applied to a -ml hitrap heparin-sepharose column (ge healthcare). the column was washed with ml of dpbs and bound protein was eluted with a gradient of nacl from mm to m in dpbs.
for binding studies, recombinant spike protein and ace were conjugated with ez-link™ sulfo-nhs-biotin ( : molar ratio; thermo fisher) in dulbecco's pbs at room temperature for min. glycine ( . m) was added to quench the reaction and the buffer was exchanged for pbs using a zeba spin column (thermo fisher). heparin ( and incubated with s protein ( nm). ace binding to bound spike protein was measured as described above.
mixtures of stabilized (mut ) spike protein, x molar excess of soluble ace ectodomain, with or without x molar excess of an icosasaccharide (dp ) fragment derived from heparin were incubated at °c for min or hr. samples were diluted to . mg/ml with respect to spike protein in x pbs ph . . carbon-coated copper mesh grids were glow discharged at ma for s and µl sample was applied for s and blotted off. grids were washed five times in µl x tbs ph . for sec, then stained and blotted twice with µl % uranyl formate for sec. grids were imaged with an fei tecnai spirit ( kev) or fei tecnai f ( kev) with an fei eagle ccd ( k) camera. data were collected on the fei tecnai f at , x magnification, - . µm defocus with a pixel size of . Å per pixel. these datasets employed a box size of and comprised to micrographs. data were collected on the fei tecnai spirit as described above. data collection on both microscopes was automated through leginon (suloway et al., ), micrographs were stored in the appion (lander et al., ) database, and particles were picked with dog picker. particles were d classified with relion . (scheres, ). trimeric d classes were selected for iterative d classification with relion . . classifications were performed until d classes demonstrated ace occupancy throughout the relevant threshold-level of the spike protein as visualized using chimerax (goddard et al., ). particle counts of final d classes were obtained with relion . (scheres, ) and the percentages of particles bound to , , , or ace were calculated and visualized in graphpad prism .
cells at - % confluence were lifted with pbs containing mm edta (gibco) and fresh human tissue was washed in pbs, frozen, and lyophilized. the dried tissue was crushed into a fine powder, weighed, and resuspended in pbs containing mg/ml pronase with % ethanol (esko, ). for hs quantification and disaccharide analysis, purified hs was digested with a mixture of heparin lyases i-iii ( mu each) for hr at °c in mm ammonium acetate buffer containing
the ace expression plasmid (addgene, plasmid # ) (li et al., )
qpcr mrna was extracted from the cells using trizol (invitrogen) and chloroform and purified using the rneasy kit (qiagen). cdna was synthesized from the mrna using random primers and the superscript iii first-strand synthesis system (invitrogen). sybr green master mix (applied biosystems) was used for qpcr following the manufacturer's instructions, and the expression of tbp was used to normalize the expression of ace between the samples. the qpcr primers used were as follows: ace (human) forward: ' -cgaagccgaagacctgttcta - ' and reverse: ' -gggcaagtgtggactgttcc - '; and tbp (human) forward: ' -aacttcgcttccgctggccc - ' and reverse: ' -gaggggaggccaagccctga - '.
to generate the cas lentiviral expression plasmid, . x hek t cells were seeded onto a -cm diameter plate in dmem supplemented with % fbs. the following day, the cells were co-transfected with the pspax packaging plasmid (addgene, plasmid # ), pmd .g envelope plasmid (addgene, plasmid # ), and lenti-cas plasmid (addgene, plasmid # ) (sanjana et al., ) in dmem supplemented with fugene ( µl in µl dmem). media containing the lentivirus was collected and used to infect a wt and a wt cells, which were subsequently cultured with µg/ml and µg/ml blasticidin, respectively, to select for stably transduced cells.
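The normalization of ace expression to tbp described in the qpcr paragraph above can be sketched as follows. The source does not state which formula was applied, so the widely used comparative Ct calculation (2^-ΔCt, and 2^-ΔΔCt for fold change) is an assumption here, as is 100% amplification efficiency. The function names and every Ct value are hypothetical and chosen only for illustration.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Target expression relative to the reference gene (2^-dCt).

    Assumes perfect doubling per cycle; not stated in the source.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** -delta_ct


def fold_change(sample: tuple, control: tuple) -> float:
    """Fold change of a sample over a control (2^-ddCt).

    Each argument is a (ct_target, ct_reference) pair,
    e.g. (Ct of ace, Ct of tbp) for one cell line.
    """
    delta_sample = sample[0] - sample[1]
    delta_control = control[0] - control[1]
    return 2.0 ** -(delta_sample - delta_control)


if __name__ == "__main__":
    # Hypothetical Ct values, not data from the study.
    print(relative_expression(ct_target=25.0, ct_reference=23.0))
    print(fold_change(sample=(24.0, 23.0), control=(26.0, 23.0)))
```

Normalizing to a reference gene such as tbp corrects for differences in input rna and cdna synthesis efficiency between samples, which is why a raw Ct for the target alone is not compared across cell lines.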
a single guide rna (sgrna) targeting ace ( '-tggatacatttgggcaagtg - ') and one targeting b galt ( '-tgacctgctccctctcaacg- ') were cloned into the lentiguide-puro plasmid (addgene plasmid # ) following a published procedure (sanjana et al., ). the lentiviral sgrna construct was generated in hek t cells, using the same protocol as for the cas expression plasmid, and used to infect a -cas and a -cas cells to generate crispr knockout mutant cell lines. after infection, the cells were cultured with µg/ml puromycin to select for cells with stably integrated lentivirus. after d, the cells were serially diluted into -well plates. single colonies were expanded and dna was extracted using the dneasy blood and tissue dna isolation kit (qiagen). proper editing was verified by sequencing (genewiz inc.) and gene analysis using the online ice tool from synthego (suppl. fig. ).
vesicular stomatitis virus (vsv) pseudotyped with spike proteins of sars-cov- was generated according to a published protocol (whitt, ). briefly, hek t cells, transfected to express full-length sars-cov- spike proteins, were inoculated with vsv-g pseudotyped ∆g-luciferase or gfp vsv (kerafast, ma). after hr at °c, the inoculum was removed and cells were refed with dmem supplemented with % fbs, u/ml penicillin, µg/ml streptomycin, and vsv-g antibody (i , mouse hybridoma supernatant from crl- ; atcc). pseudotyped particles were collected hr post-inoculation, centrifuged at , × g to remove cell debris and stored at − °c until use.
briefly, µl of luciferin lysis solution was added to the cells and incubated for min at room temperature. the solution was transferred to a black -well plate and luminescence was detected using an enspire multimodal plate reader (perkin elmer). data analysis and statistical analysis were performed in prism .
fluor labeling kits (invitrogen), respectively. zombie uv™ was used to gate for live cells in the analysis. cells were then analyzed using an ma cell sorter (sony).
for days.
fresh medium, µl in the apical chamber and µl in the basal chamber, was added daily. at day , the medium in the apical chambers was removed, and the basal chambers were changed every - days with apical washes with pbs every week for days. the apical side of the hbec ali culture was gently washed three times with µl of phosphate buffered saline without divalent cations (pbs-/-). heparinase was added to the apical side for half an hour prior to infection. an moi of . of authentic sars-cov- live virus (usa-wa / (bei resources, #nr- )) in µl total volume of pbs was added to the apical chamber with either dmso, heparinase ( . mu/ml heparin lyase ii, and mu/ml heparin lyase iii (ibex)) or ug/ml of unfractionated heparin. cells were incubated at c and % co for hours. unbound virus was removed, the apical surface was washed and the compounds were re-added to the apical chamber. cells were incubated for another hours at c and % co . after inoculation, cells were washed once with pbs-/-and µl tryple (thermofisher) was added to the apical chamber then incubated for min in the incubator. cells were gently pipetted up and down and transferred into a sterile ml conical tube containing neutralizing medium of dmem + % fbs. tryple was added again for rounds of minutes for a total of min to clear transwell membrane. cells were spun down and resuspended in pbs with zombie uv viability dye for min in room temp. cells were washed once with facs buffer then fixed in % pfa for min at room temp. pfa was washed off and cells were resuspended in pbs. zombie uv™ was used to gate for live cells in the analysis. infection was analyzed by flow cytometry as explained above. cell viability was assessed using the celltiter-blue® assay (promega). briefly, vero cells were seeded into a well plate. the cells were treated with hsase mix ( . mu/ml hsase ii, and mu/ml hsase iii; ibex) or µg/ml ufh for hrs. 
cell viability was measured with celltiter-blue® according to the manufacturer's protocol. briefly, the celltiter-blue® reagent was added directly to the cell culture and the cells were incubated overnight. fluorescence was read at excitation nm and emission nm, using an enspire multimodal plate reader (perkin elmer). data analysis was performed in prism.
the human bronchial epithelial cells were grown at an air-liquid interface as explained above. cell viability after treatment with hsase mix ( . mu/ml hsase ii, and mu/ml hsase iii; ibex) or µg/ml ufh for hrs was measured by adding celltiter-blue® reagent directly to the transwell inserts and developed as explained above.
all statistical analyses were performed in prism (graphpad). all experiments were performed in triplicate and repeated as indicated in the figure legends. data were analyzed statistically using unpaired t-tests when two groups were compared or by one-way anova without post-hoc correction for multiple comparisons. ic values and confidence intervals were determined by non-linear regression with the inhibitor vs. response least squares fit algorithm. the error bars in the figures show the mean plus standard deviation (sd). the specific statistical tests used are listed in the figure legends and in the methods section. experiments were evaluated by statistical significance according to the following scheme: ns: p > . , *: p ≤ . , **: p ≤ . , ***: p ≤ . , ****: p ≤ . .
after hr, cell culture supernatants were collected and stored at - °c. virus titers were determined by plaque assays on vero e monolayers greiner bio-one, # ) and rocked for hr at room temperature. the cells were subsequently overlaid with mem containing % cellulose the plaques were visualized by fixation of the cells with a mixture of % formaldehyde and % methanol (v/v in water) for hr. the monolayer was washed once with pbs and stained with .
% crystal violet (millipore sigma # v ) prepared in % ethanol the pennsylvania state university, following the guidelines approved by the institutional biosafety committees. human bronchial epithelial cell air-liquid interface generation and infection human bronchial epithelial cells (hbecs, lonza) were cultured in t flasks in plus medium according to manufacturer instructions (stemcell technologies) to generate air-liquid interface (ali) cultures, hbecs were plated on collagen i-coated well transwell inserts with a . -micron pore size (costar, corning) at x cells/ml. cells were maintained for - days in pneumacult-ex plus medium until confluence, then changed to pneumacult-ali medium triglyceride-rich lipoprotein binding and uptake by heparan sulfate proteoglycan receptors in a crispr/cas library of hep b mutants remdesivir for the treatment of covid- -preliminary report guinea fowl coronavirus diversity has phenotypic consequences for glycan and tissue binding heparan sulfate proteoglycans and viral attachment: true receptors or adaptation bias? 
viruses undersulfated and glycol-split heparins endowed with antiangiogenic activity the coronavirus (sars-cov- ) surface protein (spike) s receptor binding domain undergoes conformational change upon heparin binding identification of a major co-receptor for primary isolates of hiv- hiv- entry into cd + cells is mediated by the chemokine receptor cc-ckr- special considerations for proteoglycans and glycosaminoglycans and their purification order out of chaos: assembly of ligand binding sites in heparan sulfate age-dependent modulation of heparan sulfate structure and function bioengineering murine mastocytoma cells to produce anticoagulant heparin ucsf chimerax: meeting modern challenges in visualization and analysis structural analysis of urinary glycosaminoglycans from healthy human subjects human milk oligosaccharides inhibit rotavirus infectivity in vitro and in acutely infected piglets human coronaviruses oc and hku bind to -o-acetylated sialic acids via a conserved receptor-binding site in spike protein domain a loss of bcl- -expressing t follicular helper cells and germinal centers in covid- stabilized coronavirus spikes are resistant to conformational changes induced by receptor recognition or proteolysis initial step of virus entry: virion binding to cell-surface glycans how good is automated protein docking? 
the cluspro web server for protein-protein docking appion: an integrated, database-driven pipeline to facilitate em image processing inhibition of sars pseudovirus cell entry by lactoferrin binding to heparan sulfate proteoglycans evolutionary differences in glycosaminoglycan fine structure detected by quantitative glycan reductive isotope labeling heparan sulfate structure in mice with genetically modified heparan sulfate production assessing ace expression patterns in lung tissues in the pathogenesis of covid- angiotensin-converting enzyme is a functional receptor for the sars coronavirus proteoglycans and sulfated glycosaminoglycans sars-cov- spike protein binds heparan sulfate in a length-and sequence-dependent manner entry of human coronavirus nl into the cell human coronavirus nl utilizes heparan sulfate proteoglycans for attachment to target cells stable heparin-producing cell lines derived from the furth murine mastocytoma herpes simplex virus- entry into cells mediated by a novel member of the tnf/ngf receptor family heparin inhibits cellular invasion by sars-cov- : structural dependence of the interaction of the surface protein (spike) s receptor binding domain with heparin membrane protein of human coronavirus nl is responsible for interaction with the adhesion receptor structures of mers-cov spike glycoprotein in complex with sialoside attachment receptors localisation and distribution of o-acetylated n-acetylneuraminic acids, the endogenous substrates of the hemagglutinin-esterases of murine coronaviruses, in mouse tissue mediation of human immunodeficiency virus type binding by interaction of cell surface heparan sulfate proteoglycans with the v region of envelope gp -gp improved vectors and genome-wide libraries for crispr screening relion: implementation of a bayesian approach to cryo-em structure determination cell surface receptors for herpes simplex virus are heparan sulfate proteoglycans a novel role for -o-sulfated heparan sulfate in herpes 
simplex virus entry nidovirus sialate-o-acetylesterases: evolution and substrate specificity of coronaviral and toroviral receptor-destroying enzymes the sweet spot: defining virus-sialic acid interactions automated molecular microscopy: the new leginon system effective inhibition of sars-cov- entry by heparin and enoxaparin derivatives. biorxiv the versatile heparin in covid- structural basis for human coronavirus attachment to sialic acid receptors the war against influenza: discovery and development of sialidase inhibitors structural characterization of human liver heparan sulfate dog picker and tiltpicker: software tools to facilitate particle selection in single particle electron microscopy function, and antigenicity of the sars-cov- spike glycoprotein isolation and characterization of heparan sulfate from various murine tissues site-specific glycan analysis of the sars-cov- spike a comprehensive compositional analysis of heparin/heparan sulfate-derived disaccharides from human serum generation of vsv pseudotypes using recombinant deltag-vsv for studies on virus entry, identification of entry inhibitors, and immune responses to vaccines cryo-em structure of the -ncov spike in the prefusion conformation vaccines and therapies in development for sars-cov- infections initial interaction of herpes simplex virus with cells is binding to heparan sulfate demystifying heparan sulfate-protein interactions structural basis for the recognition of sars-cov- by full-length human ace cov- spike protein interacts with heparan sulfate and ace through the rbd • heparan sulfate promotes spike-ace interaction • sars-cov- infection is co-dependent on heparan sulfate and ace • heparin and non-anticoagulant derivatives block sars-cov- binding and infection in brief provide evidence that heparin sulfate is a necessary co-factor for sars-cov- infection. 
they show that heparin sulfate interacts with the receptor binding domain of the sars-cov- spike glycoprotein
we thank scott selleck (the pennsylvania state university), eugene yeo (uc san diego), john guatelli (uc san diego), mark fuster (uc san diego) and stephen schoenberger (la jolla institute for immunology) for many helpful discussions, and annamaria naggi and giangiacomo torri from the ronzoni institute for generously providing split-glycol heparin. this
key: cord- - e lonq authors: cullen, bryan r. title: viral rnas: lessons from the enemy date: - - journal: cell doi: . /j.cell. . . sha: doc_id: cord_uid: e lonq
viruses are adept at evolving or co-opting genomic elements that allow them to maximize their replication potential in the infected host. this evolutionary plasticity makes viruses an invaluable system to identify new mechanisms used not only by viruses but also by vertebrate cells to modulate gene expression. here, i discuss the identification and characterization of viral mrna structures and noncoding rnas that have led to important insights into the molecular mechanisms of eukaryotic cells.
the hiv- tat protein activates transcription of the hiv- provirus by recruiting the cellular p-tefb complex to the viral tar rna hairpin. nuclear export of incompletely spliced hiv- transcripts is facilitated by the rre rna structure, which recruits a complex consisting of the viral rev protein and cellular crm .
a similar function is performed by the cte rna structure present in mpmv, which recruits the cellular tap nuclear export factor. the cytoplasmic translation of picornaviral mrnas is facilitated by ires elements, and the translation of retroviral and coronaviral mrnas is modulated by sequences that induce ribosomal frameshifting. finally, the translation of both viral and cellular mrnas can be specifically downregulated by virally encoded micrornas. to tar, p-tefb mediates the phosphorylation of negative regulators of transcription elongation and of the carboxy-terminal domain of rnap ii molecules that have initiated transcription of hiv- proviral dna. these phosphorylation events render rnap ii elongation competent and allow it to transcribe the entire viral genome (barboric and peterlin, ; kao et al., ) . in contrast, in the absence of tat or tar, transcription initiation at the ltr promoter still occurs but almost all of these initiating rnap ii molecules fall off the dna template within ~ bp of the transcription start site. analysis of tat function led to the realization that not only transcription initiation but also transcription elongation can regulate gene expression levels in animal cells (barboric and peterlin, ) . almost all retroviruses contain a single rnap ii-dependent promoter element in the viral ltr that drives transcription of an initial genome-length rna that also acts as an mrna for translation of the viral gag and pol proteins (cullen, ) . in the case of hiv- , this initial transcript can also be processed into fully spliced transcripts encoding the tat and rev proteins of hiv- as well as the auxiliary protein nef. alternatively, this transcript can be processed into partially spliced mrnas encoding the three other viral auxiliary proteins vif, vpu, and vpr. the hiv- replication cycle, therefore, requires that the single initial viral transcript is exported out of the nucleus in several differentially spliced forms. 
these include an unspliced form that programs gag and pol expression and that is packaged into virion progeny; partially spliced forms that program expression of env, vif, vpr, and vpu; and fully spliced forms that program expression of tat, rev, and nef (cullen, ) . the difficulty with this scenario is that eukaryotic cells do not normally permit the nuclear export of intron-containing mrnas. almost all cellular mrnas are transcribed as intron-containing pre-mrnas, and these introns are recognized in the nascent transcript by splicing factors, including commitment factors. commitment factors both commit the pre-mrna to the splicing pathway and retain the pre-mrna in the nucleus until all introns are removed (legrain and rosbash, ) . hiv- mrnas rely entirely on cellular factors for appropriate splicing, and intron-containing hiv- transcripts are therefore also retained in the infected cell nucleus by splicing commitment factors. the strategy that hiv- has evolved to circumvent this nuclear retention is dependent on the viral rev protein, which is translated from a fully spliced viral mrna that is constitutively exported from the nucleus. as a result, hiv- mutants lacking a functional rev gene are able to express the proteins encoded by fully spliced viral mrnas, that is, tat, nef, and the defective rev protein itself, but cannot express any of the proteins encoded by incompletely spliced viral mrnas, including gag, pol, and env. the transcripts encoding these viral structural proteins, however, can be detected in the nucleus of cells infected by rev-deficient viruses, where they are either degraded or eventually fully spliced and then exported (cullen, ) . rev function requires a highly structured rna target, located in the hiv- env gene, called the rev response element or rre (malim et al., ) . the rre contains a single, high-affinity rev-binding site and also functions as a scaffold for the multimerization of rev on viral mrnas. 
rev in turn interacts with a cellular factor called crm that belongs to the karyopherin family of nucleocytoplasmic transport proteins (fischer et al., ). this interaction is mediated by a leucine-rich motif located toward the carboxyl terminus of rev that was the first nuclear export signal (nes) to be identified and is the prototype of the leucine-rich class of ness. karyopherin function is regulated by the action of a g protein called ran, which, like all g proteins, is active when bound by gtp and inactive when bound by gdp (kohler and hurt, ). cells contain high levels of a ran-specific guanine nucleotide exchange factor (gef) in the nucleus and of a ran-specific gtpase activating protein (gap) in the cytoplasm. as a result, ran:gtp is largely nuclear and ran:gdp is mainly cytoplasmic. ran:gtp binds to crm in the nucleus and activates the binding of crm to leucine-rich ness. the ribonucleoprotein complex, consisting of ran:gtp, crm , and rev, that forms on the hiv- rre directs incompletely spliced hiv- transcripts to the nuclear pore complex and then into the cytoplasm, where hydrolysis of the gtp moiety by cytoplasmic gap disassembles this complex.
although rev was the first nuclear mrna export factor to be identified, it soon became clear that crm is not required for the nuclear export of most cellular mrnas. in fact, crm is involved largely in the nuclear export of small nuclear rnas (snrnas) and preribosomal subunits, as well as in protein nuclear export (kohler and hurt, ). so which factors are required for the export of cellular mrnas? an important part of the answer emerged from analysis of a second retrovirus called mason-pfizer monkey virus (mpmv). mpmv has a simpler genomic organization than hiv- and only encodes the three structural proteins gag, pol, and env.
nevertheless, mpmv expresses both a genome-length gag/ pol mrna and a spliced env mrna. as mpmv does not encode a rev homolog, how do the incompletely spliced genomic mpmv mrnas reach the cytoplasm? this question led to the discovery of an rna stem-loop structure within the mpmv genome, the constitutive transport element (cte), that mediates the nuclear export of incompletely spliced mrnas in the absence of any viral proteins (bray et al., ) . further analysis revealed that the cte recruits a heterodimer of two cellular proteins, tap and p , that also plays a critical role in the nuclear export of the majority of cellular mrnas (grüter et al., ; kohler and hurt, ) . normally, the tap/p heterodimer is only recruited to mature, fully spliced mrnas. however, the cte is able to prematurely recruit tap/p to partially spliced mrnas and thereby circumvents the nuclear retention of incompletely spliced mpmv mrnas. although ctes have now been defined in several other exogenous and endogenous retroviruses, not all ctes act by directly recruiting tap/p . in particular, the avian leukemia virus cte does not appear to bind to tap or p directly, although tap may be required for its function (leblanc et al., ) . further analysis may reveal new insights into how the export of retroviral nuclear mrnas is regulated. after an mrna is exported to the cytoplasm, it must recruit cellular ribosomes in order for the translation of the encoded open reading frame to occur (figure ). picornaviruses presented two mysteries in terms of how these pathogenic viruses are able to translate the single large polyprotein encoded by their positive-sense rna genome. first, the single genome-length picornavirus mrna is uncapped. second, infection by picornaviruses such as poliovirus results in the efficient translation of viral mrnas, yet cellular mrna translation is largely blocked. so, why is this uncapped viral mrna translated more efficiently than capped cellular mrnas? 
the key discovery that led to the resolution of this conundrum was the identification of the poliovirus internal ribosome entry site (ires), a ~ nucleotide (nt) highly structured rna element found in the ′ untranslated region ( ′utr) of poliovirus mrnas (jang et al., ; pelletier and sonenberg, ). the ires directly recruits several eukaryotic initiation factors (eifs) and the s ribosomal subunit to an internal viral translation initiation codon without the requirement for either cap binding or ′utr scanning. as a result, poliovirus translation is independent of the host cell cap recognition factor eif e. moreover, although poliovirus translation initiation does require eif g, it functions perfectly well with the carboxy-terminal fragment of eif g that is generated by the proteolytic cleavage of eif g by a virus-encoded protease. because cap-dependent translation requires full-length eif g, this cleavage blocks host cell translation, whereas viral mrna translation is not only unimpeded but in fact is enhanced by the access of viral mrnas to the entire pool of available eifs and ribosomal subunits (martinez-salas et al., ).
subsequent work has demonstrated that all picornaviruses as well as some flaviviruses, including hepatitis c virus (hcv), contain ires elements. surprisingly, these exist in several functionally distinct classes. for example, the hcv ires, unlike the poliovirus ires, can directly recruit s ribosomal subunits to the viral internal translation initiation codon in the absence of eifs, although eifs do participate in the process of translation initiation (martinez-salas et al., ). an even more unusual ires is found in cricket paralysis virus (crpv), a picornavirus-like insect virus (jan et al., ; pestova and hellen, ).
the crpv ires not only is able to recruit both the s and s ribosomal subunits to assemble elongation-competent s ribosomes on viral mrnas but also acts as a mimic of met-trnai to permit initiation of the translation of viral capsid proteins in the absence of met-trnai. although ires elements were first discovered in rna viruses, a subset of cellular mrnas are now known to also contain iress. interestingly, iress seem to be especially prevalent in mrnas whose expression is activated by stress, when cap-dependent translation may be inefficient. many ires-containing host mrnas encode proteins that protect cells from stress, whereas the proteins encoded by other ires-containing cellular mrnas seem to be important during apoptosis (bushell et al., ; komar and hatzoglou, ) . another interesting translational phenomenon observed in several virus families, including many species of retroviruses and all coronaviruses, is programmed ribosomal frameshifting (brierley and dos ramos, ) . in retroviruses such as hiv- , frameshifting prevents some ribosomes from terminating translation at the end of the open reading frame (orf) encoding the gag structural protein and instead induces ribosomes to shift into the overlapping pol orf, resulting in the production of the large gag-pol polyprotein. similarly, in coronaviruses, ribosomal frameshifting is used to produce the a/ b replicase polyprotein rather than the shorter a variant. frameshifting is induced by a bipartite element consisting of a ′ frameshifting site and an adjacent ′ rna structure (brierley and dos ramos, ; jacks et al., ) . the frameshift site has the consensus sequence x_xxy_yyz (where the translational phase is indicated), which then slips into the − frame, that is, xxx_yyy_z. the actual shift sites found in hiv- and the sars coronavirus are u_uuu_uua and u_uua_aac, respectively. 
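The consensus shift site described above can be illustrated with a short sketch. It checks whether a heptamer fits the x_xxy_yyz pattern and lists the codons read in the original frame and after slipping into the −1 frame. Only the identity constraints stated in the text are enforced (three identical x bases, three identical y bases, any z); the function names are hypothetical, and the downstream rna structure that the text says is also needed for efficient frameshifting is not modeled.

```python
def is_slippery_site(heptamer: str) -> bool:
    """Return True if a 7-nt RNA sequence fits the X_XXY_YYZ consensus."""
    s = heptamer.upper().replace("_", "")
    if len(s) != 7 or set(s) - set("ACGU"):
        return False
    # Positions 0-2 must share one base (X); positions 3-5 must share one base (Y).
    return len(set(s[0:3])) == 1 and len(set(s[3:6])) == 1


def codons_by_frame(heptamer: str) -> dict:
    """Codons read over the heptamer before and after the -1 slip."""
    s = heptamer.upper().replace("_", "")
    return {
        "zero_frame": [s[1:4], s[4:7]],       # X_XXY_YYZ -> XXY, YYZ
        "minus_one_frame": [s[0:3], s[3:6]],  # XXX_YYY_Z -> XXX, YYY
    }


if __name__ == "__main__":
    # The two shift sites quoted in the text.
    for site in ("u_uuu_uua", "u_uua_aac"):
        print(site, is_slippery_site(site), codons_by_frame(site))
```

For the coronavirus site u_uua_aac, the zero-frame codons uua and aac become uuu and aaa after the slip, so the trna anticodons can re-pair with the new codons at most positions, which is the rationale for the repeated x and y bases in the consensus.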
the ′ rna structure found in hiv- is thought to be a simple rna hairpin but other − frameshifting signals instead contain a pseudoknot ′ to the frameshift signal (brierley and dos ramos, ) . it has been proposed that the function of the ′ rna structure is to induce transient ribosomal pausing at the frameshift site to facilitate ribosomal slippage in the − direction. although frameshifting in hiv- occurs with an efficiency of ~ %, frameshifting efficiency in other viruses can be as high as ~ % and may be facilitated by a direct interaction between the paused ribosome and the downstream pseudoknot structure. programmed frameshifting is not unique to viruses but is found also in a small number of cellular genes in both eukaryotes and bacteria (shigemoto et al., ; tsuchihashi and kornberg, ) . viruses, rna interference, and micrornas rna interference (rnai) was first discovered by genetic analysis in nematodes (fire et al., ) ; however, it is likely that rnai first evolved as an innate immune response to viral infection. indeed, rnai continues to represent a key component of the antiviral response in plants and invertebrates (cullen, ) . the triggers for rnai in these species are the long double-stranded rnas (dsrnas) that form critical intermediates in the replication of all rna viruses except retroviruses. these dsrnas are bound by the rnase iii-related enzyme dicer, which progressively cleaves these dsrnas into ~ bp rna duplexes containing terminal nt ′ overhangs (see review by r.w. carthew and e.j. sontheimer in this issue of cell). one strand of this duplex, called a small-interfering rna (sirna), is then incorporated into the rna-induced silencing complex (risc), where it acts as a guide for rna to target risc to complementary regions of viral genomic, anti-genomic, or mrna species. risc then cleaves these viral rnas, leading to their degradation. 
the first sirnas to be identified were in fact antiviral sirnas produced in tobacco cells infected by a pathogenic rna virus, potato virus x (hamilton and baulcombe, ) . rnai is a critical component of the antiviral immune response in plants and invertebrates, but emerging evidence indicates that rnai responses to viral infection are not induced in mammalian somatic cells (cullen, ) . instead, mammalian cells have evolved other innate responses that are induced by viral dsrnas, including the interferon response. because of the importance of rnai as an antiviral defense in plants and insects, many rna viruses that infect these species have evolved gene products that inhibit rnai and, hence, enhance virus replication. conversely, the absence of antiviral rnai responses in mammalian cells means that the rnai machinery in these cells generally remains active during viral infection (figure ) . although the role of rnai as an antiviral response appears to have been lost in mammalian somatic cells, the residual rnai machinery still plays a very important role by mediating the function of cellular micrornas (mirnas). unlike sirnas, which are derived from long dsrnas (frequently of exogenous origin), mirnas are encoded within the cell's genome as part of one arm of an ~ nt rna hairpin located in a larger rnap ii transcript called a primary mirna (bartel, ) . after excision, by the sequential action of the host cell rnase iii-related enzymes drosha and dicer, mirnas are loaded into risc and downregulate the expression of cellular mrnas. unlike viral mrna targets of viral sirnas, cellular mrnas are rarely fully complementary to cellular mirnas. as full complementarity is a prerequisite for efficient cleavage by risc, cellular mrnas are generally not subject to degradation by cellular mirnas. instead, cellular mirnas can induce the translational inhibition of cellular mrnas by binding to partially complementary target sites (bartel, ) . 
as most mammalian viruses do not seem to interfere with the loading or function of risc, mirnas remain active in infected cells, thus offering the possibility for viruses to use the cellular rnai machinery to regulate cellular or viral gene expression by programming risc with viral mirnas. analysis of a range of virally infected cells has revealed that several dna viruses, including herpesviruses, encode multiple distinct mirnas. of note, most viral mirnas appear to be processed by the same drosha and dicer dependent pathway used by the majority of cellular mirnas, although there are a few examples of viral mirnas that are transcribed by rna polymerase iii, not rnap ii, and then excised directly by dicer (gottwein and cullen, ) . similarly, riscs programmed by viral mirnas appear to function in the same way as riscs programmed by cellular mirnas. although beyond the scope of this article, it is interesting to note that several cellular and viral mrna targets of viral mirnas have now been defined (gottwein and cullen, ) . in general, it appears that viral mirnas either downregulate cellular or viral genes that increase the sensitivity of virally infected cells to host innate or adaptive immune responses or, in the case of herpesvirus mirnas, stabilize viral latency by downregulating the expression of the viral immediate early proteins, which favor entry into the lytic replication cycle. in addition to mirnas, a number of dna viruses also encode long noncoding rnas that play a role in regulating viral replication and pathogenesis (table ; reviewed in sullivan and cullen, ; see review by c.p. ponting, p.l. oliver, and w. reik in this issue of cell). but how do viruses use noncoding rnas to promote their replication? one interesting noncoding rna is the latency associated the instability of the . kb lat rna appears to be due to the fact that it is processed into several viral mirnas that may play a key role in regulating hsv- latency (umbach et al., ) . 
the role of the stable kb lat intron is less clear, but evidence has been presented arguing that the lat intron is exported out of the nucleus by crm and associates with cellular ribosomes, thus suggesting a role in modulating mrna translation in neurons latently infected with hsv- (atanasiu and fraser, ). another interesting viral noncoding rna is the polyadenylated nuclear (pan) rna encoded by kaposi's sarcoma-associated herpesvirus (kshv). pan is an unspliced rnap ii transcript that is the most highly expressed viral rna during lytic kshv infection, comprising up to % of all viral rnas. remarkably, the function of this rna in the viral life cycle is still unclear. however, recent data demonstrate that pan contains a novel ~ nt-long rna element that stabilizes pan rna in the infected cell nucleus. insertion of this viral rna element in cis also increases the nuclear abundance of cellular mrnas, such as β-globin mrnas, that are normally unstable when expressed in an intronless form (conrad et al., ). it is unclear whether this element simply stabilizes pan rnas or whether it also acts in trans to stabilize other kshv-coding mrnas, which are also largely intronless. finally, several viral noncoding rnas seem to be inhibitors of cellular innate antiviral immune responses. for example, the β . noncoding rna encoded by human cytomegalovirus (hcmv) binds to mitochondrial enzyme complex i of the host cell. this interaction stabilizes the production of atp in infected cells and also inhibits virally induced apoptosis (reeves et al., ). another noncoding rna, the va rna expressed by adenovirus, also binds to a cellular factor to inhibit an antiviral response. in this case, the target is protein kinase r (pkr), a cellular protein that binds to the long dsrnas produced by adenoviruses and many other pathogenic viruses.
binding of dsrna by pkr induces pkr dimerization and autophosphorylation as well as phosphorylation of cellular eif α, which results in a global inhibition of translation in the infected host cell. va , a highly structured ~ nt-long noncoding rna, binds to pkr with high affinity and blocks pkr dimerization and activation. this prevents the inhibition of translation induced by adenovirus-derived dsrnas and allows virus replication to proceed unimpeded (mathews and shenk, ). interestingly, both hcmv β . and adenovirus va , like kshv pan and hsv- lat, are also expressed at high levels in infected cells. β . comprises up to % of all viral transcripts in hcmv-infected cells, and adenovirus va is expressed at an extraordinarily high number of copies (~ ) per infected cell. presumably, these high expression levels facilitate the saturation of cellular binding sites for these rnas. efforts to understand the replication cycles of viruses are often motivated by the pathogenic potential of these intracellular parasites. however, such analyses have also led to several key insights into how not only infected cells but also uninfected cells regulate the expression of their genome. moreover, as noted in the brief discussion of viral noncoding rnas, our knowledge of how virally encoded transcripts work in the host cell remains far from complete. clearly, future research into virus replication will provide unexpected and exciting insights into the complex molecular machinery that makes human cells tick.
key: cord- -tysyq o authors: thomas, sheila m.; lamb, robert a.; paterson, reay g. title: two mrnas that differ by two nontemplated nucleotides encode the amino coterminal proteins p and v of the paramyxovirus sv date: - - journal: cell doi: . 
/s - ( ) - sha: doc_id: cord_uid: tysyq o the “p” gene of the paramyxovirus sv encodes two known proteins, p (m(r) ≈ , ) and v (m(r) ≈ , ). the complete nucleotide sequence of the “p” gene has been obtained and is found to contain two open reading frames, neither of which is large enough to encode the p protein. we have shown that the p and v proteins are translated from two mrnas that differ by the presence of two nontemplated g residues in the p mrna. these two additional nucleotides convert the two open reading frames to one of amino acids. the p and v proteins are amino coterminal and have amino acids in common. the unique c terminus of v consists of a cysteine-rich region that resembles a cysteine-rich metal binding domain. an open reading frame that contains this cysteine-rich region exists in all other paramyxovirus “p” gene sequences examined, which suggests that it may have important biological significance. in recent years the catalog of mechanisms identified as having a role in the processing or modification of the initial rna transcripts to yield mature mrnas has increased markedly. in eukaryotic cells the most common process found is splicing of the primary rna transcript (padgett et al., ). less common variations on this splicing theme are alternative splicing, such that exons are excluded from some, but not all, of the mature mrna, giving rise to sequence diversity in the encoded proteins (for review, see breitbart et al., ), and trans splicing, in which two independently transcribed rnas are ligated to form the mature mrna (konarska et al., ; solnick, ; murphy et al., ; sutton and boothroyd, ; krause and hirsh, ; koller et al., ).
mechanisms of rna transcript modification other than splicing include the phenomenon of rna-editing in mitochondrial transcripts from trypanosomes, which is characterized by the presence in the mature mrna of uridine residues that are not encoded in the gene (benne et al., ; feagin et al., ; shaw et al., ). in addition, a process related to rna-editing is thought to occur in primary transcripts of the mammalian apolipoprotein-b gene, as two discrete mrnas have been found, one of which has a u residue in place of a templated c (powell et al., ; chen et al., ). in animal virus infected cells, many examples of spliced and alternatively spliced viral mrnas have been identified. in addition, an unusual cotranslational modification has been identified in vaccinia virus late transcripts that possess a poly(a) leader at the ' end not encoded by the virus genome (bertholet et al., ; schwer et al., ). the ' poly(a) region is thought to be added by the vaccinia polymerase "stuttering" at a series of t residues on the template dna (schwer and stunnenberg, ). in many virus systems, in addition to modification of the primary rna transcripts, the maximum protein coding potential on an mrna is exploited by the use of alternative translation strategies that also have a potential role in the regulation of viral gene expression.
such mechanisms include translation from overlapping reading frames as observed for the coat, lysis, and synthetase proteins of bacteriophage ms (atkins et al., ) and the adenovirus e b proteins (bos et al., ); ribosomal frameshifting that occurs to yield the gag-pol fusion proteins of rous sarcoma virus, human immunodeficiency virus, or mouse mammary tumour virus (jacks and varmus, ; jacks et al., ; varmus, ), and that also occurs in the polymerase encoding region of infectious bronchitis virus (boursnell et al., ; brierley et al., ); and the use of suppressor trnas to overcome translation termination during translation of the gag-pol fusion protein in moloney murine leukemia virus (yoshinaka et al., ) and the nsp protein of sindbis virus (strauss et al., ). in negative strand rna viruses, several of the processes discussed above are involved in the regulation of viral gene expression. for instance, in influenza a and b viruses, both spliced and unspliced mrnas that are translated to yield polypeptides from overlapping reading frames have been identified (lamb and lai, ; briedis and lamb, ), and influenza a viruses also provide an example of alternatively spliced mrnas (lamb et al., ). translation from overlapping reading frames on functionally bicistronic mrnas has been shown to be the mechanism used to yield the na and nb glycoproteins of influenza b virus (shaw et al., ) and the p and c proteins of the paramyxoviruses sendai virus and parainfluenza virus and the morbillivirus, measles virus (giorgi et al., ; gupta and kingsbury, ; curran et al., ; luk et al., ; galinski et al., ; spriggs and collins, ; bellini et al., ). simian virus (sv ), a prototype paramyxovirus, has a single-stranded, negative sense genomic rna (vrna) approximately , nucleotides in chain length that is transcribed in infected cells by the virion-associated rna transcriptase to yield virus-specific mrnas.
figure . nucleotide sequence of clone p - in the mrna sense and the predicted amino acid sequence of the two open reading frames. nucleotide is the ' terminal nucleotide of the p and v mrnas. after nucleotide there is a stretch of a residues in clone p - (not shown) that is thought to represent part of the poly(a) tail on the mrna. the amino acid numbering of the + reading frame is adjusted to conform with the residues predicted to exist in the p protein (see figure b). the sequence has been deposited in the embl/genbank data base (accession no. j ).
the sv "p" gene has been shown to encode both the p protein (mr = , ), and protein v (mr = , ) by the arrest of translation in vitro of both p and v using a cdna clone derived from sv -specific mrnas (paterson et al., ). in addition, the p and v proteins have been shown to have tryptic peptides in common, although no precursor-product kinetics could be demonstrated (paterson et al., ; our unpublished data
and is part of the transcriptase complex (buetti and choppin, ) , while protein v is found in infected cells and is of unknown function (peluso et al., ) . the mechanism by which both the sv p and v proteins are encoded by a single gene has been investigated and we report here that p and v are amino coterminal proteins with different c-termini and are encoded by two separate mrnas that differ by two nontemplated nucleotides. to investigate the coding strategy used to express both the p and v proteins from a single gene on the sv virion rna, the complete nucleotide sequence of three independently derived cdna clones (p - , p , and p ) was obtained using both dideoxynucleotide chain-terminating and chemical sequencing methods. the nucleotide sequence of clone p - is presented in figure in the mrna sense. it is nucleotides in length and contains an untranslated region of nucleotides preceding the first aug codon at nucleotides - . primer extension nucleotide sequencing on mrna isolated from sv infected cells indicated that nucleotide is the ' terminal nucleotide of the mrna (data not shown). at the ' end after nucleotide in the different clones, there is a stretch of a residues of variable length. the open reading frame following the first aug codon (reading frame ) is capable of encoding a protein of amino acids. in the + reading frame there is an overlapping open reading frame of amino acids, as illustrated schematically in figure . although either reading frame could encode protein v (mr "~ , ), neither of the open reading frames is apparently large enough to encode p, a protein of m r ~- , (paterson et al., ) , assuming the electrophoretic mobility of p is not aberrant. examination of the predicted amino acid sequences indicates a region from residues - ( amino acids) in the reading frame that contains cysteine residues, whereas the + reading frame only contains cysteine at residue . 
conversely, the + reading frame contains methionine residues, whereas the reading frame contains only one methionine, in addition to the initiation methionine. to facilitate the elucidation of the coding strategy for p and v, we used monoclonal antibodies specific for the sv p and v proteins, which were generously made available by dr. rick randall (randall et al., ). the p and v monoclonal antibodies have been assigned to three groups, members of which recognize three nonoverlapping antigenic sites (r. randall, personal communication), here designated groups i, ii, and iii. the proteins immunoprecipitated from sv infected cell lysates by a representative member of each group are shown in figure (left section). it can be seen that while p is immunoprecipitated by antibodies from all three groups, protein v is only recognized by the monoclonal antibody from group i. recognition of both p and v by group i monoclonal antibodies indicates that they have amino acid sequences in common. the identity of the protein indicated by a star, migrating in a position between that of p and v, has not been investigated.
cysteine. right section: immunoprecipitation of sv -infected cell lysates using the group i monoclonal antibody. it can be seen in this immunoprecipitation that some np coprecipitates with p and v using the group i antibody, probably because l, np, p, and v exist in a complex (randall et al., ; our unpublished data). hn = hemagglutinin-neuraminidase protein, mr ~-- , ; np = nucleoprotein, mr = , ; p = phosphoprotein, mr = , ; m = matrix protein, mr = , ; v = protein v, mr = , ; * = polypeptide of unknown origin.
figure . antigenic region mapping of the p and v monoclonal antibodies on the p and v gene using t rna polymerase runoff transcripts, in vitro translation, and immunoprecipitation. the full-length p - dna insert cloned using xbai linkers (xx) and a hindiii to xbai fragment (hx) containing the ' two-thirds of the p - dna were placed under the control of the t promoter in the plasmid pgem- . the template dna was linearized downstream of the t promoter and the insert dna by endonuclease digestion using ecori (xx and hx) or, for truncated transcripts, by digestion of the full-length insert with avaii (xa) or clai (xc), which cut at sites within the protein coding region. the runoff transcripts were translated in vitro using a rabbit reticulocyte lysate and the [ s]methionine labeled proteins immunoprecipitated with the p specific monoclonal antibodies using the method described (erickson and blobel, ). (a) the immunoprecipitated in vitro translation products were analyzed by sds-page on % gels except for the lanes at the far right, where a % polyacrylamide gel was used (see text). u = uninfected cell lysate labeled with tran[ s]label; i = infected cell lysate labeled with tran[ s]label as a marker lane; iv = proteins translated from poly(a)-containing mrna isolated from sv infected cells; c = in vitro translation carried out in the absence of added mrna; gp i, gp ii, and gp iii indicate the monoclonal antibody used to immunoprecipitate the proteins observed in the region of the gel defined by the vertical lines either side of the antibody. lanes xx = rna transcribed from full-length cloned dna; lanes hx = rna transcribed from hindiii to xbai ' two-thirds fragment; lanes xa = rna transcribed from xbai to avaii dna fragment; lanes xc = rna transcribed from xbai to clai dna fragment. x = protein p, + = protein v, • = protein consistent in size with internal initiation at met. ( aa product in figure b ), ~ = protein consistent in size with internal initiation at met. ( aa product in figure b ), ~ = protein consistent in size with internal initiation at met. , ~ and ~ = truncated products with protein synthesis initiated at met. ( aa and aa products respectively in figure b ). (b) schematic representation of the data accumulated in figure a showing the protein products derived from the and + reading frames, their relative number of amino acids assuming that all protein products from the + reading frame are initiated from internal methionine residues, the restriction endonuclease sites used in the generation of the rnas, and a summary of the protein product reactivity with the gp i, ii, and iii monoclonal antibodies.
to examine the relative ability by which the p and v proteins could be selectively radiolabeled, sv -infected cells were labeled with either [ s]methionine or [ s]cysteine ( figure , middle section). in addition, cell lysates were immunoprecipitated with the group i antibody (figure , right section). as shown in figure , protein v was easily detected when labeled with either methionine or cysteine, whereas the p protein, although readily labeled with methionine, was poorly labeled with cysteine. these observations suggest that protein v possesses the cysteine-rich region encoded by the reading frame, while the p protein apparently does not. coding to define the regions of the nucleotide sequence encoding the p and v proteins, we used the approach of making synthetic mrna transcripts, translating the rnas in vitro, and immunoprecipitating the products. the complete coding regions of clone p - (nucleotides - ) and a fragment containing the ' two-thirds of the gene were subcloned into the transcription vector pgem- such that they were under the control of the t rna polymerase promoter. a series of mrna-sense runoff transcripts were prepared as described in experimental procedures, translated in vitro using a rabbit reticulocyte lysate, and immunoprecipitated using the group i, ii, and iii monoclonal antibodies. the results obtained from such an assay are shown in figure a and summarized in schematic form in figure b .
interestingly, although the p protein could be translated in vitro using poly(a)-containing mrnas isolated from sv infected cells ( figure a , indicated as x in lanes iv), we were unable to detect the synthesis of the p protein when in vitro runoff transcripts were used to program the cell-free translation system. however, protein v was translated from both synthetic rnas and mrna isolated from infected cells ( figure a , indicated as +). originally we thought it likely that frameshifting might be involved in the synthesis of p. however, because there is no detectable synthesis of the p protein when t runoff transcripts are used to program the rabbit reticulocyte lysate, whereas p is translated efficiently in vitro when poly(a) containing mrna from infected cells is used, it would seem unlikely that ribosomal frameshifting is involved in the generation of the p protein. in addition to protein v, other in vitro synthesized proteins were observed, particularly during translation of the ' two-thirds of the gene. the apparent size of the additional proteins is consistent with initiation of protein synthesis occurring at internal aug codons in the + open reading frame: methionines ( figure a , closed circle), ( figure a , --*), and ( figure a, ~). from the sizes of the different protein products derived from the various runoff transcripts, it was possible to map unambiguously the regions on the open reading frames that were recognized by the monoclonal antibodies. in this way, monoclonal antibodies from group i were found to recognize the n-terminal region of the open reading frame, group ii antibodies to recognize a region from the n terminus of the + open reading frame, and group iii antibodies to recognize a c-terminal region of the + open reading frame.
in this analysis it should be noted that the largest in vitro translation product recognized by group ii and iii antibodies ( figure a , indicated by a closed circle) was very similar in size to protein v ( amino acids versus amino acids) and had an almost identical electrophoretic mobility on sds-page. however, these polypeptides could be resolved when the samples were analyzed on a % polyacrylamide gel ( figure a , right four lanes). thus, these data indicate that because p and v are immunoprecipitated by the gp i monoclonal antibodies, and v cannot be immunoprecipitated by the gp ii and gp iii monoclonal antibodies, p and v are amino coterminal and protein v must be the product of the open reading frame. in addition, these data indicate that protein p is derived from amino acid residues encoded by a large part of the + reading frame. to obtain additional evidence that protein v is encoded by the open reading frame and that the stop codon (taa, nucleotides - ) that terminates translation in the reading frame is not an artifact of the cdna cloning procedure, we used an approach involving site-specific mutagenesis. if this stop codon is used to terminate translation of protein v, its elimination should prevent a normal-sized protein v from being synthesized, and a larger protein of amino acids should be found. nucleotides - (taa) in the cloned dna were changed to the triplet gcg (encoding alanine) as described in experimental procedures, the dna containing the mutation was transcribed in vitro, and the resulting synthetic rna translated in a rabbit reticulocyte lysate. as shown in figure a , lanes and , a protein (v*) larger than v ( figure a , lane ) that was recognized by the group i monoclonal antibody (figure b, lanes and ) was synthesized.
(lamb and lai, ) ; f, control untreated probe; , no added mrna; - , increasing mrna concentrations in the ratio : : : : . numbers on the left of each panel are nucleotide sizes. a schematic diagram of the probe protected products, their nucleotide sizes, and the position of the uniquely labeled end, which is indicated by a star, is shown beneath the autoradiograms.
as no evidence for frameshifting could be obtained, i.e., the inability to translate p in vitro from t transcripts of p - cloned dna, the most plausible mechanism by which p and v are encoded is that a second mrna species that is translated to yield the p protein exists. an insertion or deletion in such a mrna would be expected to occur in the region of overlap between the two reading frames shown in figure . to search for the existence of a second mrna population, nuclease s1 protection analysis was performed using poly(a)-containing mrna from sv infected cells and two dna fragments from clone p - that spanned the entire region of overlap, one uniquely ' end-labeled at nucleotide and the other uniquely ' end-labeled at nucleotide . with each fragment, two nuclease s1 protected labeled fragments were detected, the full-length fragment used as a probe corresponding to a colinear mrna transcript, and a smaller fragment present in %- % abundance (data not shown). these data suggest that a second mrna species exists that is derived from the p and v gene on the sv virion rna and that it has a nuclease s1 sensitive site approximately between nucleotides and . to define the location of this site more accurately, two shorter dna fragments were used for the nuclease s1 analysis. one dna fragment (nucleotides to ) was ' end-labeled at nucleotide and the other dna fragment (nucleotides to ) was ' end-labeled at nucleotide . in addition to protection of the probe fragments corresponding to a colinear mrna transcript, both probes protected smaller fragments found in %- % abundance ( figure ).
with both the ' end and ' end-labeled probes, smaller protected fragments ( and nucleotides respectively) increased in abundance with increasing concentrations of mrna ( figure ). the size of the protected fragments mapped the region containing the nuclease s1-sensitive region to between nucleotides and . a cdna library derived from sv -infected cv cell mrnas (paterson et al., ) was screened with an oligonucleotide probe to isolate cdna clones specific for the p and v mrnas. the nucleotide sequence of p and v specific cdna clones was obtained over the region of overlap between the and + reading frames. in addition to cdna clones having the same sequence as p - ( clones), a second population of cdna clones was isolated ( clones) that differed from p - in containing two additional bases between nucleotides - . the nucleotide sequence over the relevant region of a p mrna clone and a v mrna clone is shown in figure a . it can be seen that whereas the v cdna (p - ) has four g residues between nucleotides and , the p cdna has six g residues (sections p and v, figure a ). although the simplest explanation of the nuclease s1 mapping data was that the p mrna was a noncolinear transcript of the p and v gene containing an nucleotide interrupted region, our retrospective explanation for the data is that nuclease s1 recognized a two nucleotide mismatch. the ' break point maps precisely to the g region in the v cdna clone, while the ' site did not. in addition to the inherent inaccuracies in measuring the precise size of dna fragments, it is noted that the region ' to the g residues at nucleotides - is at-rich and it may have been sensitive to digestion by the nuclease s1 (hansen et al., ). the two extra g residues cause a switch from the reading frame to the + reading frame, and the predicted amino acid sequences are shown in figure b .
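the effect of two extra g residues on the reading frame can be illustrated with a toy example. the sketch below is not the sv sequence: `TOY_V` is a hypothetical 26-base string invented for illustration, the function names are assumptions, and the codon table is the standard genetic code. it shows only the general point made above, that adding two bases to a run of four g residues moves translation into a different frame and past the original stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def add_g_residues(seq, position, count=2):
    """Insert `count` G residues at `position` (hypothetical helper)."""
    return seq[:position] + "G" * count + seq[position:]


# Hypothetical toy message with a run of four G residues at indices 6-9.
TOY_V = "ATGAAAGGGGATTAGCTCCGATTTGA"
TOY_P = add_g_residues(TOY_V, 10)   # six G residues, as in the p cdna

if __name__ == "__main__":
    print("four g:", translate(TOY_V))
    print("six g: ", translate(TOY_P))
```

in this toy case the four-g message stops shortly after the g run, while the six-g message reads through in the other frame and yields a longer product that shares the same amino terminus, which mirrors the relationship between v and p described in the text.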
these data indicate that the p mrna has the capacity to encode a polypeptide of amino acids initiating at the aug codon at nucleotides - and terminating at the tga codon at nucleotides - . to determine whether the genomic virion rna from which the sv mrnas are transcribed contains four or six c residues complementary to the four or six g residues found in the two mrnas, the sequence of the virion rna (vrna) was obtained as described in experimental procedures. as shown in figure (section vrna), only a single cdna sequence could be detected, and it contained four g residues complementary to four c residues in the vrna. to provide further evidence that the only difference between a p mrna and a v mrna is the presence of two nontemplated g residues, we investigated whether the p protein could be translated from a synthetic rna derived from a p cdna. to facilitate the genetic manipulation, an internal large restriction fragment spanning the region of interest in the p - pgem- vector was replaced with the comparable fragment from the p cdna clone. in addition to using a "natural" p cdna, we also changed the v cdna clone p - by site-specific mutagenesis to insert two additional g residues into the four g residues at nucleotides - . synthetic rna was transcribed from both the "natural" and the "synthetic" p cdna clones using t rna polymerase, and translated in vitro in rabbit reticulocyte lysates. as shown in figure , both the "natural" and "synthetic" p cdna clones yielded a p protein with an electrophoretic mobility identical to that of the p protein synthesized in infected cells and to the p protein translated in vitro from sv -infected cell mrna. all the p proteins could be immunoprecipitated with the group i, ii, and iii monoclonal antibodies (data not shown). the protein products found in figure , lanes - , indicated by an arrow and a dot, are thought to be internal initiation products from methionine residues and respectively. protein v (figure , lanes and ) is of a slightly different electrophoretic mobility from the smaller internal initiation product, and only protein v and not the internal initiation products are precipitated by the group i monoclonal antibodies (data not shown).
the nucleotide sequences of a p cdna clone and a v cdna clone in the region of nucleotides - are shown to illustrate the six g or four g residues in the p cdna and v cdna respectively. sequencing was done by the chemical cleavage method (maxam and gilbert, ). the sequence of the sv genomic template rna (vrna) is shown in the message sense as determined by dideoxy primer extension sequencing using reverse transcriptase (air, ). (b) the predicted amino acid sequence of the p and v proteins in the region of the six g or four g residues.
figure . expression of the p protein from in vitro synthesized rna. a p cdna clone was reconstructed in the p - pgem vector by replacing an internal large psti-dna fragment (nucleotides - ) with that from a p cdna containing the two nontemplated g residues between nucleotides - . the p - dna was also changed by site-specific mutagenesis to insert two additional g residues into the four g residues at nucleotides - , and the mutated dna subcloned into the pgem- vector. rna was transcribed with t rna polymerase from both the "natural" and the "synthetic" p cdna clones and translated in vitro using rabbit reticulocyte lysates. lane = sv infected cv cell lysate as a marker. in vitro translated rnas were as follows: lane = no rna control; lane = poly(a)-containing mrnas from sv -infected cv cells; lanes and = "synthetic" p rna from site-specifically mutated template dnas; lane = "natural" p rna; lane = v rna from clone p - . dashes = proteins p and v. arrowhead and dot indicate protein products thought to originate from initiation at internal methionine residues and respectively.
the finding of two extra g residues at a precise location in the p mrna suggests that a signal would be needed to specify their addition, such as a region of strong secondary structure in the vrna or mrna. with the aid of the computer program fold (intelligenetics inc., palo alto, ca), the most stable secondary structure that can be predicted for nucleotides - of the p and v gene is one with an energy of ΔG = - . kcal/mol and has the four templated c residues (nucleotides - ) immediately after a base-paired stem region (figure ). the klenow fragment of e. coli dna polymerase often yields artifactual sequencing bands at a run of several g residues when directly sequencing double-stranded dna using the dideoxy chain-terminating method. as shown in figure the sequence of nucleotides - in the v clone ( g residues) is easier to interpret than nucleotides - in the p clone ( g residues). these artifactual bands can be eliminated when the sequencing is performed with a modified form of t dna polymerase (sequenase tm) in conjunction with ditp instead of dgtp, unless there is a strong secondary structure in the template strand and then the artifacts are exacerbated (tabor, ). when this was done for the p clone dna (figure ) or v clone dna (data not shown), the t dna polymerase nearly stopped its processive synthesis at nucleotides - , which suggests that there is a native secondary structure in this region.

figure . left: the sv virion rna sequence from nucleotides - of the p/v gene was examined for regions of strong secondary structure with the aid of the computer program fold (intelligenetics inc., palo alto, ca). the stem-loop structure shown has an energy of ΔG = - . kcal/mol. the four c residues at nucleotides - are boxed. the arrow denotes the direction of mrna transcription. right: nucleotide sequences obtained by the dideoxynucleotide chain-terminating method using the klenow fragment of e. coli dna polymerase or a modified form of t dna polymerase (sequenase tm) on a p clone cdna template (klenow and t ) or a v clone cdna template (klenow). in the sequenase reactions (t ) ditp was used in place of the usual dgtp. the region of the four or six g residues between nucleotides - is indicated by a star.

we have obtained the nucleotide sequence of the paramyxovirus sv p and v gene and have determined the strategy by which both proteins are expressed by a single gene. the p and v proteins are translated from two independent mrnas that are synthesized in sv infected cells and are found to differ by the presence in the p mrna of two additional nucleotides. a comparison of the nucleotide sequences of the p and v cdnas and the sv genomic vrna showed that the two additional g residues present in the p mrna are not templated by the sv virion rna (figure ). it could be argued that the vrna sequencing might not detect a minor vrna species of less than % abundance. however, there is no biological evidence for the involvement of more than one virus genome in the sv infectious cycle. using a combination of in vitro translation of t runoff transcripts, immunoprecipitation of the in vitro synthesized proteins using monoclonal antibodies, oligonucleotide-directed mutagenesis, and metabolic labeling of sv infected cell proteins using specific amino acids ([ s]methionine or [ s]cysteine), we have shown that p and v are amino coterminal proteins that have different c-termini. the results presented here confirm earlier observations that the p and v proteins of sv have tryptic peptides in common (paterson et al., ). thus sv differs from many paramyxoviruses and morbilliviruses that use functionally bicistronic mrnas to synthesize the p protein, and a second protein known as c from overlapping reading frames (giorgi et al., ; galinski et al., ; bellini et al., ; barrett et al., ).
early peptide mapping data obtained for the p and "c-like" proteins of two other paramyxoviruses, newcastle disease virus (ndv) and mumps virus, suggested that both proteins are encoded by the same reading frame (collins et al., ; herrler and compans, ). recently the ndv and mumps virus p genes have been sequenced and found to contain one open reading frame (sato et al., ; mcginnes et al., ; takeuchi et al., ) from which it has been suggested that both the p and "c-like" proteins are derived, with the "c-like" protein arising from initiation at an internal aug codon (mcginnes et al., ). sv is therefore seemingly unique among paramyxoviruses in having two mrnas transcribed from the p gene. the rna-dependent rna polymerase of negative strand rna viruses functions as part of a transcriptase complex composed of the template vrna in tight association with the nucleoprotein (np), and the p and l proteins, which are thought to be responsible for the polymerase activity (buetti and choppin, ; hamaguchi et al., ). transcription of the virus-specific mrnas by the transcriptase complex is believed to occur entirely in the cytoplasm of infected cells and is independent of host-cell mrna synthesis. the mechanism responsible for the addition of the untemplated g residues present in the p mrna is unknown, nor is it known whether it is a cotranscriptional or posttranscriptional process. however, the virus-encoded rna polymerase of negative strand rna viruses is also responsible for the polyadenylation of virus-specific mrnas, a process that is thought to occur by a "slippage" or "stuttering" mechanism involving the reiterative copying by the polymerase of a stretch of u residues located at the end of each gene. as the nontemplated g residues are added to the p transcript at a position where the template vrna has a run of four c residues, it is possible that the sv polymerase "stutters" while copying this region of the genome and thus adds the nontemplated nucleotides.
figure . the published nucleotide sequences of the p genes of several paramyxoviruses and the morbillivirus, measles virus, were translated in all three reading frames. in each case of a reading frame overlapping that for the p protein a cysteine-rich region was identified and is listed in the single letter amino acid code. only the region of significant conservation of sequence is shown with its corresponding nucleotide number; the n-terminal region of the open reading frame is omitted. the star at the end of the amino acid sequence represents a translation termination codon. the boxes identify positions where three or more amino acids have been conserved in all six viruses, a dash indicates that a gap was placed in the alignment, and the star above the sequences shows the seven conserved cysteine residues. sources for the p gene nucleotide sequences are as follows: sv , this publication; mumps virus, takeuchi et al., ; ndv, sato et al., ; sendai virus, shioda et al., and giorgi et al., ; parainfluenza virus (pi- ), galinski et al., , and luk et al., ; measles virus, bellini et al., .

it is interesting that immediately upstream of the four c residues on the sv genomic rna is the sequence '-aaaauucu- ' (figure ), which resembles the putative polyadenylation signal found at the end of sv genes and in fact is identical to the sequence at the end of the sv hn gene (hiebert et al., ), making this an attractive model for the mechanism by which the nontemplated gs are added. however, it cannot be ruled out that the nontemplated g residues in the p mrna are added as a consequence of some form of rna-editing analogous to that found in mitochondrial transcripts in trypanosomes (benne et al., ; feagin et al., feagin et al., , shaw et al., ) or the mammalian apolipoprotein-b mrna (powell et al., ; chen et al., ). while screening the sv cdna library for a p cdna, clones were sequenced across the region described above and only clones with either four or six g residues were found.
this would suggest that whatever the mechanism involved in the addition of the nontemplated g residues, it is extremely specific. with the aid of the computer algorithm fold, a region of secondary structure was predicted for this part of the template rna and it is therefore possible that this could play a role in either the mechanism itself or its regulation (figure ). an examination of the predicted amino acid sequences of the p and v proteins reveals several interesting features. as mentioned above, the p and v proteins are amino coterminal and have their first residues in common (figure ). an unusual feature of the shared region is the large number of proline residues; prolines in amino acids (figure ). however, the most striking characteristic observed in either protein is the c-terminal portion of protein v, which consists of a cysteine-rich region bearing a remarkable resemblance to cysteine-rich regions found in the adenovirus e a protein (for review see moran and matthews, ), the yeast transcription factor gal (johnston and dover, ), and proteins belonging to the steroid hormone receptor superfamily (for review see evans, ). in these proteins and others possessing a similar domain it is thought that the binding of metal ions by the cysteine-rich region plays an important role in either the binding of nucleic acid by the protein, mediating protein-protein interactions, or stabilizing oligomeric forms of a protein, as in the tat protein of human immunodeficiency virus (frankel et al., ). because of the significance of the cysteine-rich regions in other proteins, it was of interest to determine whether the sequence identified here in protein v had been conserved among other paramyxovirus p genes.
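a search of this kind, translating a gene in all three forward reading frames and looking for a cysteine-dense stretch, can be sketched in python as follows. this is a minimal illustration under stated assumptions: the toy sequence, the function names, and the seven-residue window are hypothetical choices, not the sequences or the procedure used in this work.

```python
# Standard genetic code, built from the canonical T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_frame(seq, offset):
    """Translate every complete codon starting at offset; stops are kept as '*'."""
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def max_cysteines_in_window(protein, window=7):
    """Largest number of cysteines found in any stretch of `window` residues."""
    starts = range(max(1, len(protein) - window + 1))
    return max(protein[i:i + window].count("C") for i in starts)


# Hypothetical toy sequence (illustration only, not a viral gene).
toy_gene = "ATGCTGTCATTGCCCATGTTGC"

frames = {offset: translate_frame(toy_gene, offset) for offset in range(3)}
for offset, protein in frames.items():
    print(offset, protein, max_cysteines_in_window(protein))

# Frame whose best window holds the most cysteines.
richest_frame = max(frames, key=lambda f: max_cysteines_in_window(frames[f]))
print(richest_frame)  # 1
```

in the toy example the cysteine-rich stretch lies in a frame other than the one that begins with atg, which is the kind of overlapping-frame arrangement the comparison of p genes is meant to detect. a real comparison would also need sequence alignment across viruses, which this sketch does not attempt.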
consequently, we compared the cysteine-rich region from protein v with the protein sequences predicted in all three reading frames from the nucleotide sequence of the p genes from mumps virus, ndv, sendai virus, parainfluenza virus , and measles virus (takeuchi et al., ; sato et al., ; mcginnes et al., ; shioda et al., ; giorgi et al., ; galinski et al., ; luk et al., ; bellini et al., ). as shown in figure , a highly conserved cysteine-rich region was identified in an open reading frame in all the different paramyxovirus p gene sequences examined. interestingly, the cysteine-rich region is more conserved between the different paramyxoviruses than is the amino acid sequence of the p protein encoded by the same nucleotides but translated in another reading frame (data not shown). as the p protein is part of the paramyxovirus transcriptase complex, the conservation of the cysteine-rich region must have important biological significance. it will be important to determine whether a protein containing this cysteine-rich region is synthesized in cells infected with other paramyxoviruses in addition to the already identified p or p and c proteins derived from the "p" gene. the function of protein v has yet to be elucidated. however, as v is found associated with purified sv virions (our unpublished data) and as group i antibodies precipitate l, np, p, and v in a complex (randall et al., ; our unpublished data), it remains a possibility that protein v may play a role in transcription and/or replication of the virus genome in infected cells.

monolayer cultures of a variant of the mdbk line of bovine kidney cells and the tc clone of cv- cells were grown in dulbecco's modified eagle's medium (dmem) supplemented with % fetal calf serum. stock virus was grown in mdbk cells infected with the w strain of sv (choppin, ) as described previously (peluso et al., ).
for all biochemical experiments, cv- cells were used and infected as described previously (paterson et al., ), except that for metabolic labeling of infected cell proteins, monolayers were incubated in methionine- and cysteine-free dmem and proteins labeled using either tran[ s]label (icn radiochemicals, irvine, ca), [ s]cysteine or [ s]methionine (amersham corp., arlington heights, il). messenger rnas were isolated as described previously (paterson et al., ). cdna synthesis, isolation of sv specific clones, and the identification of cdna encoding the various viral gene products has been described (paterson et al., ). three clones, p , p , and p - were sequenced over their entire length both by the chemical cleavage method (maxam and gilbert, ) and after subcloning into the psti site of the replicative form of bacteriophage m mp , by the dideoxy chain-termination method (sanger et al., ). dideoxy primer extension sequencing on purified sv genomic rna and poly(a)-containing mrna was performed using avian myeloblastosis virus reverse transcriptase (molecular genetic resources, tampa, fl) and p gene specific primers as described previously (air, ). direct sequencing of double-stranded plasmid dna was carried out by the dideoxy chain-termination method using the klenow fragment of e. coli dna polymerase (bethesda research laboratories, gaithersburg, md) as described by sanger et al. ( ) or a modified form of t polymerase (sequenase tm, united states biochemical corp., cleveland, oh) according to the manufacturer's instructions. restriction endonucleases, bacterial alkaline phosphatase, and t dna ligase were obtained from bethesda research laboratories, and t polynucleotide kinase from pharmacia fine chemicals (piscataway, nj). oligonucleotides were synthesized by the northwestern university biotechnology facility on an applied biosystems (foster city, ca) model b dna synthesizer and were purified as described (paterson and lamb, ).
the p - cdna was excised from pbr by hhai and mstii digestion, thereby eliminating the g/c tails introduced during cdna cloning; xbai linkers were added and the cdna subcloned into the xbai site of pgem- (promega biotec, madison, wi). deletion of the ' end of the gene was performed by digesting pgem- containing the p - cdna with xbai and hindiii, isolating the ' portion of the gene, adding xbai linkers, and subcloning back into pgem- . to construct both the protein v stop codon elimination mutant and the frameshift mutant, p - cdna was subcloned into the xbai site of the replicative form of bacteriophage m mp , and oligonucleotide-directed mutagenesis was carried out according to the procedure of zoller and smith ( ) using mutagenic oligonucleotides consisting of nucleotides either side of the site of the mutation. dna containing the desired mutation was subcloned into pgem- and the mutation verified by direct plasmid dna sequencing using the dideoxy chain termination method (sanger et al., ) and a p specific oligonucleotide primer. for transcription of the entire coding region, plasmid dnas were linearized downstream of the t promoter and the p or v insert, using ecori. for the synthesis of truncated forms of the mrna, the dna template was linearized using either avaii or clai, which recognize sites within the coding region of the cdna. in vitro synthesis of mrna was carried out as described previously (hull et al., ) and μg of rna was used to program a rabbit reticulocyte lysate as described below. t dna-dependent rna polymerase was obtained from bethesda research laboratories, rnasin tm and rq dnase tm from promega biotec, and m7G(5')ppp(5')G (sodium salt) was from pharmacia fine chemicals.

in vitro translation of mrnas

mrnas were translated in vitro using a micrococcal nuclease-treated rabbit reticulocyte lysate (promega biotec) according to the manufacturer's instructions. the in vitro-synthesized products were labeled using [ s]methionine.
one-fifth volume of each translation reaction was immunoprecipitated as described below. immunoprecipitation was performed as previously described (lamb et al., ; erickson and blobel, ) using monoclonal antibodies to the p and v proteins kindly provided by dr. rick randall (randall et al., ). samples were prepared for electrophoresis and analyzed by sds-page on % polyacrylamide gels as previously described (lamb et al., ). poly(a)-containing mrnas from sv infected cv- cells were isolated as described (paterson et al., ). to determine whether more than one mrna is transcribed from the p gene, nuclease s1 analysis was performed as previously described (lamb and lai, ). the labeled dna fragments used as probes were: a hhai-avaii fragment and a bamhi-psti dna fragment (nucleotides - and - , respectively) ' uniquely labeled at nucleotides and , and a hhai-avaii fragment and a bamhi-hphi fragment (nucleotides - and - , respectively) ' uniquely labeled at nucleotides and . nuclease s1 was obtained from boehringer mannheim biochemicals, indianapolis, in.

nucleotide sequence coding for the "signal peptide" and n terminus of the hemagglutinin from an asian (h n ) strain of influenza virus binding of mammalian ribosomes to ms phage rna reveals an overlapping gene encoding a lysis function nucleotide sequence of the entire protein coding region of canine distemper virus polymerase-associated (p) protein mrna measles virus p gene codes for two proteins major transcript of the frameshift coxii gene from trypanosome mitochondria contains four nucleotides that are not encoded in the dna vaccinia virus produces late mrnas by discontinuous synthesis the . 
kb elb rnrna of human ad and ad codes for two tumor antigens starting at different aug triplets completion of the sequence of the genome of the coronavirus avian infectious bronchitis virus alternative splicing: a ubiquitous mechanism for the generation of multiple protein isoforms from single genes influenza b virus genome: sequences and structural organization of rna segment and the mrnas coding for the nsi and ns proteins an efficient ribosome frarneshifting signal in the polymerase-encoding region of the coronavirus ibv the transcriptase complex of the paramyxovirus sv apolipoprotein b- is the product of a messenger rna with an organ-specific in-frame stop codon multiplication of a myxovirus (sv ) with minimal cytopathic effects and without interference coding assignments of the five smaller m rnas of newcastle disease virus ribosomal initiation at alternate augs on the sendal virus pic mrna early events in the biosynthesis of the lysosomal enzyme cathepsin the steroid and thyroid hormone receptor superfamily extensive editing of the cytochrome c oxidase iii transcript in trypanosoma brucei developmentally regulated addition of nucleotides within the apocytochrome b transcripts in trypanosoma brucei tat protein from human immunodeficiency virus forms a metal-linked dimer molecular cloning and sequence analysis of the human parainfluenza virus mrna encoding the p and c proteins sendal virus contains overlapping genes expressed from a single mrna translational modulation in vitro of a eukaryotic viral mrna encoding overlapping genes: ribosome scanning and potential roles of conformational changes in the p/c mrna of sendal virus transcriptive complex of newcastle disease virus. i. 
both l and p proteins are required to constitute an active complex t antigen repression of sv early transcription from two promoters synthesis of mumps virus polypeptides in infected vero cells hemagglutininneuraminidase protein of the paramyxovirus simian virus ; nucleotide sequence of the mrna predicts an n-terminal membrane anchor integration of a small integral membrane protein, m , of influenza virus into the endoplasmic reticulum: analysis of the internal signal-anchor domain of a protein with an ectoplasmic nh terminus expression of the rous sarcoma virus pol gene by ribosomal frameshifting two efficient ribosomal frameshifting events are required for synthesis of mouse mammary tumour virus gag-related polyproteins mutations that inactivate a yeast transcriptional regulatory protein cluster in an evolutionary conserved dna binding domain evidence for in vivo trans splicing of pre-mrnas in tobacco chloroplasts trans splicing of mrna precursors in vitro a trans-spliced leader sequence on actin mrna in c. 
elegans sequence of interrupted and uninterrupted mrnas and cloned dna coding for the two overlapping nonstructural proteins of influenza virus spliced and unspliced messenger rnas synthesized from cloned influenza virus m dna in an sv vector: expression of the influenza virus membrane protein (m ) evidence for a ninth influenza viral polypeptide sequences of mrnas derived from genome rna segment of influenza virus: colinear and interrupted mrnas code for overlapping proteins messenger rna encoding the phosphoprotein (p) gene of human parainfluenza virus is bicistronic sequencing end-labeled dna with base-specific chemical cleavages the p protein and the non-structural k and k proteins of newcastle disease virus are derived from the same open reading frame multiple functional domains in the adenovirus e a gene identification of a novel y branch structure as an intermediate in trypanosome mrna processing: evidence for trans splicing splicing of messenger rna precursors ability of the hydrophobic fusion-related external domain of a paramyxovirus f protein to act as a membrane anchor analysis and gene assignment of mrnas of a paramyxovirus, simian virus polypeptide synthesis in simian virus infected cells a novel form of tissue-specific rna processing produces apolipoprotein-b in intestine isolation and characterization of monoclonal antibodies to simian virus and their use in revealing antigenic differences between human, canine and simian isolates dna sequencing with chain-terminating inhibitors molecular cloning and nucleotide sequence of p, m and f genes of newcastle disease virus avirulent strain d vaccinia virus late transcripts generated in vitro have a poly(a) head discontinuous transcription or rna processing of vaccinia virus late messengers results in a ' poly(a) leader a previously unrecognized influenza b virus glycoprotein from a bicistronic mrna that also encodes the viral neuraminidase editing of kinetoplastid mitochondrial mrnas by uridine addition 
and deletion generates conserved amino acid sequences and aug initiation codons sequence of , nucleotides from the ' end of sendai virus genome rna and the predicted amino acid sequences of viral np, p and c protein trans splicing of mrna precursors sequence analysis of the p and c protein genes of human parainfluenza virus type : patterns of amino acid sequence homology among paramyxovirus proteins sequence coding for the alphavirus nonstructural proteins is interrupted by an opal termination codon evidence for trans splicing in trypanosomes sequenasetm: step-by-step protocols for dna sequencing with sequenase tm. united states biochemical corporation molecular cloning and sequence analysis of mumps virus gene encoding the p protein: mumps virus p gene is monocistronic murine leukemia virus protease is encoded by the gag-pol gene and is synthesized through suppression of an amber termination codon oligonucleotide-directed mutagenesis using m -derived vectors: an efficient and general procedure for the production of point mutations in any fragment we thank margaret a. shaughnessy for excellent technical assistance and rick e. randall of st. andrews university, st. andrews, scotland for kindly providing the monoclonal antibodies to p and v. this research was supported by national institutes of health research grants ai- and ai- . during the course of this work, r. a. l was an established investigator of the american heart association.the costs of publication of this article were defrayed in part by the payment of page charges. this article must therefore be hereby marked "advertisement" in accordance with u.s.c. section solely to indicate this fact. key: cord- -w j n mn authors: gibson, erin m.; bennett, f. 
chris; gillespie, shawn m.; güler, ali deniz; gutmann, david h.; halpern, casey h.; kucenas, sarah c.; kushida, clete a.; lemieux, mackenzie; liddelow, shane; macauley, shannon l.; li, qingyun; quinn, matthew a.; roberts, laura weiss; saligrama, naresha; taylor, kathryn r.; venkatesh, humsa s.; yalçın, belgin; zuchero, j. bradley title: how support of early career researchers can reset science in the post-covid world date: - - journal: cell doi: . /j.cell. . . sha: doc_id: cord_uid: w j n mn the covid crisis has magnified the issues plaguing academic science, but it has also provided the scientific establishment with an unprecedented opportunity to reset. shoring up the foundation of academic science will require a concerted effort between funding agencies, universities, and the public to rethink how we support scientists, with a special emphasis on early career researchers. the novel coronavirus, sars-cov- , has placed science at the center of every conversation, amplifying the importance of scientific research to economic stability, healthcare infrastructure, and disaster preparedness. in academic science, recovery from the immediate covid crisis will require departments, universities, private foundations, federal agencies, and the public to work together collaboratively and comprehensively. the goal of recovery should not be to return to ''normal'' but, rather, to reset. 
here, we argue that recovery provides us with the opportunity to address three systemic issues that plague the conduct of research in the twenty-first century, with an emphasis on supporting early career researchers who are the most vulnerable. the strategies needed to ensure stability and success of early career scientists post-covid can be adapted to chip away at the systemic issues affecting the scientific establishment. science has changed immensely over the past years. more has become better: more experiments per paper, more papers per year, more expectations and requirements for grants and tenure, more opinions from reviewers. the scientific community rewards quantity over quality. most scientists can easily name a seminal paper; many were published long before the s, and many had, at most, a handful of figures. today, papers are often published with a plethora of supplemental figures that will largely go unread and underappreciated. the desire for ''more'' results in delays in publication, the awarding of grants, and career advancement for early career researchers; it also stymies creativity and encourages the proliferation of low-quality journals. this crisis is exacerbating the well-documented discrimination afflicting academic science (monroe et al., ). women, parents, and individuals who identify as racial or ethnic minorities leave science, technology, engineering, and math (stem) fields as early career researchers at an excessively high rate in the best of times and will undoubtedly suffer more from the present lab closures. the responsibilities of family life disproportionately impact women. a parent who is trying to homeschool their children, manage household duties, and work will have little time left to further their own scientific agenda. faculty with family responsibilities, women specifically, must be supported. 
the covid crisis will only highlight the rampant diversity issues plaguing the scientific establishment, many of which begin with the loss of women and minorities during early career stages and may lead to further disenfranchisement of the disadvantaged (malisch et al., ). the current model of academic science is heavily reliant upon federal funding, even though agencies such as the national institutes of health (nih) were not built to sustain such expectations. the federal government's funding capacity has significantly diminished as the cost of science has radically increased. the defense budget was $ billion while the nih budget was $ billion. the covid crisis has clearly amplified that the greatest risk to american life is not war, but disease. funding is needed at all levels; however, early career researchers should be particularly supported as the consistent trend of shifting funding away from younger researchers has no end in sight (daniels, ).

ensuring a durable future for academic science post-covid

recovery from the immediate covid crisis necessitates a multi-pronged approach including fiscal and non-fiscal strategies to help graduate students, postdoctoral fellows, and early and later career faculty. this pandemic has particularly impacted senior postdoctoral fellows seeking academic faculty positions and early career faculty seeking to establish themselves as independent investigators. special consideration for these early career researchers is key to overcoming the crisis and strengthening the foundations of academic science. our action plan proposed below is not an exhaustive list of all possible recommendations for supporting scientists, nor is it inclusive of every academic scientist's specific circumstance. not all of our suggestions are applicable at every university or institution, as each will have its own unique set of challenges. 
we acknowledge that monetary support will be limited due to the deteriorating economic situation and drastic loss of revenue from clinical operations for most medical campuses. while the immediate goal of the recommendations is to provide support for scientists from funding agencies, universities, departments, and the public following covid , this support also provides solutions to the three major challenges. solutions to these systemic issues (i.e., excess does not equal excellence, diversification leads to discovery, or funding agencies) are interwoven across the structure of academic science, allowing us to comprehensively tackle these issues at all levels. plans for recovery from the covid pandemic must ensure as much continuity as possible in research while improving upon existing infrastructures in order to provide a more inclusive, cohesive, and efficient future for the next generation of independent scientists. the resiliency of research is dependent upon the support of funding agencies. like the broader scientific community, funding agencies will need to adapt their strategies and structure to fit the changing times. simplification of grant application processes, including fewer supplemental documentations and more implementation of letter-of-intent formats prior to full proposals, could increase efficiency for both the funding agency and researcher. lab closures will undoubtedly create a void in the preliminary data that are necessary to obtain most awards. early career researchers who had less time to acquire these data prior to lab shutdowns will be the most affected. funding agencies could introduce policies and programs targeted at early investigators that require fewer preliminary data (similar to the national institute of mental health [nimh] brain research through advancing innovative neurotechnologies [brain] initiative r or the dp ), reducing the excess in data required for most grants. 
grants submitted by graduate students, postdoctoral fellows, and early career faculty who do not have sufficient preliminary data per current standards should be given special consideration. currently, many of the new funding opportunities by funding agencies, such as the nih, are geared toward supplements to existing grants or covid-related research. as there will likely be restrictions or reductions to new funding opportunities in the coming years due to fiscal shortages, faculty with existing grants might help early career faculty by including them in their supplemental applications. including early career faculty will also foster collaboration and resource sharing, both of which will be vital during this time (excess does not equal excellence and rethink the fundamentals of funding). extension of deadlines, timelines, and funding numerous funding agencies have already implemented deadline extensions, but deadlines must be further extended for the duration of lab disruptions. it is also imperative that funding agencies extend early investigator status for grant applications and implement no-cost extensions for currently held grants. additional bridge funding programs may be especially important for faculty who are between projects or aiming to switch areas of study following the covid crisis. extensions for tenure: faculty most universities have added one-year extensions to the tenure tracks of early career researchers, but sliding extensions may better support the success of vulnerable academics. many early career investigators may request extensions during lab closures, but they should also have the ability to go up for tenure early if the opportunity arises. ensuring the promotion and advancement of marginalized groups such as women, who make up < % of stem faculty, is even more imperative post-covid . 
covid -initiated resetting of expectations for the publishing, teaching, mentorship, and service requirements for tenure may not only help minimize the excesses innate to the current tenure structure, but also may help foster environments that can acknowledge implicit biases and keep marginalized groups from disproportionately leaving stem fields. tenure expectations for the next generation of early career researchers may need to account for increased variability between faculty that is exacerbated by the covid crisis and allow for more flexibility in the process. this crisis has amplified how the antiquated one-size-fits-all guidelines only encourage the disenfranchisement of women and racial or ethnic minorities (diversification leads to discovery and excess does not equal excellence). the current crisis will have a dramatic trickle-down effect, and numerous hiring freezes are already in place. mechanisms to allow postdoctoral fellows or graduate students in their final year to continue in their current positions should be enacted, if necessary, and if labs or universities are able to provide fiscal support. current closures are also disrupting the ability of many graduate students to complete their rotations. universities could extend the timeline for rotations and potentially cover graduate students' stipends. trainees, particularly postdoctoral fellows, may have limited ability to extend their period of training due to visa restrictions. universities should coordinate with federal agencies to pursue strategies aimed at extending visa expiration timelines, allowing trainees to complete work that was delayed due to the covid crisis. these mechanisms are needed to ensure that we do not lose an entire generation of scientists following the coronavirus crisis.
cell , june , . please cite this article in press as: gibson et al., how support of early career researchers can reset science in the post-covid world, cell ( ), https://doi.org/ . /j.cell. . .
curtailment of applicable hiring freezes many universities have implemented hiring freezes for faculty and staff for the remainder of the year or beyond. universities should not limit the ability of early career faculty to hire postdoctoral fellows and staff, however. restricting early career faculty from hiring technical assistance and lab managers will stymie their ability to generate preliminary data, which will consequently limit grant and paper submissions and delay career advancement. even a short hiring freeze could have devastating effects on the ability of early career faculty labs to succeed. allowing early career faculty to continue hiring will also help to ease the bottleneck of graduate students looking for postdoctoral or research scientist positions within the next few years. hiring freezes at any level will disproportionately affect early career individuals and oversaturate the market with qualified candidates. permitting ongoing interviews for faculty positions, even if the official hire date is postponed, could alleviate stress on the postdoc population and expedite the hiring process when hiring freezes are lifted. the faculty search process serves as a valuable feedback mechanism for postdoctoral fellows that sometimes has an impact on career path. halting all hiring and all faculty searches may drive talented postdocs, especially women and members of ethnic or racial minorities, out of academia (diversification leads to discovery). although universities may curtail spending from institutional funds, special consideration should be given to new and early career faculty. early career faculty must retain access to their startup packages during this time. institutional funds should be released for salary support for early career faculty and for all staff, students, and trainees in their labs. if startup funds are set to expire, the expiration date should be extended. 
new faculty should be given the funds needed to establish their labs once research activities resume (rethink the fundamentals of funding). the economic toll caused by shelter-in-place will undoubtedly be significant, including the reduction in funding through endowments and charitable giving. we fully acknowledge that monetary supplementation may be difficult for universities following the covid crisis. any combination of fiscal supplementation with other mechanisms of non-fiscal support should be considered. universities might implement new or expanded fellowships for postdocs and graduate students, add to existing startup packages for faculty, assist with the purchasing of equipment or expand shared equipment funding, or create subsidies or joint ventures with federal programs similar to unemployment or re-deployment programs. universities might supplement pay or provide reimbursement for staff, postdoc, and graduate student salaries for the duration of academic closures. many universities have per diem policies that differ based on funding source, with reduced per diem costs associated with federal grants. early career faculty without federal funding have per diem costs double that of other labs. universities could implement mechanisms to reduce or supplement animal costs that will be accrued during lab closures and when labs reopen and expand their animal colonies (rethink the fundamentals of funding). onsite daycare facilities support postdoctoral fellows and faculty with young children. these family care centers are critical to narrowing the gap and slowing the attrition of women and parents in science. universities could work with early childhood education programs to establish or expand daycare and preschool programs, providing free or subsidized childcare for faculty and teaching opportunities for early education majors. universities might also reach out to current or retired teachers seeking supplemental income (diversification leads to discovery).
universities should encourage and enable graduate students and postdocs to use this time to learn new computational skills in anticipation of a reduced ability to work at the bench. many university-offered computational courses were over-committed during lab closures due to a significant increase in enrollment requests. universities should make a concerted effort to increase bandwidth and capacity for computational courses. many free online resources are also available to supplement the acquisition of coding skills. administrative and teaching expectations should be reevaluated during university closures. departments should reassess administrative and teaching loads, especially for early career faculty whose promotions are contingent upon teaching requirements. this is especially important, since female scientists generally have increased teaching loads and more advisory expectations than male scientists (gibney, ), which could disproportionately delay scientific recovery of female scientists from covid closures (diversification leads to discovery). the covid crisis and subsequent lab closures will take an incredible toll on mental health. early career faculty who have yet to establish themselves or their research independently and postdocs whose future job prospects are now significantly limited will be especially impacted by prolonged lab shutdowns. department chairs, division leaders, and mentors should do their best to check in with early career faculty and postdocs during this time. mentoring will be key both during and after this crisis. establishing scheduled virtual meetings during social distancing and in-person meetings after labs are reopened could help alleviate some mental stress. university mental health resources are also available for anyone who needs support.
as students generally contact female faculty about mental health issues more frequently than male faculty (bennett, ) , equal encouragement of mentorship from all faculty is essential to not overburdening women faculty during this time (diversification leads to discovery). mentoring graduate students throughout lab closures and after reopening should be strongly encouraged. those conducting experiments will be most affected by lab closures, and this should be explicitly acknowledged by faculty and mentors. universities must assure graduate students that graduate programs will be stabilized and that admittance will not be decreased. for many faculty, graduate students are the major workforce of the lab. to ensure that faculty can successfully build and sustain a lab, continued ability to attract graduate students is necessary. this is especially important for new investigators, as getting postdoctoral fellows can be more challenging for newer faculty. once labs are reopened, pairing early career faculty with a later career faculty mentor of an established lab could facilitate more effective research programs and allow for resource sharing. later career faculty could be incentivized to help early career researchers through reductions in teaching or administrative loads, supplementations to animal care costs, core facility usages, or other means of reimbursement and/or subsidizations. investment of later career faculty in the success of early career faculty will help to ensure stability and success in the younger generation of independent researchers. faculty who have clinical responsibilities also necessitate special consideration during this time, especially if they are on the front lines. these individuals will not only lose productivity due to lab closures and curtailment of patient enrollment in clinical trials, they will also have the extra physical and mental stressors of working in the hospital during a crisis. 
establishment of protocols to aid clinician-scientists is imperative to ensuring their important contributions to science. just as senior faculty mentoring will be critical for junior faculty and graduating postdocs to successfully transition to a post-covid era in the basic sciences, this type of mentorship protocol may be even more critical for clinician-scientists, many of whom do not have doctorates beyond the medical degree. make science a national priority the current crisis has brought the importance of science and research to the forefront of public life. not only is science critical for public health decision-making, but a sustained investment in research better positions political leaders to efficiently deploy testing and therapeutic solutions. capitalizing on this momentum is crucial to engaging the public in science and science funding. providing additional funding sources focused on conveying science to the greater public and stimulating interest in science through educational outreach is critical. exploiting technology and social media to bring science and research directly to the public will be vital in the post-covid world. such technology might include mechanisms to allow private citizens to directly invest in science and scientists (else, ; miller, ), including simplified website-based donation platforms or inclusion on election ballots. this is necessary for establishing new funding sources for scientists, potentially supplementing the dearth of funding for early career researchers at federal funding agencies (rethink the fundamentals of funding). the covid crisis has revealed a lack of public understanding about how science is funded, conducted, and reported. the current administration's belief that the nih is ''giving away $ billion a year'' should be cause for concern (deyoung et al., ). much of the mistrust evident between the scientific establishment and the general population is rooted in lack of transparency and community involvement in science.
figure . the covid crisis has magnified the systemic issues plaguing academic research. these include the often stifling excess requirements in publication, tenure, and grant processes; the reliance on funding from national agencies that is geared toward senior level researchers; and the lack of diversity in academic research due to the attrition of women and racial or ethnic minorities during early career stages.
taking scientists out of the ''ivory tower'' and increasing accessibility through technology may help to assuage the mistrust that hinders our preparedness in times of crisis. people cannot support what they do not understand. removing excess requirements in publishing, grantsmanship, and tenure expectations could have the added benefit of creating more time for scientists to interact in the public domain. scientists must work on building the trust that is imperative to success as a community, and early career scientists are primed to help pave this new future (excess does not equal excellence). beyond the immediate challenges of returning to laboratories and research careers, the covid crisis has exposed some of the underlying weaknesses and problems that permeate the current scientific enterprise (figure ). for example, editors are asking reviewers to not request more experiments unless absolutely necessary to validate the core claims of a manuscript during the review process. most are applauding this effort to minimize excess and calling for its continued implementation even after scientists are able to get back to the bench. all institutions, funding agencies, departments, and members of the scientific community should speak openly and honestly about the difficulties faced during the current situation. early career researchers should be involved in the decision-making processes, as they represent the future of science and academic leadership.
the covid crisis has provided us with the unique opportunity to reflect upon the present norms and enact change through fiscal and non-fiscal strategies. our hope is that this pandemic will allow us to chart a new course for science, both academically and socially, and to begin to address the core challenges of research, with a special focus on supporting the next generation of independent scientists.
student perceptions of and expectations for male and female instructors: evidence relating to the question of gender bias in teaching evaluation.
a generation at risk: young investigators and the future of the biomedical workforce.
americans at world health organization transmitted real-time information about coronavirus to trump administration. the washington post, available from
crowdfunding research flips science's traditional reward model.
teaching load could put female scientists at career disadvantage. nature.
in the wake of covid- , academia needs new solutions to ensure gender equity.
the best platforms for crowdfunding science research. the balance: small business.
gender equality in academia: bad news from the trenches, and some possible solutions.
dr. roberts serves as editor-in-chief of books for the american psychiatric association publishing division and as editor-in-chief of the journal academic medicine. unrelated to this publication, dr. roberts serves as an advisor for the bucksbaum institute of the university of chicago pritzker school of medicine and owns the small business terra nova learning systems.
key: cord- - kl jz authors: cattaneo, roberto; schmid, anita; eschle, daniel; baczko, knut; ter meulen, volker; billeter, martin a. title: biased hypermutation and other genetic changes in defective measles viruses in human brain infections date: - - journal: cell doi: . 
/ - ( ) - sha: doc_id: cord_uid: kl jz
abstract we assessed the alterations of viral gene expression occurring during persistent infections by cloning full-length transcripts of measles virus (mv) genes from brain autopsies of two subacute sclerosing panencephalitis patients and one measles inclusion body encephalitis (mibe) patient. the sequence of these mv genes revealed that, most likely, almost % of the nucleotides were mutated during persistence, and % of these differences resulted in amino acid changes. one of these nucleotide substitutions and one deletion resulted in alteration of the reading frames of two fusion genes, as confirmed by in vitro translation of synthetic mrnas. one cluster of mutations was exceptional; in the matrix gene of the mibe case, % of the u residues were changed to c, which might result from a highly biased copying event exclusively affecting this gene. we propose that the cluster of mutations in the mibe case, and other combinations of mutations in other cases, favored propagation of mv infections in brain cells by conferring a selective advantage to the mutated genomes.
subacute sclerosing panencephalitis (sspe) is among the most thoroughly studied persistent viral infections of the human central nervous system and serves as a model for analysis of the development of persistent viral infections known or suspected to cause several human syndromes, including multiple sclerosis (kristensson and norrby, ; dowling et al., ). sspe generally develops to years after acute measles, starting with subtle signs of intellectual and psychological dysfunctions, continuing with sensory and motor function deterioration and progressive cerebral degeneration, and leading to death after months or years (ter . the measles inclusion body encephalitis (mibe) clinical and virological manifestations are similar to those of sspe, but the incubation time of mibe can be shorter (roos et al., ; ohuchi et al., ; . 
moreover, mibe is found in immunosuppressed patients, whereas sspe patients mount high antibody titers against all measles virus (mv) proteins except matrix (m protein; hall et al., ). m protein is responsible for viral assembly, and it was postulated that silencing of m protein expression could account for lack of viral budding and favor persistence (hall et al., ). indeed, initially, m protein could not be detected in brain tissue of sspe patients (hall and choppin, ), but in recent studies using monoclonal antibodies, m protein was found in diseased human brains where the viral envelope proteins fusion (f) and hemagglutinin (h) could not be detected (norrby et al., ; baczko et al., ). thus, defective m protein expression might not be the only viral determinant correlating with persistence. the study of the molecular basis for defective mv gene expression in sspe concentrated initially on the cell-associated, defective mvs that can be occasionally obtained by cocultivation of brain cells of sspe patients with stable cell lines (wechsler and fields, ; hall et al., ; carter et al., ; sheppard et al., ). however, the relevance of these observations for human brain infections remains to be established, since these viruses might not be truly representative of the viruses present in infected brains (norrby et al., ). to assess the alterations of viral gene expression characteristic for mv persistence in diseased human brains, it appeared desirable to clone mv genes directly from brain tissue. until now, this has been accomplished only for one m gene in one sspe case, where it turned out that among many other alterations a point mutation introduced a stop codon at position of the m reading frame (cattaneo et al., ). in the present work, using a procedure allowing selective full-length cdna cloning of mv rnas (schmid et al., ), and starting with only . 
pg of polyadenylated brain rna, we achieved cloning of at least one transcript of all the viral genes, except the large polymerase gene, from two sspe cases and one mibe case. from examination of the three sets of nucleocapsid (n), phospho (p), m, f, and h genes, we determined that, on average, the mv genomes recovered from a single brain differ from each other in - of their , bases. we also estimated that in all three cases, - mutations have been fixed during the course of persistence. about % of these mutations resulted in amino acid substitutions; in one m and two f genes, reading frames were grossly or slightly changed, respectively. remarkably, in the m gene of the mibe case, but in no other gene, a cluster of transitions converted % of the u residues to c.
we have previously characterized mv gene expression in brain autopsies of two cases of sspe (a and b) and one case of mibe (c) by immunofluorescence analysis of brain sections with monoclonal antibodies and immunoprecipitation of mv proteins translated in vitro from brain rna (baczko et al., ). the n and p proteins, involved in viral replication, were detected in all three cases. the m protein, responsible for viral assembly, was detected only in the brain of case a, the f protein only in case b, and the other envelope protein h only in case c. since expression of the m, f, and h mrnas is diminished in diseased brains (cattaneo et al., b), it is conceivable that the failure to detect the corresponding proteins could simply be because of low mrna levels. to produce sufficient amounts of these rnas, we cloned full-length cdnas of these genes from the a, b, and c brains and from the reference mv edmonston strain in an in vitro rna expression vector (experimental procedures). synthetic transcripts were then used to direct protein synthesis in a rabbit reticulocyte lysate. 
figure is an analysis of the proteins produced from the n, m, f, and h synthetic transcripts of cases a, b, and c, as compared with the proteins produced from the synthetic transcripts of the edmonston (e) strain. the m gene of case c produced only low levels of proteins considerably smaller than the edmonston m protein (about kd, kd, and kd instead of kd). in contrast, the other proteins that had not been detected in brain autopsy materials (the f and h proteins of case a, the m and h proteins of case b, and the f protein of case c) were produced in amounts comparable to the edmonston proteins and had approximately the expected size. this indicated that the reading frames of these genes were intact or only slightly modified. (note that single amino acid substitutions could substantially change the mobility of a protein [noel et al., .) in the case of the f genes, differences in migration were greater than in other genes, amounting to apparently kd between the most rapidly migrating protein ( figure , fusion gene, case b) and the one migrating most slowly ( figure , fusion gene, case c). this is at the upper limit of the differences in migration observed in the proteins of defective sspe viruses (hall et al., ) . to ascertain whether the reading frames of these genes were intact, we sequenced the ends of each clone used for expression analysis; in all genes except the m gene of case c (see below) and the f genes of cases a and b (figure a) , the signals for initiation and termination of protein synthesis were intact. in the f gene of case a, deletion of one nucleotide at position resulted in a shift in the reading frame causing substitution of the last amino acids of the edmonston f protein by other residues (figure a , nucleotide and amino acid positions as in the convention of richardson et al. [ ] ; this deletion was confirmed in two clones). this explains the higher electrophoretic mobility of this protein as noted above. 
the apparent molecular weight of the f protein of case b was even lower than that of case a (figure , fusion gene, cases a and b). this was due to the introduction of a new stop codon (uaa) by a c to u mutation at position , resulting in the expression of an f protein shortened by amino acids (figure a). thus, two of the three f genes examined expressed an f protein with a mutated c terminus. the active f protein of paramyxoviruses is liberated from its inactive precursor by endoproteolytic cleavage after a stretch of basic amino acids (figure , center), giving rise to a unique hydrophobic domain (figure , right; varsanyi et al., ; richardson et al., ; glickman et al., ). it was demonstrated recently for the paramyxoviruses, as well as for influenza a virus and human immunodeficiency virus, that full expression of viral infectivity depends on the efficiency of proteolytic cleavage at this site (webster and rott, ; glickman et al., ; mccune et al., ).
figure legend: about ng of synthetic mv transcripts was translated in a rabbit reticulocyte lysate (promega, madison, wi) in the presence of s-labeled methionine (in vivo cell labeling grade, more than ci/mmol, amersham international, england). equal amounts of the products of these reactions were loaded onto a protein gel (laemmli, ) which was soaked in sodium salicylate, dried, and autoradiographed (chamberlain, ). the rnas translated were (from left to right): brome mosaic virus rna (bmv), yielding marker proteins of kd, kd, kd, and kd; no rna (neg.); the transcripts of the genes and cases are indicated on the top. the apparent sizes of the n, m, f, and h protein products of the edmonston strains are the expected ones for proteins translated in the system, kd, kd, kd, and kd, respectively (hasel et al., ). proteins of higher mobility were detected in addition to the full-size n and h proteins, as observed in a previous study (hasel et al., ).
to identify possible sequence alterations causing inefficient f protein cleavage and thus leading to the loss of lytic function typical for mv infections of human brains, we analyzed the region of the f cdnas coding for the cleavage/activation site. however, this site was completely conserved in all three brain-derived f genes (figure , bottom). to define the mutations introduced during persistence, ideally the sequences of the lytic viruses that infected the three children investigated should be available for comparison. it is, however, impossible to retrace these viruses, and comparison with the edmonston strain sequence would not be valid since this virus has undergone numerous passages in chick embryos and cultured cells during the process of attenuation (enders et al., ). [figure legend fragment: varsanyi et al. ( ) and glickman et al. ( ); for details see text.] to overcome this limitation as well as possible, we compared our data with a consensus m gene sequence, indicated in figure as pre, which comprises the nucleotides represented most often in nine sequences determined experimentally (three sequences of lytic mvs and six of persistent mvs), as detailed in the legend to figure . we will be calling the deviations from this consensus "mutations," although this term is not actually accurate. as mentioned above, the m gene of case c was of particular interest because its protein product had an apparent molecular weight considerably lower than expected (figure , matrix gene, case c). from the sequence analysis presented in figure , it is immediately evident that mutations in this gene are more abundant than in other m genes and that u to c transitions account for a large majority of mutations. in fact, of the u residues encoded in the pre sequence (that is about %) are changed to c in the case c sequence. this phenomenon resulted in the alteration of the m protein initiation signal (figure , positions and - ). 
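the consensus rule described above (at each aligned position, take the nucleotide represented most often among the available sequences, and treat every deviation from it as a "mutation") can be illustrated with a minimal python sketch. this is not the procedure or software used in the study; the function names and the toy sequences are hypothetical, the input is assumed to be already aligned and of equal length, and marking tied positions with an asterisk is an assumption modeled on the asterisk convention mentioned in the figure legend.

```python
from collections import Counter


def consensus(seqs, ambiguous="*"):
    """Return the most frequent symbol at each aligned position.

    Positions where two symbols tie for first place are marked with
    `ambiguous` (an assumed convention, not taken from the paper's methods).
    """
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    out = []
    for column in zip(*seqs):
        ranked = Counter(column).most_common()
        # A tie for first place means no consensus can be defined here.
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            out.append(ambiguous)
        else:
            out.append(ranked[0][0])
    return "".join(out)


def deviations(seq, cons, ambiguous="*"):
    """List (position, consensus_base, observed_base), 1-based, where seq differs."""
    return [
        (i, c, s)
        for i, (c, s) in enumerate(zip(cons, seq), start=1)
        if c != ambiguous and c != s
    ]


# Hypothetical toy alignment (not real measles virus data).
TOY_ALIGNMENT = ["AUGCU", "AUGCC", "ACGCU", "AUGAU"]
```

calling `consensus(TOY_ALIGNMENT)` and then `deviations(some_sequence, that_consensus)` yields the list of positions that would be counted as "mutations" in the sense used here; positions without a defined consensus are skipped rather than counted.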
moreover, in clone pcm , used for production of the synthetic rnas, a nucleotide deletion at position created a frameshift resulting in the introduction of a stop codon (tag at position - ). taken together, these two events should lead to the production of an altered m protein with a molecular mass of about kd. indeed, a major product of kd is detected (figure , matrix gene, case c). it is also of interest that several minor proteins were produced by the synthetic pcm transcripts. this was probably the result of initiation of translation on downstream aug and upstream non-aug codons, characterizing translation of genes possessing a "weak" aug, like the one at position - of the m gene of case c (kozak et al., ; curran and kolakofsky, ). the other four sequenced clones of this m gene did not contain nucleotide deletions, and their major protein products migrated approximately at the position of the m proteins of the other cases (data not shown). this was expected since the loss of amino acids from the amino terminus should be compensated by the gain of at the carboxyl terminus, the gain being due to the substitution of the termination signals at position - by a new signal at position - (both underlined in figure ). given the very high level of u to c transitions detected in the m gene of case c, we predicted that if these mutations had progressively accumulated during persistence, other genes of case c would also have accumulated similar transitions. to test this hypothesis, we sequenced the complete n gene and one third to one half of the p, f, and h genes. we also analyzed the corresponding genes of the a and b cases and compared them with consensus sequences as defined above (figure indicates the genomic areas sequenced). 
as shown in table , in the m gene of case c the level of u to c transitions exceeded by a factor of at least all other kinds of mutations, whereas, surprisingly, the levels of u to c transitions in all other genes of case c were comparable to the levels of the other mutations. to further investigate the distributions of u to c mutations in the mv genome, we also sequenced clones covering the whole m and part of the flanking genes (pcm and pcm , legend to figure ). from the graphical representation of these analyses (figure ) it is evident that, whereas in the n, p and h genes two or fewer transitions have been introduced per group of us, in the m gene between and changes have occurred. interestingly, the switch between high and low levels of u to c transitions was abrupt at the p-m gene junction but more gradual at the m-f gene junction; in the first us of the f gene, distributed over not less than nucleotides, five u to c mutations were detected, whereas in the following groups of us, first three and then two or fewer mutations per group were detected. thus, the limits of the genomic regions with high levels of u to c mutations roughly coincide with the limits of the m gene. from table , it is also evident that in the m gene of case c, the level of a to g mutations, corresponding to u to c mutations in the other mv genomic strand, was not enhanced. this indicates that the transitions must have been introduced exclusively in one strand, an event that could arise theoretically either by sequential, strand-specific cycles of localized and biased mutations or, more plausibly, by a single hypermutation event. by sequencing five sibling but not identical clones of the m gene of case c, we also noted that in the few positions that were variable between sibling clones (small letters in sequence c of figure ) we could not detect any u to c mutation, that is, of the u to c transitions were conserved in all five m cdnas of case c. 
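the strand argument above rests on tallying each kind of substitution separately: u to c changes introduced in the complementary strand would appear as a to g changes in the strand being read, so a strong excess of u to c without a matching excess of a to g points to a bias confined to one strand. the following python sketch shows one way such a tally could be made against a reference or consensus sequence. it is an illustration only, not the analysis performed in the study; the function names and the toy sequences are hypothetical, and the sequences are assumed to be aligned and of equal length.

```python
from collections import Counter


def substitution_counts(reference, observed):
    """Count substitutions by type, keyed as (reference_base, observed_base)."""
    if len(reference) != len(observed):
        raise ValueError("sequences must be aligned to equal length")
    return Counter(
        (r, o)
        for r, o in zip(reference, observed)
        # Skip matches and any non-nucleotide symbols such as gaps.
        if r != o and r in "ACGU" and o in "ACGU"
    )


def fraction_changed(reference, observed, src="U", dst="C"):
    """Fraction of `src` residues in the reference that read `dst` in observed."""
    total = reference.count(src)
    changed = sum(1 for r, o in zip(reference, observed) if r == src and o == dst)
    return changed / total if total else 0.0


# Hypothetical toy sequences (not real measles virus data).
TOY_REFERENCE = "UUAUGUCAUU"
TOY_OBSERVED = "CCAUGCCACU"
```

comparing `substitution_counts(ref, obs)[("U", "C")]` with `substitution_counts(ref, obs)[("A", "G")]` gives the kind of contrast discussed in the text, and `fraction_changed` gives the share of reference u residues that were replaced by c.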
we thus conclude that a single event, rather than a continuous, progressive introduction of u to c transitions, must account for the amazingly high level of u replacements.
previous studies indicate that during persistence, mutations are continuously introduced and fixed in mv genomes (cattaneo et al., ). this phenomenon is most likely based on one hand on the low fidelity of rna replication (domingo et al., ; for review see steinhauer and holland, ) and on the other hand on the low selective pressure exerted on viral genomes in nonlytic infections (holland et al., ; rowlands et al., ). in an attempt to quantify the level of internal variability of mv genomes in the human brain, we compared the sequences of three overlapping m clones of cases a and b and of five overlapping m clones of case c. internal variability was . % for the three m clones of case a (six differences over comparable nucleotide pairs), . % for case b (six differences over pairs) and . % for case c (seven differences over , pairs). most of these changes are probably due to the mv polymerase itself rather than to the reverse transcriptase used for cloning, since clones obtained with the same technique from another rna source differed in less than . % of their positions (k. baczko, unpublished data). it should be noted that the variability of the mv genomes in case c was lower than in the other two cases, which could be explained by the shorter time of viral persistence in the case of mibe. this result reinforces the suggestion that at the final stage of the mibe infection the mv polymerase was at least as precise as in the two sspe infections. a variability of . % will result in - differences between any two mv genomes with a length of , nucleotides, which is a high number even for an rna virus (steinhauer and holland, ; cattaneo et al., ).
the t residues in these cdnas correspond to us in the mv transcripts. sequence e is from the edmonston strain, sequence h from the street virus hu (curran and rima, ), sequence q from the strain cam (this paper, see liebert and ter meulen [ ] for a description of this strain), sequence k from sspe case k (cattaneo et al., ), sequence i from sspe cell line ip- -ca, and sequence m from sspe cell line mf (this paper, see cattaneo et al. [ a] for a description of case mf). the pre sequence is a consensus comprising the nucleotides represented most often in the nine sequences obtained experimentally. the positions differing from the pre sequence are indicated with capital letters (positions diverging in all clones of the same case), or small letters (positions diverging only in some sibling clones). the translation start and stop codons are underlined, as are the mutations leading to amino acid changes. an asterisk in pre indicates a variable position for which no consensus could be defined. position was variable not only within cases but also within sibling clones: it corresponded to g or a in cases i and a and to g or c in case m. a nucleotide deletion at position - in case k is indicated with two deltas. m clones resulting from the elongation of non-m ' primers hybridizing semispecifically to the gc-rich ' nontranslated region of the m gene were obtained and completely sequenced: clones pam , pam , pam , pbm , pbm , pbm , pcm , pcm , and pcm coded for m genes, respectively, , , qo. , , , , and nucleotides shorter at their ' end. clone pcm encoded the whole m gene and additional nucleotides of the neighboring f gene, and pcm encoded the whole m gene and part of the n and p genes. about % of the positions could not be defined because of "strong stops" in the sequencing reactions, and these positions were considered as showing no variation from the pre sequence. the genbank accession number of this sequence is j .
number of mutations to c in u residues
the pre sequence of the m gene is shown in figure ; the other pre sequences were constructed using the a, b, and c genes and the genes of case ip- -ca and of the edmonston strain (cattaneo et al., , and references therein).
to assess the variability in strains of lytic viruses, and to estimate the number of mutations introduced during persistence, we counted the differences of lytic and persistent viruses from a consensus as defined above. the lefthand panel of figure represents the mutations from the pre consensus (figure ) detected in the m coding regions of the three lytic viruses, edmonston (e), hu (h), and cam (q), and of the six persistent viruses, k, i, m, a, b, and c. in genes from lytic infections, - differences from the consensus were detected, whereas in genes from sspe persistent infections, - differences were monitored ( differences in the mibe case c). thus, we estimated that two to three times more mutations accumulated in the m coding regions of viruses implicated in persistent infections. from table , it is clear that this holds true for all the other genes examined, with the exception of the n gene (legend to table , note c). if we assume that the lytic viruses that initiated the three persistent infections had accumulated a number of differences from the pre sequence similar to that of the three lytic strains studied, we can extrapolate that - % of the mutations scored in cases a, b, and c accumulated during the persistent phase of these infections (this may be an underestimation, see the end of this section). knowing that in the three persistent infections mutations from the consensus sequence have been detected over , nucleotides compared (calculated from table , first column), and assuming that %- % of these mutations have been fixed during persistence, we can also extrapolate that - mutations have been introduced in the mv genome ( , nucleotides) during persistence.
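the two quantities used in this part of the analysis, a majority-rule consensus and an internal variability expressed as differences per compared nucleotide pair, can be illustrated with a short python sketch. this is a minimal example under stated assumptions: the helper names, the use of "-" for positions a clone does not cover, and the convention of summing over all clone pairs are choices made here for illustration, and the original counting procedure may have differed. the toy sequences are invented.

```python
from collections import Counter
from itertools import combinations


def majority_consensus(sequences):
    """Most frequent base per aligned column; '*' where the top count is tied."""
    consensus = []
    for column in zip(*sequences):
        ranked = Counter(column).most_common()
        tied = len(ranked) > 1 and ranked[0][1] == ranked[1][1]
        consensus.append("*" if tied else ranked[0][0])
    return "".join(consensus)


def internal_variability(clones):
    """Return (differences, compared pairs, percent) over all clone pairs.

    Positions where either clone has '-' (not covered) are not compared.
    """
    differences = pairs = 0
    for first, second in combinations(clones, 2):
        for x, y in zip(first, second):
            if x != "-" and y != "-":
                pairs += 1
                differences += x != y
    percent = 100.0 * differences / pairs if pairs else 0.0
    return differences, pairs, percent


# Hypothetical sibling clones; the third one starts one position later.
toy_clones = ["AUGGC", "AUGGU", "-UGGC"]
print(majority_consensus(["AUGGC", "AUGGU", "AUGGC"]))
print(internal_variability(toy_clones))
```

marking tied columns with an asterisk mirrors the convention described in the figure legend for positions where no consensus could be defined.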
using the cdnas described here, it should be possible in principle to establish complementation assays to test the effect of single mutations on gene function, and thus to assess if the point mutations introduced during persistence resulted in slight alterations, gross distortions, or disruption of viral protein functions. examination of the mv proteins found in brain cells of different sspe patients has shown examples of restricted expression of the f and h proteins, as well as of the m protein (norrby et al., ; baczko et al., ). in contrast, n and p proteins, the two proteins required together with the polymerase for mv transcription and replication, were always detected. the most straightforward explanation for these observations is that the constraints imposed in persistent infections on the m, f, and h genes are relaxed, since they encode viral functions generally presumed to be dispensable for replication (rosenblatt et al., ). if this was the case, we would expect fixation of more mutations causing amino acid changes (replacement site mutations) in the viral envelope protein genes than in the n and p genes. as shown in table (first column), the levels of mutations accumulated in all genes were fairly similar (about %), except for the m gene of the mibe case. in the f and h coding regions, respectively, % and % of the mutations resulted in amino acid changes, a similar (in fact somewhat lower) percentage to the n and p coding regions ( % and %, calculated from the results presented in table , second column). it should also be noted that, even without considering case c, the m genes had the highest percentage of replacement site mutations, that is %. the low number of replacement site mutations found in the h genes was reflected by the identical migration behavior of all h proteins (figure , hemagglutinin genes), and the high number of amino acid changes found in the m genes by the relatively large differences in migration of the m proteins (figure , matrix genes). these numbers suggest that the selective pressure acting on the genes directly involved in viral replication was not very different from that acting on the viral envelope genes during persistence. the fact that the differences from the consensus sequence were about twice as frequent in the untranslated regions of all the genes compared with the respective coding regions (table , third column and notes a and b) reinforces the suggestion that selective pressures to preserve protein functions remained in effect. on the basis of the sequence comparisons presented in figure , the three lytic viruses fall in a separate subclass from the six persistent viruses. if an m gene consensus sequence is constructed by considering only the lytic viruses, each of the "lytic" m coding regions differs only in - positions from the "lytic consensus," in contrast with the persistent viruses differing in - positions ( differences in case c; right panel of figure ). this suggests that the number of mutations introduced during persistence, which was estimated above to %- % of the total differences from the pre sequence, might in fact be as high as %- %.
for this computation, two variable positions in the pre sequence (marked with an asterisk in figure ) were not considered.
a total differences in the coding regions of the a, b, and c cases: n gene, . %; p gene, . %; m gene, . % ( . % if case c is not considered); f gene, . %; and h gene, . %.
b total difference in all the noncoding regions of the three cases: . %.
c the sequence of the edmonston n gene diverges from the consensus in about twice as many positions as the sequences of the other edmonston genes. it is remarkable that of the differences were concentrated in the last bases.
d in these genes, bases of the ' noncoding region and of the ' noncoding region could not be determined.
e the relatively low incidence of mutations in the untranslated region of the m gene of case c is due to the scarcity of us.
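the distinction drawn in this section between replacement site mutations and mutations that leave the protein unchanged can be made concrete with a small python sketch. it is a simplified illustration, not the procedure used in the study: the function name `classify_substitutions` is hypothetical, the standard genetic code is assumed, sequences are assumed to be aligned and in frame, and each substitution is scored on its own against the reference codon, which ignores interactions between two changes in the same codon.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitutions(reference: str, variant: str):
    """Return (replacement, silent) substitution counts for in-frame sequences."""
    # Accept mRNA (U) or cDNA (T) notation by normalizing to T.
    ref = reference.upper().replace("U", "T")
    var = variant.upper().replace("U", "T")
    if len(ref) != len(var):
        raise ValueError("sequences must be aligned to the same length")
    replacement = silent = 0
    # Walk complete codons only; a trailing partial codon is ignored.
    for start in range(0, len(ref) - len(ref) % 3, 3):
        ref_codon = ref[start:start + 3]
        if ref_codon not in CODON_TABLE:
            continue  # undetermined base in the reference codon
        for offset in range(3):
            new_base = var[start + offset]
            if new_base == ref_codon[offset] or new_base not in BASES:
                continue
            # Score this substitution alone against the reference codon.
            single = ref_codon[:offset] + new_base + ref_codon[offset + 1:]
            if CODON_TABLE[single] == CODON_TABLE[ref_codon]:
                silent += 1
            else:
                replacement += 1
    return replacement, silent


# Hypothetical toy coding sequence: Met-Phe-Leu versus Met-Phe-Pro.
print(classify_substitutions("ATGTTTCTA", "ATGTTCCCA"))
```

dividing the replacement count by the total number of substitutions in a coding region gives the kind of percentage compared between genes in the text.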
the definition of a "lytic consensus" different from the pre sequence implies that lytic viruses can be distinguished from persistent viruses on the basis of their sequences at characteristic sites. this observation, if confirmed on a larger sample of genes, might have important practical applications: diagnostic differences might be used to trace the source of viruses causing measles epidemics or persistent infections. moreover, vaccine strains could be selected on the basis of their sequences, and finally, safer vaccines possessing all genomic characteristics defined as favorable in different strains could be engineered. previously, the occurrence of viral mutations in sspe cases has been documented (cattaneo et al., ). however, it was never clarified whether certain mutations constitute a prerequisite for the development of the disease. mutations might simply be a corollary phenomenon, to be explained by the release of selective pressure exerted on viral genomes that need only replicate and spread from cell to cell but that do not have to provide all the functions necessary for the assembly of infectious virus particles. although the present study still does not directly establish a causal relationship between mutations and disease, two experimental findings presented here strongly support this hypothesis: first, the mechanism of m gene function inactivation by hypermutation in the mibe case; and second, the very extensive and apparently directed drift separating all three persisting mv genomes analyzed from the infecting viruses. defective expression of m protein has been previously revealed in sspe cases (wechsler and fields, ; hall et al., ; carter et al., ). in particular, both complete absence of m protein (hall and choppin, ) and presence of nonfunctional (i.e., unstable) m protein have been reported (sheppard et al., ). in the present study, both of these possibilities were found in the three cases analyzed.
in case b, we could monitor the efficient in vitro production of an m protein of approximately correct size from synthetic transcripts of a cdna clone, in spite of the fact that such a protein could not be detected in the brain autopsy of this patient (baczko et al., ), an observation that can be explained by postulating rapid m protein degradation in vivo. in case c, no m protein could be produced: the proteins synthesized inefficiently in vitro from the synthetic m mrnas had grossly altered termini and dozens of mutated amino acids. the particular interest of this case resides in the mechanism of m inactivation; the analysis of the m and four other genes of this case indicated that a single, biased hypermutation event was most likely responsible for the selective silencing of m gene function. since a lytic virus with intact m function must have been at the origin of the mibe infection, we must conclude that the hypermutation event did not severely affect the efficiency of this genome to replicate. instead, this event must have conferred a selective advantage for the spread of the mutated genome in the brain, because only mutated genomes were detected at death. thus, for case c, our study provides a direct correlation between m function silencing and mutational change, which in this case came about by a probably unique and grossly distorting event. in other words, it seems very likely that the propagation of this lethal infection in the human brain originated from a single genomic clone of mv. nevertheless, m function silencing might not be obligatory for persistence, as suggested by the detection of m proteins in some sspe cases (norrby et al., ; baczko et al., ), including case a reported here, where an m protein of approximately the correct size was detected both in vitro and in vivo. however, we do not know whether these m proteins are functionally competent. on the other hand, in case a, the carboxyl terminus of the f protein has been structurally altered.
a similar alteration of the f protein, mediated by a different mutation, was identified in case b. this indicates that f protein function might be slightly or severely impeded in both cases a and b. in summary, gross alterations have been found so far only in m proteins, less severe modifications in f proteins. it remains to be seen whether all these changes, and/or more subtle changes in these and other viral proteins, might not also contribute to propagation of mv persistent infections in brain cells. the second argument in favor of the view that some mutations are instrumental for the development of brain infections is provided by the features of the populations of viral genomes present in brains. in rna virus populations maintained at constant selective pressures, genomic variability is high, but a stable consensus sequence is established that usually changes minimally (domingo et al., ; holland et al., ). in contrast, when selective pressures change, viral rna genomes do not maintain the consensus, but rapidly evolve by selection of the fittest (holland et al., ; rowlands et al., ). in cases a and b, the populations of viral genomes show an internal variability about ten times lower than the estimated number of changes acquired during persistence ( - variable positions versus - acquired changes). in case c, the internal variability is even lower and the number of acquired changes is of the same order as in the other cases if the changes introduced by the hypermutational event are disregarded. this strongly supports the argument that mutated genomes must have been selected one or more times, conferring a direction to the evolution of the system. thus, viral mutations might indeed favor viral persistence if, instead of compromising propagation of infection, they promote it in the particular environment of brain cells.
obviously, such evolved viral genomes can never become manifest as new viral strains because they are unable to propagate beyond the life span of their host. the fact that sspe and mibe arise so rarely might indicate that combinations of mutations favoring propagation of persistent infections are infrequent. moreover, it may well be that persistent infections can be established only in cases where some host defense mechanisms fail (fujinami and oldstone, ; carrigan and kabacoff, ). this is also suggested by the fact that mibe, a complication typical for immunosuppressed individuals, arises more frequently in such patients than sspe in untreated individuals. similar considerations might apply for other viral infections known or suspected to be the cause of several human syndromes (wolinsky and johnson, ).
a biased rna polymerase?
we are not aware of any other documented case where genetic information involving an entire gene has been distorted so drastically, most likely in a single event. the recently described extensive editing of kinetoplastid mitochondrial transcripts by uridine addition and deletion results in a spectacular modification of the mrnas, but not in alteration of the gene (shaw et al., ; feagin et al., ). it must be mentioned, however, that a similar exclusive mutation of one type of nucleotide to another has been described, albeit in a much more restricted region, in the related vesicular stomatitis virus (vsv). in that instance, analogous a to g transitions ( of positions considered) were detected in a short region ( nucleotides, intrinsically very rich in a residues) of a defective interfering (di) genome (o'hara et al., ; note that our u to c mutations, as written in plus strand polarity, might have been introduced in the minus strand genome as a to g transitions). the question remains as to how the mutational cluster in the mv genome of the mibe case could have arisen.
in principle, mutations of this kind could be introduced either by chemical mutagens or by imprecise polymerases. the mibe patient had been subjected to a large variety of immunosuppressive and cytostatic drugs, including potential mutagens (roos et al., ) . however, the level of mutations observed here is much higher than the level of mutations induced by any chemical mutagen (singer and kusmierek, ) . even assuming that a chemical mutagen in a living cell could induce mutations leading to the replacement of % of the u residues, it would be very difficult to explain how these mutations could have selectively affected a defined region of a nonsegmented genome. to account for this, homologous recombination of an mv "standard" genome with a hypermutated mv genome or an mrna that coexisted in the same cell would have to be invoked. homologous recombination involving breakage and joining between preexisting strands as with dna is not documented for rna. the apparent recombination events common in positive strand rna viruses probably take place by a copy choice mechanism, in which the viral rna polymerase switches template during rna replication (king et al., ; kirkegaard and baltimore, ; keck et al., ) . in contrast, in negative strand rna viruses, nonhomologous recombination is less common, and homologous recombination has not yet been reported (jennings et al., ; o'hara et al., ; for review see steinhauer and holland, ) . a much more plausible explanation of the observed phenomenon is that one particular part of a genome is synthesized by a biased mv rna polymerase complex nonselectively incorporating u or c residues when copying an a. two prerequisites have to be met in this model: first, biased mv polymerase complexes must occasionally occur in an infected cell; and second, biased and faithful polymerase complexes must act in succession during the synthesis of one progeny rna on one rna template. 
the polymerase complex of nonsegmented negative strand rna viruses is composed of the large polymerase itself and a phosphoprotein, tightly associated with each other and with the genomic (or antigenomic) ribonucleocapsids, that is, rnas enwrapped with nucleocapsid protein molecules (banerjee, ). it has been shown that the polymerases and phosphoproteins are distributed in discrete clusters on cytosolic ribonucleocapsids (portner and murti, ; portner et al., ), and it is conceivable that these clusters correspond to polymerase complexes reorganizing during replication (or transcription), possibly by exchanging parts of their components. rna polymerase complexes giving rise to biased errors could arise because they are constituted from genetically altered subunits, because normal subunits assemble in a defective fashion, or because normal rna polymerase complexes can temporarily assume a distorted conformation. evidence for the existence of conformationally "perturbed" rna polymerase complexes, introducing either c or u residues when copying an a after a triggering error, but then returning to the normal fidelity, was obtained from a vsv di genome; when rare rna molecules were analyzed in which a particular misincorporation had occurred, it was found that in a position situated two nucleotides downstream of the misincorporated nucleotide, in %- % of the molecules, a c residue was incorporated instead of a u (steinhauer and holland, ). remarkably, all other nucleotides, including more downstream u residues, were incorporated with normal precision. the stabilization of a "perturbed" conformation of the polymerase complex could result in nucleotide transitions in short (o'hara et al., ) or longer (case c) genomic stretches of negative strand rna viruses. in an alternative version of this model, the coexistence of normal and genetically altered rna polymerases on a single template, and a relay race of several polymerase complexes during replication, is postulated.
in this view, the growing end of the replicating mv genome might occasionally be taken over by an entirely new strand elongation complex, or single components of the complex might be exchanged. such events could well constitute an intrinsic property of the polymerase reaction during the transcription mode of ribonucleocapsids, that is, during the formation of single mrna molecules from antigenomic templates where a stop-start mechanism of the polymerase at gene junctions has been postulated (for discussion see gupta and kingsbury, ). during the replication mode, analogous exchanges might occur, either regularly at gene junctions or occasionally by mistake. it should be possible to ascertain such a patchwise mode of polymerization with in vitro transcription-replication systems.
patients
patients a and b (patients and in baczko et al., ) were - and 10-year-old children who showed the first sspe symptoms years after primary mv infection and died and months later, respectively. patient c was a -year-old child who died of mibe months after the diagnosis of leukemia, months after clinical measles, and months after the first symptoms of neurological disease (roos et al., ).
for selective full-length cdna cloning of five mv-specific genes using specific primers (schmid et al., ), . pg of polyadenylated brain rna prepared as described (cattaneo et al., ) were used. clones in pbluescript were identified by colony hybridization and restriction and amplified as described (cattaneo et al., ). a large majority of the n, p, f, and h clones were full-length, but some of the m clones were incomplete at their ' end (legend to figure ). another unexpected finding was that e. coli containing plasmids in which the h gene was cloned downstream of the lac promoter of pbluescript grew reproducibly slower, reaching lower densities, and yielding only small quantities of plasmid dna. this is most likely due to a deleterious effect of h protein for e. coli; when the h insert was in the "antisense" orientation as compared with the lac promoter, normal growth occurred.
in vitro transcription
in vitro transcription, in the presence of the cap analog diguanosine triphosphate (g( ')ppp( ')g; pharmacia, uppsala, sweden) and minute amounts of [32p]gtp to quantify synthesis, was accomplished with t polymerase according to the instructions of the supplier (genofit, geneva, switzerland). in general, about ug of transcripts were obtained from ug of template using . mm concentrations of ribo-atp, -ctp, and -utp, . mm of ribo-gtp, and . mm g( ')ppp( ')g. this corresponds approximately to a l- molar ratio of template to product. the products were about % full-length, as judged by gel electrophoresis.
chain termination sequencing of alkali-denatured plasmid dna was performed using deoxyadenosine '[35s]thiotriphosphate ( ci/mmol, amersham international, england) and t dna polymerase (sequenase, united states biochemical corporation, cleveland, oh) basically according to the protocol of the supplier. the primers used for sequencing the m and n genes were the same as those used in cattaneo et al. ( ). since one primer did not hybridize efficiently to the m clones of case c, it was substituted by primer (-) - (positions as in bellini et al. [ ]). for sequencing the ends of the p, f, and h clones, commercial primers (new england biolabs, beverly, ma) hybridizing with flanking plasmid sequences were used. to sequence over the large f gene ' untranslated region, the aug, and the f /fl processing site, two primers (+) - and (+) - were used (positions as in richardson et al. [ ]).
expression of defective measles virus genes in brain tissues of patients with subacute sclerosing panencephalitis
restriction of measles virus gene expression in measles inclusion body encephalitis
the transcription complex of vesicular stomatitis virus
matrix genes of measles virus and of canine distemper virus: cloning, nucleotide sequences, and deduced amino acid sequences
identification of a nonproductive, cell-associated form of measles virus by its resistance to inhibition by recombinant human interferon
defective translation of measles virus matrix protein in subacute sclerosing panencephalitis
accumulated measles virus mutations in a case of subacute sclerosing panencephalitis: interrupted matrix protein reading frame and transcription alteration
altered transcription from a defective measles virus genome derived from a diseased human brain
altered ratios of measles virus transcripts in diseased human brains
multiple viral mutations rather than host factors cause defective measles virus gene expression in a subacute sclerosing panencephalitis cell line
fluorographic detection of radioactivity in polyacrylamide gels with the water-soluble fluor, sodium salicylate
ribosomal initiation from an acg codon in the sendai virus p/c mrna
nucleotide sequence of the gene encoding the matrix protein of a recent measles virus isolate
nucleotide sequence heterogeneity of an rna phage population
measles virus nucleic acid sequences in human brain
studies on an attenuated measles virus vaccine: techniques for assay of effects of vaccination
extensive editing of cytochrome c oxidase iii transcript in trypanosoma brucei
antiviral antibody reacting on the plasma membrane alters measles virus expression inside the cell
quantitative basic residue requirements in the cleavage-activation site of the fusion glycoprotein as a determinant of virulence for newcastle disease virus
polytranscripts of sendai virus do not contain intervening polyadenylate sequences
measles virus proteins in the brain tissue of patients with subacute sclerosing panencephalitis
measles and subacute sclerosing panencephalitis virus protein: lack of antibodies to the m protein in patients with subacute sclerosing panencephalitis
characterization of cloned measles virus mrnas by in vitro transcription, translation, and immunoprecipitation
evolution of multiple genome mutations during long-term persistent infections by vesicular stomatitis virus
does the higher order structure of the influenza virus ribonucleoprotein guide sequence rearrangements in influenza viral rna?
in vivo rna-rna recombination of coronavirus in mouse brain
the mechanism of rna recombination in poliovirus
comparison of initiation of protein synthesis in procaryotes, eucaryotes, and organelles
persistence of rna viruses in the central nervous system
cleavage of structural proteins during the assembly of the head of bacteriophage t
virological aspects of measles virus induced encephalomyelitis in lewis and bn rats
endoproteolytic cleavage of gp160 is required for the activation of human immunodeficiency virus
a single amino acid substitution in a histidine-transport protein drastically alters its mobility in sodium dodecyl sulfate-polyacrylamide gel electrophoresis
measles virus matrix protein detected by immune fluorescence with monoclonal antibodies in the brain of patients with subacute sclerosing panencephalitis
vesicular stomatitis virus defective interfering particles can contain extensive genomic sequence rearrangements and base substitutions
characterization of the measles virus isolated from the brain of a patient with immunosuppressive measles encephalitis
localization of p, np, and m proteins on sendai virus nucleocapsids using immunogold labeling. virology
antibodies against sendai virus l protein: distribution of the protein in nucleocapsids revealed by immunoelectron microscopy
the nucleotide sequence of the mrna encoding the fusion protein of measles virus (edmonston strain): a comparison of fusion proteins from several different paramyxoviruses
immunologic and virologic studies of measles inclusion body encephalitis in an immunosuppressed host: the relationship to subacute sclerosing panencephalitis
virus protein changes and rna termini alterations evolving during persistent infection
infective substructures of measles virus from acutely and persistently infected cells
a procedure for selective full length cdna cloning of specific rna species
preliminary tests of a highly attenuated measles vaccine
editing of kinetoplastid mitochondrial mrnas by uridine addition and deletion generates conserved amino acid sequences and aug initiation codons
rapid degradation restricts measles virus matrix protein expression in a subacute sclerosing panencephalitis cell line
chemical mutagenesis
direct method for quantitation of extreme polymerase error frequencies at selected single base sites in viral rna
rapid evolution of rna viruses
subacute sclerosing panencephalitis
isolation and characterization of the measles virus f1 polypeptide: comparison with other paramyxovirus fusion proteins
influenza virus pathogenicity: the pivotal role of hemagglutinin
differences between the intracellular polypeptides of measles and subacute sclerosing panencephalitis virus
role of viruses in chronic neurological diseases
we thank charles weissmann for helpful discussions, bert rima for communicating unpublished data, isidro ballart for part of the m gene sequence of case mf, hugh pelham, pramod yadava, and deborah maguire for critical comments on the manuscript, and fritz ochsenbein for the photographs. this work was supported by grant .
- of the schweizerische nationalfonds, by the kanton zürich, and by the deutsche forschungsgemeinschaft. the costs of publication of this article were defrayed in part by the payment of page charges. this article must therefore be hereby marked "advertisement" in accordance with u.s.c. section solely to indicate this fact. received may , .
key: cord- -ewnjgps authors: strauss, james h; strauss, ellen g title: virus evolution: how does an enveloped virus make a regular structure? date: - - journal: cell doi: . /s - ( ) - sha: doc_id: cord_uid: ewnjgps
ancestral source. they propose a fit of e into the cryo-em density of semliki forest virus determined to Å resolution (mancini et al., ). the paper by pletnev et al. ( ) shows that the bulk of e does not contribute to the outer portions of the spike but, instead, forms a layer closely apposed to the lipid bilayer, analogous to the position of e in the flavivirion. they also show that e projects upward to the full length of the spike. thus, e forms what has been called the skirt that surrounds the lipid bilayer and part of the lower domains of the spikes, whereas e forms the projecting part of the spike. the absence of spikes in flaviviruses could then be due to a difference between the cleaved e and m. e is ~ amino acids in size, of which about residues form the ectodomain, whereas m is only about residues long, of which about residues are present in the ectodomain. thus, one can imagine that an immature flavivirion containing prm (about residues) rather than m might more resemble the alphavirus structure, with short projecting spikes, but cleavage to m removes the spikes.
the evolution of enveloped viruses
the parallels between the assembly of alphaviruses and flaviviruses and the similarities in structure revealed by the present studies suggest that an enveloped virus with an icosahedral structure arose long ago and has diverged into these two families. many other enveloped viruses whose structures are more or less known use quite different assembly mechanisms (
virus evolution virus taxonomy: seventh report of the international committee on taxonomy of viruses
key: cord- -alv uk authors: mellman, ira; simons, kai title: the golgi complex: in vitro veritas? date: - - journal: cell doi: . / - ( ) -a sha: doc_id: cord_uid: alv uk
iplex has proved to be among the more challenging problems in cell biology. the last several years have turned out to be particularly exciting in this respect since they have yielded new insights and ideas at an increasingly rapid pace. this period of advance has largely been due to the development of powerful new biochemical, morphological, and genetic approaches to unraveling the complexities of this organelle. while much remains to be discovered, the problem now is how to integrate this wealth of information. to see if this is possible, we will first summarize how the golgi is commonly believed to work and then evaluate the strength of the evidence that underlies these views.
present view of the golgi
the golgi complex is essentially a carbohydrate factory. in