key: cord-287349-1zcq7kzx
authors: Chen, James; Malone, Brandon; Llewellyn, Eliza; Grasso, Michael; Shelton, Patrick M.M.; Olinares, Paul Dominic B.; Maruthi, Kashyap; Eng, Ed T.; Vatandaslar, Hasan; Chait, Brian T.; Kapoor, Tarun; Darst, Seth A.; Campbell, Elizabeth A.
title: Structural basis for helicase-polymerase coupling in the SARS-CoV-2 replication-transcription complex
date: 2020-07-28
journal: Cell
DOI: 10.1016/j.cell.2020.07.033
sha:
doc_id: 287349
cord_uid: 1zcq7kzx

Summary

SARS-CoV-2 is the causative agent of the 2019-2020 pandemic. The SARS-CoV-2 genome is replicated and transcribed by the RNA-dependent RNA polymerase holoenzyme (subunits nsp7/nsp8₂/nsp12) along with a cast of accessory factors. One of these factors is the nsp13 helicase. Both the holo-RdRp and nsp13 are essential for viral replication and are targets for treating the disease COVID-19. Here we present cryo-electron microscopic structures of the SARS-CoV-2 holo-RdRp with an RNA template-product in complex with two molecules of the nsp13 helicase. The Nidovirus-order-specific N-terminal domains of each nsp13 interact with the N-terminal extension of each copy of nsp8. One nsp13 also contacts the nsp12-thumb. The structure places the nucleic acid-binding ATPase domains of the helicase directly in front of the replicating-transcribing holo-RdRp, constraining models for nsp13 function. We also observe ADP-Mg2+ bound in the nsp12 N-terminal nidovirus RdRp-associated nucleotidyltransferase domain, detailing a new pocket for anti-viral therapeutic development.

Nsp13 is an SF1B helicase, which translocates on single-stranded nucleic acid in the 5'->3' direction (Saikrishnan et al., 2009). In vitro studies confirm this direction of translocation for the nidovirus helicases (Adedeji et al., 2012; Bautista et al., 2002; Seybert et al., 2000a, 2000b; Tanner et al., 2003).
Unless the interaction of nsp13 with the holo-RdRp alters the unwinding polarity, which seems unlikely, the structural arrangement observed in the nsp13-RTC (Figure 2D) [...] (Figure 5B). In place of the bridge helix, the viral RdRp has conserved motif F [SARS-CoV-2 nsp12 residues 544-555; (Bruenn, 2003)], which comprises a β-hairpin loop. Motif F directs the t-RNA to the top, while underneath motif F is a channel that appears able to accommodate single-stranded nucleic acid. The analogous structural arrangement leads us to propose that the SARS-CoV-2 RdRp may backtrack, generating a single-stranded RNA segment at the 3'-end that would extrude out the RdRp secondary channel [...] (Table S1; Video S1). In the structure, the primary interaction determinant of the helicase with the RTC occurs between the nsp13-ZBDs and the nsp8-extensions. Both of these structural elements are unique to nidoviruses, and the interaction interfaces are conserved within α- and β-CoV genera (Figure 3), indicating that this interaction represents a crucial facet of SARS-CoV-2 replication/transcription. A protein-protein interaction analysis for the SARS-CoV-1 ORFeome (which recapitulates the nsp13-RTC interactions observed in our structure) identified nsp8 as a central hub for viral protein-protein interactions (Brunn et al., 2007). The structural architecture of nsp8a and nsp8b, with their long N-terminal helical extensions, provides a large binding surface for the association of an array of replication/transcription factors (Figure 2C, E).

Our structure reveals ADP-Mg2+ occupying the NiRAN domain active site (Figure 4C), presumably because the sample was incubated with ADP-AlF3 prior to grid preparation.
The ADP makes no base-specific interactions with the protein; nsp12-NiRAN-H75 forms a cation-π interaction with the adenine base (Figure 4C), but this interaction is not expected to be strongly base-specific, and structural modeling does not suggest obvious candidates for base-specific interactions. The position corresponding to H75 in the NiRAN domain A_N alignment is not conserved (Figure 4A), suggesting that i) this residue is not a determinant of base-specificity for the NiRAN domain active site, ii) the NiRAN domain base-specificity varies among different nidoviruses, or iii) NiRAN domains in general do not show base-specificity in their activity. The NiRAN domain of the EAV-RdRp appeared to prefer U or G for its activity (Lehmann et al., 2015a). We note that the NiRAN domain enzymatic activity is essential for viral propagation but its target is unknown (Lehmann et al., 2015a). Further experiments will be required to understand more completely the NiRAN domain activity, its preferred substrate, and its in vivo targets, and these may vary among different nidoviruses. Our results provide i) a structural basis for biochemical, biophysical, and genetic experiments to investigate these questions, and ii) a platform for anti-viral therapeutic development.

Our analysis comparing the viral RdRp with cellular DdRps revealed a remarkable structural similarity at the polymerase active sites: immediately downstream of each polymerase active site is a conserved structural element that divides the active site cleft into two compartments, directing the downstream nucleic acid template into one compartment and [...] In the cellular DdRps, the conserved structural element that divides the active site cleft is the bridge helix (Lane and Darst, 2010), and the secondary channel serves to allow NTP substrates to access the DdRp active site and to also accommodate the single-stranded 3'-RNA fragment generated during backtracking (Figure 5A).

In the viral RdRp, the downstream strand-separating structural element is the motif F β-hairpin loop. As for multisubunit DdRps, the RdRp secondary channel is perfectly positioned to accommodate backtracked RNA (Figure 5B). Based on this structural analogy, we propose that the viral RdRp may undergo backtracking and that the single-stranded 3'-RNA fragment so generated would extrude out the viral RdRp secondary channel (Figure 5B). We note that backtracking of Φ6 and poliovirus RdRps has been observed experimentally (Dulin et al., 2015, 2017).

Ignoring sequence variation, the energetics of backtracking by the cellular DdRps are close to neutral since the size of the melted transcription bubble and the length of the RNA/DNA hybrid in the active site cleft are maintained (any base pairs disrupted by backtracking are recovered somewhere else). For the SARS-CoV-2 RdRp, the arrangement of single-stranded and duplex nucleic acids during replication/transcription in vivo is not known, but in vitro the RdRp synthesizes p-RNA from a single-stranded t-RNA, resulting in a persistent upstream p-RNA/t-RNA hybrid.
In this case backtracking is energetically disfavored since it only shortens the product RNA duplex without recovering duplex nucleic acids somewhere else. However, our structural analysis of the nsp13-RTC indicates that nsp13.1 can engage with the downstream single-stranded t-RNA (Figures 2D, S6C). Translocation of the helicase on this RNA strand would proceed in the 5'->3' direction, in opposition to the 3'->5' translocation of the RdRp on the same RNA strand. This aspect of helicase function could provide the NTP-dependent motor activity necessary to backtrack the RdRp. In cellular organisms, DdRp backtracking plays important roles in many processes, including the control of pausing during transcription elongation, termination, DNA repair, and fidelity (Nudler, 2012). Two potential roles for backtracking in SARS-CoV-2 replication/transcription are 1) fidelity and 2) template-switching during sub-genomic transcription.

Backtracking by the cellular DdRps is favored when base pairing in the RNA/DNA hybrid is weakened by a misincorporated nucleotide in the RNA transcript (Nudler et al., 1997). [...]

The efficiency with which the holo-RdRp can negotiate downstream obstacles to elongation is unknown. Our structure suggests that the nsp13 helicase could act in the 5'->3' direction on the t-RNA to disrupt stable RNA secondary structures or downstream RNA binding proteins (Figure 6B), both of which could be significant impediments to RNA elongation. The helicase may function in this role distributively in order to avoid interfering with RdRp translocation. Alternatively, in the case of a fully duplex RNA template, the helicase could act processively to unwind the downstream duplex RNA, much like replicative helicases, such as DnaB in Escherichia coli, processively unwind the DNA duplex in front of the replicative DNA polymerase (Kaplan and O'Donnell, 2002).
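
The base-pair bookkeeping behind the backtracking energetics argument above can be written out as a toy calculation. This is only an illustrative sketch, not part of the study: the function name `net_base_pair_change` and its parameters are hypothetical, every base pair is treated as equivalent (the text's "ignoring sequence variation"), and no real free energies are modeled.

```python
def net_base_pair_change(steps: int, pairs_recovered_elsewhere: bool) -> int:
    """Net change in the number of base pairs after backtracking by `steps` nt.

    Toy model (assumption): each backtracked nucleotide disrupts exactly one
    product/template base pair. If the polymerase works inside a maintained
    bubble (the cellular DdRp case described in the text), one base pair is
    recovered elsewhere per step; on a single-stranded template (the in vitro
    RdRp case), nothing is recovered.
    """
    if steps < 0:
        raise ValueError("steps must be non-negative")
    lost = steps
    recovered = steps if pairs_recovered_elsewhere else 0
    return recovered - lost


if __name__ == "__main__":
    for n in (1, 3, 5):
        ddrp = net_base_pair_change(n, pairs_recovered_elsewhere=True)
        rdrp = net_base_pair_change(n, pairs_recovered_elsewhere=False)
        print(f"backtrack {n} nt: DdRp bubble {ddrp:+d} bp, "
              f"RdRp on ssRNA template {rdrp:+d} bp")
```

In this simplified accounting the DdRp case stays at zero for any backtrack length, while the in vitro RdRp case loses one base pair per step, which is the sense in which the text calls RdRp backtracking energetically disfavored without an external motor such as the helicase.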

Finally, CoV transcription includes a discontinuous step during the production of sub-genomic RNAs (sg-transcription; Figure 6C) that involves a remarkable template-switching step unique to nidoviruses (Sawicki and Sawicki, 1998). The process produces sg-RNAs that are 5'- and 3'-co-terminal with the virus genome. In this process, transcription initiates from the 3'-poly(A) tail of the +-strand RNA genome [cyan RNA in Figure 6C [...]

The oligonucleotides used in this study are listed in Table S2. All constructs were verified by sequencing (GeneWiz).

Nsp7/8. The coding sequences of the E. coli codon-optimized SARS-CoV-2 nsp7 and nsp8 genes (gBlocks from Integrated DNA Technologies) were cloned into a pCDFDuet-1 vector (Novagen). Nsp7 bore an N-terminal His6-tag that was cleavable with PreScission protease (GE Healthcare Life Sciences). [...]

[...] (Figure S3D) for the nsp13₂-RTC particles indicated that the map (and resolution estimations) were corrupted by severe particle orientation bias.

Nsp13-RTC (CHAPSO). The entire dataset consisted of 4,358 motion-corrected images with 1,447,307 particles (Figure S4A). Particles were sorted using cryoSPARC 2D classification (N=100), resulting in 344,953 curated particles. Initial models (Seed 1: complex, Seed 2: decoy 1, Seed 3: decoy 2) were generated using cryoSPARC Ab initio Reconstruction on a subset of the particles (10,509 particles from first 903 images). Particles were further curated using Seeds 1-3 as 3D templates for cryoSPARC Heterogeneous Refinement (N=3), then re-extracted with a box size of 320 px, followed by another round of Heterogeneous Refinement (N=3) using Seed 1 as a template. The resulting 91,058 curated particles were sorted into three classes using cryoSPARC Heterogeneous Refinement (N=3). Each class was further sorted using cryoSPARC Ab initio Reconstruction (N=3) to separate distinct 3D classes.
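
The particle counts quoted above for the nsp13-RTC (CHAPSO) dataset can be summarized as retention fractions at each curation stage. The following is a small arithmetic sketch using only the numbers reported in the text; the stage labels and the helper name `retention` are illustrative choices, not terms from cryoSPARC or from the study.

```python
# Counts reported in the text for the nsp13-RTC (CHAPSO) dataset.
N_IMAGES = 4_358
STAGES = [
    ("picked particles", 1_447_307),
    ("after 2D classification", 344_953),
    ("after heterogeneous refinement", 91_058),
]


def retention(stages):
    """Return (label, count, fraction of first stage, fraction of previous stage)."""
    first = stages[0][1]
    previous = first
    rows = []
    for label, count in stages:
        rows.append((label, count, count / first, count / previous))
        previous = count
    return rows


if __name__ == "__main__":
    print(f"mean particles per image: {STAGES[0][1] / N_IMAGES:.1f}")
    for label, count, of_first, of_previous in retention(STAGES):
        print(f"{label:32s} {count:>9,d}  "
              f"{of_first:6.1%} of picked  {of_previous:6.1%} of previous stage")
```

Run as a script, this shows that roughly a quarter of the picked particles survive 2D classification and about 6% of the picked particles remain in the curated set used for multi-reference classification.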
Using these classes as references for Heterogeneous Refinement (N=6), multi-reference classification was performed on the 91,058 curated particles. Classification revealed three unique classes: (1) nsp13-RTC, (2) nsp13₂-RTC, (3) (nsp13₂-RTC)₂. Particles within each class were further processed using [...]

Structural Basis of Transcription: RNA Polymerase Backtracking and Its
PHENIX: a comprehensive Python-based system for macromolecular structure solution
Mechanism of Nucleic Acid Unwinding by SARS-CoV Helicase
Coronavirus Susceptibility to the Antiviral Remdesivir Is Mediated by the Viral Polymerase and the Proofreading Exoribonuclease
Transcription Regulatory Sequences and mRNA Expression Levels in the Coronavirus Transmissible Gastroenteritis Virus
Functional Properties of the Predicted Helicase of Porcine Reproductive and Respiratory Syndrome Virus
The global phosphorylation landscape of SARS-CoV-2 infection
RNA 3'-end mismatch excision by the severe acute respiratory syndrome coronavirus nonstructural protein nsp10/nsp14 exoribonuclease complex
A structural and primary sequence comparison of the viral RNA-dependent RNA polymerases
One number does not fit all: mapping local variations in resolution in cryo-EM reconstructions
Aluminofluoride and beryllofluoride complexes: new phosphate analogs in enzymology
Eliminating effects of particle adsorption to the air/water interface in single-particle cryo-electron microscopy: Bacterial RNA polymerase and CHAPSO
MolProbity: all-atom structure validation for macromolecular crystallography
AAA protein spastin using active site mutations
Structural basis for the regulatory function of a complex zinc-binding domain in a replicative arterivirus helicase resembling a nonsense-mediated mRNA decay helicase
Coronaviruses: An RNA proofreading machine regulates replication fidelity and diversity
Molecular dynamics simulations related to SARS-CoV-2.
D.E. Shaw Research Technical Data
The Predicted Metal Binding Region of the Arterivirus Helicase Protein Is Involved in Subgenomic mRNA Synthesis, Genome Replication, and Virion Biogenesis
A novel protein kinase-like domain in a selenoprotein, widespread in the tree of life
Backtracking behavior in viral RNA-dependent RNA polymerase provides the basis for a second initiation site
Signatures of Nucleotide Analog Incorporation by an RNA-Dependent RNA Polymerase Revealed Using High-Throughput Magnetic Tweezers. Cell
Hepatitis Virus Replication Is Decreased in nsp14 Exoribonuclease Mutants
Infidelity of SARS-CoV Nsp14-exonuclease mutant virus replication is revealed by complete genome sequencing
Coot: model-building tools for molecular graphics
Biochemical Aspects of Coronavirus Replication and Virus-Host Interaction
Promoting elongation with transcript cleavage stimulatory factors
Nidovirales: Evolving the largest RNA virus genome
Remdesivir is a direct-acting antiviral that inhibits RNA-dependent RNA polymerase from severe acute respiratory syndrome coronavirus 2 with high potency
Virus Taxonomy. In Family Coronaviridae
Crystal structure of Middle East respiratory syndrome coronavirus helicase
From SARS to MERS: 10 years of research on highly pathogenic human coronaviruses
Structure of replicating SARS-CoV-2 polymerase
Dali server update
Human Coronavirus 229E Nonstructural Protein 13: Characterization of Duplex-Unwinding, Nucleoside Triphosphatase, and RNA 5′-Triphosphatase
Enzymatic Activities Associated with Severe Acute Respiratory Syndrome Coronavirus Helicase
Delicate structural coordination of the Severe Acute Respiratory Syndrome coronavirus Nsp13 upon ATP hydrolysis
Structural basis of transcription arrest by coliphage HK022 nun in an
DnaB Drives DNA Branch Migration and Dislodges
Structure of the SARS-CoV nsp12 polymerase bound to nsp7 and nsp8 co-factors
Transcriptional arrest: Escherichia coli RNA polymerase translocates backward, leaving the 3' end of the RNA intact and extruded
RNA polymerase switches between inactivated and activated states By translocating back and forth along the DNA and the RNA
Molecular Evolution of Multisubunit RNA Polymerases: Structural Analysis
Cooperative translocation enhances the unwinding of duplex DNA by SARS coronavirus helicase nsP13
Discovery of an essential nucleotidylating activity associated with a newly delineated conserved domain in the RNA polymerase-containing protein of all nidoviruses
What we know but do not understand about nidovirus helicases
The EMBL-EBI search and sequence analysis tools APIs in 2019
Bayesian deconvolution of mass and ion mobility spectra: from binary interactions to polydisperse ensembles
Discovery of an RNA virus 3'->5' exoribonuclease that is critically involved in coronavirus RNA synthesis
Collaboration gets the most out of software
Discovery of the first insect nidovirus, a missing evolutionary link in the emergence of the largest RNA virus genomes
Cell the register of transcription by preventing backtracking of RNA polymerase
Structure and function of the transcription elongation factor GreB bound to bacterial RNA polymerase
Sequence requirements for RNA strand transfer during nidovirus discontinuous subgenomic RNA synthesis
Analyzing resistance to design selective chemical inhibitors for AAA proteins
cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination
High-Throughput Deconvolution of Native Mass Spectra
A planarian nidovirus expands the limits of RNA genome size
Mechanistic basis of 5'-3' translocation in SF1B helicases
Advances in Experimental Medicine and Biology
RELION: implementation of a Bayesian approach to cryo-EM structure determination
Sequence logos: a new way to display consensus sequences
Biochemical Characterization of the Equine Arteritis Virus Helicase Suggests a Close Functional
The human coronavirus 229E superfamily 1 helicase has RNA and DNA duplex-unwinding activities with 5′-to-3′ polarity
Remdesivir and SARS-CoV-2: structural requirements at both nsp12 RdRp and nsp14 Exonuclease active-sites
Structure and Mechanism of Helicases and Nucleic Acid Translocases
Coronaviruses lacking exoribonuclease activity are susceptible to lethal mutagenesis: evidence for proofreading and potential therapeutics
Thinking Outside the Triangle: Replication
Coronavirus RNA Synthesis and Processing
Continuous and Discontinuous RNA Synthesis in Coronaviruses
Protein AMPylation by an Evolutionarily Conserved
One severe acute respiratory syndrome coronavirus protein complex integrates processive RNA polymerase and exonuclease activities
The Severe Acute Respiratory Syndrome (SARS) Coronavirus Belongs to a Distinct Class of 5′ to 3′ Viral Helicases
Transcriptional Fidelity and Proofreading by
Identification and Characterization of a Human Coronavirus 229E Nonstructural Protein Associated RNA 3'-Terminal Adenylyltransferase Activity
Structural basis of transcription: backtracked RNA polymerase II at 3.4 angstrom resolution
Structural Basis for RNA Replication by the SARS-CoV-2 Polymerase
Therapeutic efficacy of the small molecule GS-5734 against Ebola virus in rhesus monkeys
Structural Basis of Transcription
Nonstructural proteins 7 and 8 of feline coronavirus form a 2:1 heterotrimer that exhibits primer-independent RNA polymerase activity
Structural basis for inhibition of the RNA-dependent RNA polymerase from SARS CoV-2 by remdesivir
Insights into SARS-CoV transcription and replication from the structure of the nsp7-nsp8 hexadecamer
Crystal structure of Thermus aquaticus core RNA polymerase at 3.3 A resolution
The nsp1, nsp13, and M proteins contribute to the hepatotropism of murine coronavirus JHM.WU
MotionCor2: anisotropic correction of beam-induced motion for improved cryo-electron microscopy
A pneumonia outbreak associated with a new coronavirus of probable bat origin
An insect nidovirus emerging from a primary tropical rainforest
New tools for automated high-resolution cryo-EM structure determination in RELION-3
Sequence Motifs Involved in the Regulation of Discontinuous Coronavirus Subgenomic RNA Synthesis

[...] RTC) with nsp13 helicases
• The nsp13 NTPase domains sit in front of the RTC, constraining functional models
• Nsp13 may drive RTC backtracking, thus impacting proofreading and template-switching
• Structural analysis of ADP-Mg2+-bound NiRAN domain, a potential antiviral target

In brief

Chen et al. present cryo-EM structures of the SARS-CoV-2 RNA-dependent RNA polymerase (RdRp) holoenzyme (nsp7/nsp8/nsp12) containing an RNA template-product in complex with the viral helicase (nsp13). The work provides insight into the assembly and function of the multi-subunit protein machine and how [...]

We thank A. Aher, J. Berger, R. Landick and C. Rice for helpful