key: cord-0759471-fydixc4b
authors: de Araujo, Talita Stelling; Barbosa, Glauce Moreno; Sanches, Karoline; Azevedo, Jéssica M.; dos Santos Cabral, Katia Maria; Almeida, Marcius S.; Almeida, Fabio C. L.
title: The (1)H, (15)N, and (13)C resonance assignments of the N-terminal domain of the nucleocapsid protein from the Middle East respiratory syndrome coronavirus
date: 2021-04-29
journal: Biomol NMR Assign
DOI: 10.1007/s12104-021-10027-6
sha: 47098cee6ef7ffb1e67db506a34d8170262cd27b
doc_id: 759471
cord_uid: fydixc4b

During the past 17 years, the coronaviruses have become a global public emergency, with the Middle East respiratory syndrome first appearing in Saudi Arabia in 2012. Among the structural proteins encoded in the viral genome, the nucleocapsid protein is the most abundant in infected cells. It is a multifunctional phosphoprotein involved in capsid formation and in the modulation and regulation of the viral life cycle. The N-terminal domain of the N protein specifically interacts with the transcriptional regulatory sequence (TRS) and is involved in discontinuous transcription through the melting activity of double-stranded TRS (dsTRS).

During the past 17 years, three coronavirus diseases, severe acute respiratory syndrome (SARS) in November 2002, Middle East respiratory syndrome (MERS) in April 2012, and more recently the coronavirus disease in December 2019, have become a global public emergency (Jiang et al. 2020; Singhal 2020). MERS-CoV was first identified in Saudi Arabia, when it was isolated from the lung of an adult patient who had been diagnosed with severe pneumonia and died of multiorgan failure (Nguyen et al. 2019). MERS-CoV, like SARS-CoV and SARS-CoV-2, is a member of the Coronaviridae family of the order Nidovirales. Its genome is a large single-stranded positive-sense RNA of 30 kb, which encodes four structural proteins: spike (S), envelope (E), membrane (M), and nucleocapsid (N). The N protein is the most abundant in infected cells (Carlson et al. 2020).
It is composed of two domains: the dimerization C-terminal domain (N-CTD) and the RNA-binding N-terminal domain (N-NTD). The N-CTD and N-NTD are linked by an intrinsically disordered region, which contains the Arg-Ser-rich region (SR region) and the phosphorylation site (Taskin Tok et al. 2017; Nguyen et al. 2019). This multifunctional phosphoprotein is involved in capsid formation and in the modulation and regulation of the viral life cycle. The N protein is directly involved in the discontinuous transcription process, acting as an RNA chaperone (Huang et al. 2004).

Coronaviruses are among the largest RNA viruses, and they undergo a unique discontinuous transcription of the viral RNA into subgenomic mRNAs (sgmRNAs). The leader transcriptional regulatory sequence (TRS-L) is found at the 5′ end of the genome, and the body transcriptional regulatory sequence (TRS-B) at the 5′ end of each subgenomic RNA. When the TRS-B is copied during transcription, the nascent negative-strand RNA is transferred to the TRS-L portion through a template switch, which completes the transcription process. The N-NTD domain of the N protein has been reported to interact specifically with the TRS and to catalyse the template switch through its dsTRS melting activity (Grossoehme et al. 2009).

* Fabio C. L. Almeida, falmeida@bioqmed.ufrj.br

Our group studies the mechanism of specific recognition and melting activity of the N-terminal domain of human betacoronaviruses (Caruso et al. 2020; de Luna Marques et al. 2021). This work is part of an international effort to combat the Covid-19 pandemic (https://covid19-nmr.de/; Altincekic et al. 2021). Here we report the ¹H, ¹⁵N, and ¹³C backbone and side-chain resonance assignments of the N-NTD domain of MERS-CoV without the SR region (N-NTD) and containing the SR region (N-NTD-SR).
These assignments are fundamental to obtaining structural information on the N-NTD and its Ser-Arg-rich region, which in turn will contribute to a better understanding of coronavirus diseases.

Two distinct constructs of the MERS-CoV N protein were synthesized. The first contained only the N-terminal domain, comprising residues 35 to 169 (N-NTD domain); the second included, in addition to the N-NTD domain, the Arg-Ser-rich sequence from residue 170 to 202 (N-NTD-SR domain). Both constructs were subcloned between the NdeI and BamHI restriction sites of plasmid pET28a by GenScript. Escherichia coli BL21 (DE3) was transformed with the pET28a constructs. One colony was picked and transferred to Luria Bertani (LB) medium. The bacteria were grown in minimal medium (M9) containing ¹⁵NH₄Cl (1 g/L) and ¹³C-glucose (3 g/L) for isotopic labeling and kanamycin (30 µg/mL) for bacterial selection. Protein expression was induced with 0.2 mM IPTG (isopropyl β-D-thiogalactoside) overnight at 18 °C.

Cells were centrifuged and the pellet was disrupted by ultrasonication in lysis buffer (50 mM Tris-HCl pH 8.0 containing 500 mM NaCl, 20 mM imidazole, 5% glycerol, 0.01 mg/mL DNase, and 5 mL of SigmaFast protease inhibitor cocktail tablet diluted to 1×). The lysate was centrifuged, and the supernatant was applied to a HisTrap FF column (GE Healthcare Life Sciences). The N-terminal domains of the N protein were purified by nickel affinity chromatography, using washing buffer A (50 mM Tris-HCl, 500 mM NaCl, 20 mM imidazole, pH 8.0) and buffer B (50 mM Tris-HCl, 500 mM NaCl, 500 mM imidazole, pH 8.0). For His-tag removal, the protein was cleaved overnight with TEV protease (TEV:protein 1:30 molar ratio) and the mixture was dialyzed against dialysis buffer (50 mM Tris-HCl pH 7.5, 0.5 mM EDTA, and 1 mM DTT). After dialysis, a second cycle of nickel affinity chromatography was performed to improve the purity of the N-terminal domain of the MERS N protein and to remove the tag cleaved by TEV.
The sample containing the protein was concentrated at 5000 × g for 10 min in an Amicon Ultra-15 (10,000 MWCO) in the presence of 0.5 mM PMSF. The buffer of the fractions containing the purified protein was exchanged by gel filtration chromatography (Superdex 75 column) using 20 mM sodium phosphate, 50 mM NaCl, 500 µM PMSF, 3 mM sodium azide, and 3 mM EDTA, pH 5.5. At the end of gel filtration, the sample was concentrated using an Amicon, and 0.5 mM PMSF, 3 mM EDTA, and 3 mM azide were added to the sample. The samples for NMR were in 20 mM sodium phosphate, 50 mM NaCl, 500 µM PMSF, and 3 mM sodium azide. For all NMR experiments, we added 5% (v/v) D₂O to the sample.

The triple resonance NMR spectra were acquired at 298 K on a Bruker 800 MHz AVANCE III spectrometer equipped with a pulsed-field Z-axis gradient triple-resonance probe. We assigned the backbone resonances of the ¹⁵N-¹H HSQC spectrum (Fig. 1) through the triple resonance experiments HNCO, HNCA, CBCA(CO)NH, HNCACB, and HBHA(CO)NH (Whitehead et al. 1997). We assigned the side-chain resonances through ¹³C-HSQC, (H)CCH-TOCSY, HCCH-TOCSY (Kay et al. 1993), and ¹⁵N- and ¹³C-NOESY-HSQC (for both aliphatic and aromatic regions) experiments. The NOESY spectra were acquired at 298 K on a Bruker 900 MHz AVANCE IIIHD spectrometer equipped with pulsed-field Z-axis gradient triple-resonance probes. For all experiments, we used the chemical shift of the water proton as an internal reference for ¹H, while ¹³C and ¹⁵N chemical shifts were referenced indirectly to water (Wishart et al. 1995).

For the triple resonance measurements, we used non-uniform sampling (NUS) of the NMR data based on a 13% Poisson gap sampling schedule (Hyberts et al. 2012). The iterative soft threshold method was used for the spectral reconstruction (Hyberts et al. 2012). We processed the data using the NMRPipe software (Delaglio et al. 1995) and analysed them with CCPNMR Analysis (Vranken et al. 2005), both available on NMRbox (Maciejewski et al. 2017).
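To illustrate what a 13% Poisson gap schedule looks like, the following minimal Python sketch generates a sine-weighted Poisson-gap schedule for one indirect dimension, in the spirit of the scheme described by Hyberts et al. It is not the schedule generator used for the spectra reported here: the function name `poisson_gap_schedule`, the grid size of 256 points, the seed, and the 2% adjustment step are all assumptions chosen for illustration.

```python
import math
import random


def poisson_gap_schedule(n_grid, n_samples, seed=0, max_tries=100_000):
    """Return increasing grid indices of a sine-weighted Poisson-gap schedule.

    Hypothetical helper for illustration; all parameters are assumptions.
    """
    if not 1 <= n_samples <= n_grid:
        raise ValueError("n_samples must be between 1 and n_grid")
    rng = random.Random(seed)

    def poisson(lam):
        # Knuth's multiplication method for a Poisson-distributed integer.
        limit = math.exp(-lam)
        k, p = 0, rng.random()
        while p > limit:
            k += 1
            p *= rng.random()
        return k

    # Start from a gap scale derived from the average spacing, then tune it
    # until a trial schedule contains exactly n_samples points.
    adj = 2.0 * (n_grid / n_samples - 1.0)
    for _ in range(max_tries):
        points, i = [], 0
        while i < n_grid:
            points.append(i)
            i += 1
            # The sine weight runs from ~0 to ~1 along the grid, so gaps are
            # short at early evolution times and longer towards the end.
            weight = math.sin((i + 0.5) / (n_grid + 1) * math.pi / 2)
            i += poisson(adj * weight)
        if len(points) == n_samples:
            return points
        # Too many points: widen the gaps; too few: narrow them.
        adj *= 1.02 if len(points) > n_samples else 1 / 1.02
    raise RuntimeError("no schedule with the requested size was found")


# 33 of 256 grid points corresponds to roughly 13% sampling density.
schedule = poisson_gap_schedule(256, 33)
print(len(schedule), schedule[:5])
```

The schedule always contains the first increment (index 0) and samples early increments, where the signal is strongest, more densely than late ones. The omitted points are what the iterative soft threshold reconstruction then has to recover.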
For the MERS-CoV N-NTD domain, we assigned 93.8% of the backbone nuclei (¹³Cα, ¹³CO, Hα, amide HN, and ¹⁵N). We have a total of 96.3% ¹³Cα and 92.8% Hα. For the ¹³CHₙ aliphatic side chain moieties of the protein, 67.1% of ¹³C and 70% of ¹H were assigned. For the ¹³CHₙ aromatic side chain moieties of the protein, 40% of ¹³C and 80.9% of ¹H were assigned. We assigned 98.3% Cβ and 94.5% Hβ. We assigned 123 amide ¹HN (95.1%), 136 ¹⁵N (86%) and 136 ¹³CO (86%).

For the MERS-CoV N-NTD-SR domain, we assigned 87.5% of the backbone nuclei (¹³Cα, ¹³CO, Hα, amide HN, and ¹⁵N). We have a total of 95.3% ¹³Cα and 86.9% Hα. For the ¹³CHₙ aliphatic side chain moieties of the protein, 57.7% of ¹³C and 60.8% of ¹H were assigned. For the ¹³CHₙ aromatic side chain moieties of the protein, 27.5% of ¹³C and 57.1% of ¹H were assigned. We assigned 96% Cβ and 84.9% Hβ. We assigned 156 amide ¹HN (89.7%), 171 ¹⁵N (81.9%) and 171 ¹³CO (80.1%).

From the resonance assignments, we could compare the chemical-shift-derived order parameter (S²), obtained from the random coil index (Berjanskii and Wishart 2005), and the secondary structure predicted by TalosN (Shen and Bax 2013). It is interesting to note subtle differences in backbone flexibility when the N-NTD and N-NTD-SR are compared. The Ser-Arg-rich region is flexible but contains a more ordered region around residue 183 (Fig. 2a). We observed secondary structure elements compatible with those reported by Papageorgiou et al. (2016) and a one-residue shift for β-strand 1 when N-NTD and N-NTD-SR are compared (Fig. 2b, c). Further studies are necessary to understand these structural and dynamical features.

Fig. 2 Protein dynamics and secondary structure prediction. a Random-coil index order parameter as a function of the residue number for MERS-CoV N-NTD (red) and N-NTD-SR (black). b TalosN secondary structure prediction of MERS-CoV N-NTD-SR as a function of residue number.
c TalosN secondary structure prediction of MERS-CoV N-NTD as a function of residue number. The predicted probabilities are shown in blue for helix and in green for extended structure (β-strand). At the top, the green rectangles represent the β-strands and the blue rectangles the helices, corresponding to the secondary structure in the crystal structure [PDB 6KL2 (Lin et al. 2020)].

Altincekic et al. (2021) Large-scale recombinant production of the SARS-CoV-2 proteome for high-throughput and structural biology applications
Berjanskii and Wishart (2005) A simple method to predict protein flexibility using secondary chemical shifts
Carlson et al. (2020) Phosphoregulation of phase separation by the SARS-CoV-2 N protein suggests a biophysical basis for its dual functions
Caruso et al. (2020) Dynamics of the N-terminal domain of SARS-CoV-2 nucleocapsid protein drives dsRNA melting in a counterintuitive tweezer-like mechanism
de Luna Marques et al. (2021) ¹H, ¹⁵N and ¹³C resonance assignments of the N-terminal domain of the nucleocapsid protein from the endemic human coronavirus HKU1
Delaglio et al. (1995) NMRPipe: a multidimensional spectral processing system based on UNIX pipes
Grossoehme et al. (2009) Coronavirus N protein N-terminal domain (NTD) specifically binds the transcriptional regulatory sequence (TRS) and melts TRS-cTRS RNA duplexes
Huang et al. (2004) Structure of the N-terminal RNA-binding domain of the SARS CoV nucleocapsid protein
Hyberts et al. (2012) Application of iterative soft thresholding for fast reconstruction of NMR data non-uniformly sampled with multidimensional Poisson Gap scheduling
Jiang et al. (2020) Review of the clinical characteristics of Coronavirus disease 2019 (COVID-19)
Kay et al. (1993) A gradient-enhanced HCCH-TOCSY experiment for recording side-chain ¹H and ¹³C correlations in H₂O samples of proteins
Lin et al. (2020) Structure-based stabilization of non-native protein-protein interactions of coronavirus nucleocapsid proteins in antiviral drug design
Maciejewski et al. (2017) NMRbox: a resource for biomolecular NMR computation
Nguyen et al. (2019) Structure and oligomerization state of the C-terminal region of the Middle East respiratory syndrome coronavirus nucleoprotein
Papageorgiou et al. (2016) Structural characterization of the N-terminal part of the MERS-CoV nucleocapsid by X-ray diffraction and small-angle X-ray scattering
Shen and Bax (2013) Protein backbone and sidechain torsion angles predicted from NMR chemical shifts using artificial neural networks
Singhal (2020) A review of coronavirus disease-2019 (COVID-19)
Taskin Tok et al. (2017) Structures and functions of coronavirus proteins: molecular modeling of viral nucleoprotein. International Journal of Virology & Infectious Diseases
Vranken et al. (2005) The CCPN data model for NMR spectroscopy: development of a software pipeline
Whitehead et al. (1997) Double and triple resonance NMR methods for protein assignment
Wishart et al. (1995) ¹H, ¹³C and ¹⁵N chemical shift referencing in biomolecular NMR

1 National Center of Nuclear Magnetic Resonance (CNRMN),

Acknowledgements This work was supported by Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro (FAPERJ) and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq). The authors KMSC and TSA gratefully acknowledge the post-doctoral fellowship and financial support from FAPERJ (grants E-26/260.005/2020 to TSA and E-26/260.002/2020 to KMSC). We acknowledge the National Center of Nuclear Magnetic Resonance (CNRMN) and the Protein Advance Biochemistry platform (PAB). We also acknowledge the Covid-19 NMR Consortium (https://covid19-nmr.de/) for providing an excellent environment for scientific discussions.

The authors declare that there are no conflicts of interest.