key: cord-300194-nsp53lv6 authors: Rath, Soumya Lipsa; Kumar, Kishant title: Investigation of the effect of temperature on the structure of SARS-Cov-2 Spike Protein by Molecular Dynamics Simulations date: 2020-06-19 journal: bioRxiv DOI: 10.1101/2020.06.10.145086 sha: doc_id: 300194 cord_uid: nsp53lv6
Statistical and epidemiological data imply temperature sensitivity of the SARS-CoV-2 coronavirus. However, a molecular-level understanding of the virus structure at different temperatures is still lacking. The Spike protein is the outermost structural protein of the SARS-CoV-2 virus; it interacts with Angiotensin Converting Enzyme 2 (ACE2), a human receptor, allowing the virus to enter the respiratory system. In this study, we performed all-atom molecular dynamics simulations to study the effect of temperature on the structure of the Spike protein. After 200 ns of simulation at different temperatures, we came across some interesting phenomena exhibited by the protein. We found that the solvent-exposed domain of the Spike protein, namely S1, is more mobile than the transmembrane domain, S2. Structural studies implied the presence of several charged residues on the surface of the N-terminal Domain of S1 which are optimally oriented at 10-30 °C. Bioinformatics analyses indicated that the N-terminal Domain is capable of binding to other human receptors and should not be disregarded. Additionally, we found that the receptor binding motif (RBM), present on the receptor binding domain (RBD) of S1, begins to close at around 40 °C and attains a completely closed conformation at 50 °C. The closed conformation buries the receptor binding residues and thereby disables binding to ACE2. Our results clearly show that there are active and inactive states of the protein at different temperatures. This would not only prove beneficial for understanding the fundamental nature of the virus, but would also be useful in the development of vaccines and therapeutics.
Graphical Abstract
Highlights
Statistical and epidemiological evidence shows that external climatic conditions influence SARS-CoV infectivity, but we still lack a molecular-level understanding of this effect. Here, we study the influence of temperature on the structure of the Spike glycoprotein, the outermost structural protein of the virus, which binds to the human receptor ACE2. Results show that the Spike's S1 domain is very sensitive to external atmospheric conditions compared with the S2 transmembrane domain. The N-terminal domain comprises several solvent-exposed charged residues that are capable of binding to human proteins. The region is specifically stable at temperatures of around 10-30 °C. The Receptor Binding Motif adopts a closed conformation at 40 °C and completely closes at higher temperatures, making it unsuitable for binding to human receptors.
Severe Acute Respiratory Syndrome Coronavirus 2, or SARS-CoV-2, attacks the cells of the human respiratory system. Recent studies have found that the virus also interacts with the cells of the digestive system, renal system, liver, pancreas, eyes and brain [1]. It is known to cause severe sickness and is fatal in many cases [2]. It is believed that the virus originated in bats, which act as the natural reservoir, and was subsequently transmitted to humans. It then gradually spread across almost all nations through airborne transmission, resulting in one of the worst known global pandemics of this century [3]. SARS-CoV-2 is one of the seven coronaviruses that affect the human population. The other known coronaviruses include HCoV-229E, HCoV-OC43, SARS-CoV, HCoV-NL63, HCoV-HKU1 and MERS-CoV [4, 5]. The infections they cause range from the common cold to SARS, MERS or Covid19 [5]. These viruses have been observed to affect the human population predominantly during a particular season.
For instance, the 2002 SARS infections began during the cold winter month of November and, after eight months, the number of reported cases became almost negligible [5]. Statistics show that countries with hot and humid weather conditions had fewer infectious cases of SARS [6]. However, MERS-CoV, which was identified in the Middle East, affected individuals during the summer [5]. Thus, the disease epidemiology suggests that each virus is prominent only in certain climatic conditions. The viability of SARS-CoV-2 on different surfaces was measured by Chin et al., who found that virus droplets survived at low temperature but were quickly deactivated at elevated temperatures [7]. The virus remains viable longer on smooth surfaces, plastics and iron than on paper, tissue, wood or cloth. Surgical masks had detectable virus even on the 7th day [8, 9]. Soaps and disinfectants, which disintegrate the virus membrane and structural proteins, are a potent example of how the modulation of atmospheric conditions can affect virus viability. Statistical reports by Cai et al. and several others have shown that tropical countries like Malaysia, Indonesia or Thailand, with high temperature and high relative humidity, did not have major community outbreaks of SARS [6, 10, 11]. Although viruses cannot be killed like bacteria by autoclaving, the temperature sensitivity of viruses has been reported several times in the past. Seasonal rhinoviruses could not replicate at warm temperature, whereas a cooler range is ideal for their survival in the nasal cavity [12]. Influenza was found to be effective at a lower temperature, whereas higher temperatures resulted in clumping of viruses on cell surfaces [13, 14, 15]. Similarly, the SARS virus remained viable for 5 days at lower temperature and humidity, but this viability was lost when the temperature was raised at 95% humidity [6].
When the virus is exposed to different temperature conditions, its first contact with the surrounding atmosphere occurs at the structural proteins. There are four major structural proteins present on the virus: the Spike glycoprotein, the Envelope protein, the Membrane protein and the Nucleocapsid. Each of the proteins performs specific functions in receptor binding, viral assembly and genome release [16]. One of the first and largest structural proteins of the coronavirus is the Spike glycoprotein [17]. The protein exists as a homotrimer in which each monomer consists of 1273 amino acid residues (Figure 1) and the monomers are intertwined with each other. Each monomer has two domains, namely S1 and S2 [18]. The S1 and S2 domains are cleaved at a furin site by a host cell protease [18, 19]. The S1 domain lies predominantly above the lipid bilayer. The S2 domain, which is a class I transmembrane domain, travels across the bilayer and ends towards the inner side of the lipid membrane [18]. Figure 1 shows the two domains of the Spike glycoprotein. The S1 domain comprises mostly beta-pleated sheets. It can be further divided into the Receptor Binding Domain (RBD) and the N-terminal Domain (NTD). The RBD binds to Angiotensin Converting Enzyme 2 (ACE2) on the host cells [20]. It lies on the top of the complex, where around 14 residues of the RBD bind to the host ACE2 receptor [21, 22]. The NTD is the outermost domain; it is relatively more exposed and lies on the three sides, giving a triangular shape to the protein when viewed from the top (Figure 1). The NTD has a galectin fold and is known to bind to sugar moieties [21]. The S2 domain, on the other hand, is a transmembrane region with strong interchain bonding between the residues. It is mostly α-helical and forms a triangle when viewed from the bottom, though there is no overlapping of the top and bottom triangles.
Temperature is a very significant variable for proteins because proteins respond differently to high and low temperature conditions. Many proteins have high thermal stability, while others can unfold or even denature at high temperatures [23, 24]. In November 2019, when the first outbreak of Covid19 was reported, temperatures in Wuhan, China were comparatively cool in the morning and at night, whereas tropical countries such as India, where a large number of cases still persist, recorded much higher temperatures [25, 26]. Although statistical and experimental evidence shows that temperature influences the activity and virulence of the virus, we still lack an understanding of the molecular-level changes that take place in the virus under different weather conditions. To date, there is no concrete evidence on whether atmospheric conditions actually influence the structure of the virus. Here, by using all-atom molecular dynamics (MD) simulations, we explore the dynamics of the Spike glycoprotein of SARS-CoV-2 at different temperatures. This is the first molecular study of the environmental influence on the protein structure. Results suggest that the S1 domain is more flexible than S2. In the S1 domain, we observed the sensitivity of the receptor binding motif to different temperatures. We also found that the N-terminal domain of the protein has the potential to bind to different human receptors. The study will not only help us understand the nature of the virus but will also be useful for designing effective therapeutic strategies.
The experimentally determined structure of the Spike glycoprotein (PDB: 6VXX) was found to have 871 missing residues. Thus, for our study we used the complete model of the trimeric Spike protein generated by Zhang et al., which had a template modeling (TM) score of 0.6 [27]. The model was devoid of the N-acetyl glucosamine (NAG) sugar moieties which are known to bind and stabilize the protein.
The envelope lipid bilayer was not considered in this work, to avoid a large system size in the atomistic simulations. After initial minimization and equilibration, we generated five different systems with temperatures ranging from 10 °C to 50 °C at intervals of 10 degrees. This was done to maintain the uniformity of the simulations, so that temperature was the only variable that differed. In addition, a temperature of 70 °C was also imposed on the system to observe any possible deformation in the structure of the Spike protein, although this high temperature does not realistically imitate environmental conditions (Table S1). The production run of 200 ns was carried out in the isothermal-isobaric (NPT) ensemble. After performing 200 ns of classical molecular dynamics simulation, the root mean square deviation (RMSD) of the trajectory with respect to the starting structure was calculated to check whether the systems had attained stability. Figure S1 shows the complete RMSD of all the systems at different temperatures. It can be seen that stability was attained early in the simulation time, indicating that the systems are well equilibrated. The RMSD values lie within a similar range for all the systems, with one exception where a marginally higher RMSD was seen after 100 ns of simulation time. At some temperatures, a small rise in the RMSD curves was observed after 100 ns of simulation time.
This implies that the Spike protein was more stable at 10 °C and 50 °C. Since the protein comprises two distinct domains, S1 and S2, we checked the RMSD of the S1 and S2 domains individually, with respect to the starting structure, to understand the cause of the higher RMSD values observed at the intermediate temperatures (Figure 2). The RMSD values of the S1 domain at these temperatures were higher than those in the simulations at 10 °C and 50 °C. A similar trend was observed in the RMSD of the S2 domain, but the difference in values was only marginal. Although in this study we have not considered the bilayer lipid membrane of the SARS-CoV-2 envelope inside which the Spike glycoprotein resides, the S2 domain shows remarkable stability in its RMSD values (Figure 2). The stability of the S2 domain can be attributed to the strong interchain interactions within the highly α-helical S2 domain. Since the Spike protein is a homotrimer, the S1 domain of each individual chain was also checked to account for the difference in fluctuations. Figure 2 (c)-(e) shows the RMSD of the S1 domain of chains A, B and C at different temperatures. In chain A, the RMSD is clearly quite high at some temperatures, including 30 °C, whereas at others the fluctuations are negligible and the system is very stable. Similarly, chain B was stable at certain temperatures, including 10 °C. In the S1 domain of chain C, the RMSD values were found to be quite low along the length of the simulation time at all temperatures except 40 °C. The above data indicate that the protein chains, especially the S1 domains, are quite flexible at around 20-40 °C in comparison with the low temperature of 10 °C or the high temperature of 50 °C. Irrespective of the presence of the bilayer membrane, the stalk of the Spike protein remains stable under the different temperature conditions.
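The RMSD comparison discussed above can be illustrated with a minimal sketch. It assumes that every frame has already been least-squares fitted onto the starting structure and that coordinates are given in nm. The function names and the toy coordinates are hypothetical and are not taken from the study, which relied on the GROMACS analysis tools.

```python
import math


def rmsd(frame, reference):
    """RMSD (same units as the input) between two pre-aligned coordinate sets.

    Each argument is a sequence of (x, y, z) tuples, one per atom.
    """
    if len(frame) != len(reference) or not frame:
        raise ValueError("coordinate sets must be non-empty and equally sized")
    squared_sum = 0.0
    for (x, y, z), (rx, ry, rz) in zip(frame, reference):
        squared_sum += (x - rx) ** 2 + (y - ry) ** 2 + (z - rz) ** 2
    return math.sqrt(squared_sum / len(frame))


def rmsd_series(trajectory, reference):
    """RMSD of every frame of a trajectory with respect to one reference."""
    return [rmsd(frame, reference) for frame in trajectory]


# Toy example (hypothetical data): two atoms, second frame shifted along z.
start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 0.3), (1.0, 0.0, 0.3)]
series = rmsd_series([start, shifted], start)
```

Restricting the atom list to the residues of S1, S2 or a single chain before calling `rmsd_series` gives the per-domain and per-chain curves of the kind compared in the text.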
Domain flexibility of S1 is more pronounced
In order to identify the region on the Spike protein that causes the deviations in RMSDs, we
The RMSF of the S2 domain, on the other hand, shows marked stability compared with domain S1 (Figure S4). This is in good agreement with our earlier observations of the RMSD of the S2 domain. The S2 domain is a triple-helical coiled coil, and this coiled-coil motif, further supported by three shorter helices, underpins the stability of the domain [29]. However, the C-terminal residues 1125-1273 show greater flexibility compared with the rest of the domain. It should be noted that the C-terminal region of the Spike glycoprotein is exposed towards the inner side of the envelope bilayer and does not participate in the interchain interactions. It also has a more relaxed packing compared with the rest of S2 [28, 30].
Although the NTD is not known to bind directly to the receptor, in Mouse hepatitis coronavirus the NTD was found to bind to the CEACAM1a receptor [31]. Similarly, vaccines developed against the NTD of the Spike protein in mice have shown that the NTD can also be a potential therapeutic target [32, 33]. Moreover, comparison between Bovine coronavirus and Bovine hemagglutinin-esterase enzyme indicated a close evolutionary link between the virus and the host proteins, which could facilitate attachment to the host cells [34]. We performed a Multiple
The NTD is relatively more exposed to solvent and more susceptible to external environmental conditions. However, unlike the RBD, the NTD doesn't have a defined open or closed conformation. The coronavirus NTD is composed of a three-layered beta-sheet sandwich with 7, 3 and 6 antiparallel β strands in each layer, making a total of 16 beta strands, with prominent β hairpin loops (Figure S5). The crystal structures of the Mouse Hepatitis Coronavirus (MHC) Spike protein and its receptor show that the β1 strand and a second β strand of the NTD form the binding motif for the CEACAM1a protein [31].
However, unlike the MHC NTD, the arrangement of strands in SARS-CoV-2 is in the opposite direction. The upper layer of the beta sandwich is composed of several beta strands (see the supporting figures). The prominent regions which are exposed to the solvent and capable of interacting with potential receptors are the N-terminal β strand and several inter-strand loops, among them the loop leading into β15. Comparison of the NTD at different temperatures (Figure 5) shows a differential arrangement of these solvent-exposed loops. The time-averaged conformation of the loops after 200 ns of simulation shows that the loops are oriented close to each other at 10-30 °C; at 40 °C and 50 °C, however, they move farther away from each other. Since there was similarity between the NTD and the Ephrin A proteins that bind to the Ephrin A receptors, we examined, for comparison, the residues involved in protein-protein interaction in the crystal structure of the human EphA4 ectodomain in complex with human Ephrin A5 (PDB ID: 4BKA). There are three salt bridges and seven hydrogen bonds between the Ephrin protein and its receptor. Moreover, it can be clearly seen that the NTD loops host a large number of polar residues (Figure 5). These residues form a stable motif at 10-30 °C, primarily due to the stability between the loops. At 40 °C and 50 °C, a hydrophobic patch from the N-terminal β strand is exposed towards the solvent. The polar residues of two of the loops move away from the N-terminal β strand and the remaining loop, reducing the possibility of protein-protein interaction. Hence, a strong possibility exists for the NTD to act as a protein binding site in the lower temperature range. From the bioinformatics and structural analyses, we observed that the NTD not only acts as a glycan binding site but can also act as a site for binding of several human proteins.
The motif formed by several polar residues on the solvent-exposed loops at 10-30 °C could form salt bridges and hydrogen bonds with partner proteins. At higher temperatures, the propensity to form such interactions would be lost owing to the differential orientation of the loops. Nonetheless, the NTD could act as a possible target for the development of vaccines and inhibitors.
The receptor binding domain (RBD) of the Spike glycoprotein is a potential target for vaccine and drug development [36, 37]. It is highly conserved among the human coronaviruses and binds to the ACE2 receptor present on the lung tissues [38]. Residues 458-506 of the RBD comprise the receptor binding motif (RBM). The RBM has 8 residues which are identical, and this conserved region primarily interacts with the ACE2 receptor; hence, scientists often target the RBD for developing therapeutic agents [36, 37, 39]. Earlier, in Figure 2, we saw that the RBD, spanning residues 333-680, shows higher stability than the NTD of the S1 domain across the range of temperatures. We compared the time-averaged conformation of the RBD generated from the last 10 ns of the simulation time at different temperatures (Figure 6). The core β-pleated sheet was very stable, showing no loss of secondary structure at higher temperatures. However, the RBM (highlighted in magenta in Figure 6) shows a very dynamic conformation across the different temperature ranges. The dynamics was more pronounced at 10 °C, 20 °C and 30 °C, whereas at 40 °C and 50 °C the RBM had a more confined conformation. This flexibility was more apparent at two of these temperatures, where the three chains moved further away from each other. However, a tighter and well-packed structure was found for the protein at 50 °C. The figures suggest that although residue-wise movements in the RBD were not visible in the RMSF (Figure 2), the RBD domains and motifs show intrinsic flexibility within particular temperature ranges.
Previous studies have indicated that the RBD can adopt either an open or a closed conformation in the virus [18]. We compared the conformation with the Spike protein-ACE2 crystal structure and found that, in the open conformation, the RBD exposes its RBM residues Phe456, Ala475, Phe486, Asn487, Tyr489, Gln493, Gly496, Gln498, Thr500, Asn501, Gly502 and Tyr505 to facilitate the binding of the receptors. It is fascinating to see that at 40 °C and, more interestingly, at 50 °C, the RBM is in a closed-loop conformation and very compact, which hinders its association with partner proteins. To validate the findings, we ran another simulation of the Spike protein at a higher temperature of 70 °C. After 100 ns of simulation, we found significant similarity between the closed conformation observed at 50 °C and the conformation at 70 °C. The RBM residues, specifically Phe456, Ala475, Phe486, Asn487, Tyr489, Gln493, Gly496, Gln498, Thr500, Asn501, Gly502 and Tyr505, were found to be clearly buried between the interchain subunits at 70 °C (Figure 7 , of the coronaviruses inside the host cells. It exists as a homotrimer and is partly exposed to the outer environment and partly immersed inside the lipid bilayer of the viral envelope. Here, we studied the differential response of the Spike protein to different temperature conditions. Our results show that the S2 transmembrane domain remains stable even without the bilayer membrane, whereas the solvent-exposed S1 domain is quite flexible. Moreover, S1 comprises two subdomains, namely the N-terminal domain (NTD) and the receptor binding domain (RBD). The simulation results show that the RBD is relatively less mobile. Its flexibility is limited to the receptor binding motif, or RBM, which interacts with Angiotensin Converting Enzyme 2 (ACE2), its human receptor. However, the NTD was found to be quite mobile.
Although the NTD doesn't directly interact with the receptor in humans, it has been found to bind to receptors in other mammals [31]. The flexible NTD hosts a large number of charged residues on the top layer of its tri-layered beta sandwich architecture. However, at higher temperatures, the polar residues were found to be less solvent exposed. The similarity of the NTD sequence to several human proteins such as Ephrins, Briakinumab, anti-TSLP, etc. indicated a possibility of the subdomain being involved in binding to alternate human proteins. The RBM present on the RBD is crucial in the initial protein-protein interaction between the host and the virus. We found that this domain is
equilibration, by maintaining harmonic restraints on the protein heavy atoms, the system was gradually heated to 300 K in a canonical ensemble. The harmonic restraints were gradually reduced to zero and the solvent density was adjusted under isobaric and isothermal conditions at 1 atm and 300 K. This was followed by 500 ps of NVT and 500 ps of NPT equilibration with harmonic restraints of 1000 kJ mol^-1 nm^-2 on the heavy atoms. The production run for all the systems was carried out for 200 ns, until a stable RMSD was reached. All simulations were carried out in GROMACS 2020 with the AMBER ff99SB-ILDN force field for proteins [40, 41]. The long-range electrostatic interactions were treated using the Particle-Mesh Ewald sum, and SHAKE was used to constrain all bonds involving hydrogen atoms. After equilibration, the systems were heated or cooled to the different temperatures (Table S1) and simulated for 200 ns. All analyses were carried out using the GROMACS analysis tools [40]. Protein Blast was used to search for similar sequences in the human proteome. The Blast Tree View widget helped us generate the phylogenetic tree, which is a simple distance-based clustering of the sequences based on the pairwise alignment results of Blast relative to the query sequence [42].
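The harmonic restraints on the heavy atoms mentioned in the equilibration protocol can be illustrated as follows. This is a minimal sketch that assumes the usual isotropic form, an energy of one half of the force constant times the squared displacement of each restrained atom from its reference position, with the force constant of 1000 kJ mol^-1 nm^-2 stated in the text. The function name and the example coordinates are hypothetical.

```python
def restraint_energy(positions, reference, force_constant=1000.0):
    """Total harmonic position-restraint energy in kJ/mol.

    positions, reference: sequences of (x, y, z) tuples in nm, one per
    restrained heavy atom. force_constant is in kJ mol^-1 nm^-2.
    Assumes an isotropic restraint: E = 0.5 * k * |r - r0|^2 per atom.
    """
    if len(positions) != len(reference):
        raise ValueError("positions and reference must have the same length")
    energy = 0.0
    for (x, y, z), (x0, y0, z0) in zip(positions, reference):
        squared_displacement = (x - x0) ** 2 + (y - y0) ** 2 + (z - z0) ** 2
        energy += 0.5 * force_constant * squared_displacement
    return energy


# Toy example (hypothetical data): one atom displaced by 0.1 nm along x,
# one atom sitting exactly on its reference position.
reference_positions = [(1.0, 1.0, 1.0), (2.0, 2.0, 2.0)]
current_positions = [(1.1, 1.0, 1.0), (2.0, 2.0, 2.0)]
penalty = restraint_energy(current_positions, reference_positions)
```

With this force constant, a displacement of 0.1 nm costs about 5 kJ/mol per atom, which is why the restrained heavy atoms stay close to their starting positions while the solvent relaxes around them. Setting `force_constant` to zero corresponds to the fully released system.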
VMD was used for visualization of results and generation of figures [43].
Supporting figures, Figs S1-S6 and Table S1 are provided online.
A SARS-CoV-2 protein interaction map reveals targets for drug repurposing
COVID-19: A novel zoonotic disease caused by a coronavirus from China: What we know and what we don't. Microbiol Aust
From SARS to MERS: evidence and speculation
Genetic Recombination, and Pathogenesis of Coronaviruses
The effects of temperature and relative humidity on the viability of the SARS coronavirus
Stability of SARS-CoV-2 in different environmental conditions
Effects of air temperature and relative humidity on coronavirus survival on surfaces
Aerosol and surface stability of SARS-CoV-2 as compared with SARS-CoV-1
Influence of meteorological factors and air pollution on the outbreak of severe acute respiratory syndrome
An initial investigation of the association between the SARS outbreak and weather: With the view of the environmental temperature and its variation
Temperature-dependent innate defense against the common cold virus limits viral replication at warm temperature in mouse airway cells
Roles of Humidity and Temperature in Shaping Influenza Seasonality
Temperature-sensitive viral infection: Inhibition of hemagglutinating virus of Japan (Sendai virus) infection at 41°
Highly heterogeneous temperature sensitivity of 2009 pandemic influenza A(H1N1) viral isolates
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2): An overview of viral structure and host response
Structure, Function, and Evolution of Coronavirus Spike Proteins
Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein
Activation of the SARS coronavirus spike protein via sequential proteolytic cleavage at two distinct sites
Structural and Functional Basis of SARS-CoV-2 Entry by Using Human ACE2
Cryo-EM structures of MERS-CoV and SARS-CoV spike glycoproteins reveal the dynamic receptor binding domains
Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor
Thermal Stability of Globins: Implications of Flexibility and Heme Coordination Studied by Molecular Dynamics Simulations
Structural flexibility and protein adaptation to temperature: Molecular dynamics analysis of malate dehydrogenases of marine molluscs
Effects of temperature and humidity on the daily new cases and new deaths of COVID-19 in 166 countries
Will coronavirus pandemic diminish by summer?
How significant is a protein structure similarity with TM-score = 0.5?
The Protein Data Bank. Nuc
Cryo-electron microscopy structure of a coronavirus spike glycoprotein trimer
Identification of the Membrane-Active Regions of the Severe Acute Respiratory Syndrome Coronavirus Spike Membrane Glycoprotein Using a 16/18-Mer Peptide Scan: Implications for the Viral Fusion Mechanism. J
Structure of mouse coronavirus spike protein complexed with receptor reveals mechanism for viral entry
Purified coronavirus spike protein nanoparticles induce coronavirus neutralizing antibodies in mice
The recombinant N-terminal domain of spike proteins is a potential vaccine against Middle East respiratory syndrome coronavirus (MERS-CoV) infection
Immunogenicity and protection efficacy of monomeric and trimeric recombinant SARS coronavirus spike protein subunit vaccine candidates
COVID-19 and Multiorgan Response
Receptor-binding domain of SARS-CoV spike protein induces highly potent neutralizing antibodies: Implication for developing subunit vaccine
Characterization of the receptor-binding domain (RBD) of 2019 novel coronavirus: implication for development of RBD protein as a viral attachment inhibitor and vaccine
Structure, Function, and Evolution of Coronavirus Spike Proteins
Receptor Recognition Mechanisms of Coronaviruses: a Decade of Structural Studies
GROMACS development team
Improved side-chain torsion potentials for the Amber ff99SB protein force field
Database resources of the National Center for Biotechnology Information