key: cord-310230-9wfb43gt authors: Ghorbani, Mahdi; Brooks, Bernard R.; Klauda, Jeffery B. title: Critical Sequence Hot-spots for Binding of nCOV-2019 to ACE2 as Evaluated by Molecular Simulations date: 2020-06-27 journal: bioRxiv DOI: 10.1101/2020.06.27.175448 sha: doc_id: 310230 cord_uid: 9wfb43gt The novel coronavirus (nCOV-2019) outbreak has put the world on edge, causing millions of cases and hundreds of thousands of deaths all around the world, as of June 2020, let alone the societal and economic impacts of the crisis. The spike protein of nCOV-2019 resides on the virion’s surface mediating coronavirus entry into host cells by binding its receptor binding domain (RBD) to the host cell surface receptor protein, angiotensin converter enzyme (ACE2). Our goal is to provide a detailed structural mechanism of how nCOV-2019 recognizes and establishes contacts with ACE2 and its difference with an earlier coronavirus SARS-COV in 2002 via extensive molecular dynamics (MD) simulations. Numerous mutations have been identified in the RBD of nCOV-2019 strains isolated from humans in different parts of the world. In this study, we investigated the effect of these mutations as well as other Ala-scanning mutations on the stability of RBD/ACE2 complex. It is found that most of the naturally-occurring mutations to the RBD either strengthen or have the same binding affinity to ACE2 as the wild-type nCOV-2019. This may have implications for high human-to-human transmission of coronavirus in regions where these mutations have been found as well as any vaccine design endeavors since these mutations could act as antibody escape mutants. Furthermore, in-silico Ala-scanning and long-timescale MD simulations, highlight the crucial role of the residues at the interface of RBD and ACE2 that may be used as potential pharmacophores for any drug development endeavors. From an evolutional perspective, this study also identifies how the virus has evolved from its predecessor SARS-COV and how it could further evolve to become more infectious. The novel coronavirus (nCOV-2019) outbreak emerging from China has become a global pandemic and a major threat for human public health. According to World Health Organization (WHO) as of June 27 th 2020, there has been about 10 million confirmed cases and approaching 500,000 deaths due to coronavirus in the world. [1] [2] Much of the human population including the United States of America were under lockdown or official stay-at-home orders to minimize the continued spread of the virus. Coronaviruses are a family of single-stranded enveloped RNA viruses. Phylogenetic analysis of coronavirus genome has shown that nCOV-2019 belongs to the beta-coronavirus family, which also includes MERS-COV, SARS-COV and bat-SARS-related coronaviruses. [3] [4] It is worth mentioning that SARS-COV, which was widespread in 2002 caused more than 8,000 cases and about 800 deaths and MERS-COV (middle east respiratory syndrome coronavirus) in 2012 also spread in more than 25 countries, causing about 2,500 cases and more than 850 deaths. (www.who.int/health-topics/coronavirus). 5 In all coronaviruses, a homotrimeric spike glycoprotein on the virion's envelope mediates coronavirus entry into host cells through a mechanism of receptor binding followed by fusion of viral and host membranes. 3, 6 Coronavirus spike protein contains two functional subunits S1 and S2. The S1 subunit is responsible for binding to host cell receptor, and the S2 subunit is responsible for fusion of viral and host cell membranes. 3, 7 The spike protein in nCOV-2019 exists in a meta-stable pre-fusion conformation that undergoes a substantial conformational rearrangement to fuse the viral membrane with the host cell membrane. 7, 8 nCOV-2019 is closely related to bat coronavirus RaTG13 with about 93.1% sequence similarity in the spike protein gene. The sequence similarity of nCOV-2019 and SARS-COV is less than 80% in the spike genome. 2 S1 subunit in the spike protein includes a receptor binding domain (RBD) that recognizes and binds to the host cells receptor. The RBD of nCOV-2019 shares 72.8% sequence identity to SARS-COV RBD and the root mean squared deviation (RMSD) for the structure between the two proteins is 1 . which shows the high structural similarity. 4, 8, 9 Experimental binding affinity measurements using Surface Plasmon Resonance (SPR) have shown that nCOV-3 fold higher affinity than SARS-COV binding to ACE2. 7 Based on the sequence similarity between RBD of nCOV-2019 and SARS-COV and also the tight binding between RBD of nCOV-2019 and ACE2, it is most probable that nCOV-2019 uses this receptor on human cells to gain entry into the body. 3, 6, 7, 10 The spike protein and specifically the RBD domain in coronaviruses have been a major target for therapeutic antibodies. However, no monoclonal antibodies targeted to RBD have been able to bind efficiently and neutralize nCOV-2019. 7, 11 The core of nCOV-2019 RBD is a 5- The sequence alignment between SARS-COV in human, SARS civet, Bat RaTG13 coronavirus and nCOV-2019 in the RBM is shown in Figure 2 . There is a 50% sequence similarity between the RBM of nCOV-2019 and SARS-COV. RBM mutations played an important role in the SARS epidemic in 2002. 12, 3 Two mutations in the RBM of SARS-2002 from SARS-Civet were observed from strains of these viruses. These two mutations were K479N and S487T. These two residues are close to the virus binding hotspots in ACE2 including hotspot-31 and hotspot-353. Hotspot-31 centers on the salt-bridge between K31-E35 and hotspot-353 is centered on the salt-bridge between K353-E358 on ACE2. Residues K479 and S487 in SARS-Civet are in close proximity with these hotspots and mutations at these residues caused SARS to bind ACE2 with significantly higher affinity than SARS-Civet and played a major role in civet-to-human and human-to-human transmission of SARS coronavirus in 2002. 3, [13] [14] [15] Numerous mutations in the interface of SARS-COV RBD and ACE2 from different strains of SARS isolated from humans in 2002 have been identified and the effect of these mutations on binding ACE2 have been investigated by surface plasmon resonance. 16, 17 Two identified RBD mutations (Y442F and L472F) increased the binding affinity of SARS-COV to ACE2 and two mutations (N479K, T487S) decreased the binding affinity. It was demonstrated that these mutations were viral adaptations to either human or civet ACE2. 16, 17 A pseudotyped viral infection assay of the interaction between different spike proteins and ACE2 confirmed the correlation between high affinity mutants and their high infection. 16 Further investigation of RBD residues in binding of SARS-COV and ACE2 was performed through ala-scanning mutagenesis, which resulted in identification of 10 residues that reduce binding affinity to ACE2 upon mutation to alanine. These residues are K390, R429, D429, D454, I455, N473, F483, Q492, Y494 and R495. 18 RBD mutations have also been identified in MERS-COV, which affected their affinity to receptor (DPP4) on human cells. 17 It is not known whether these mutations are linked to the severity of coronavirus in these regions. The focus of this article is to elucidate the differences between the interface of SARS-COV and nCOV-2019 with ACE2 to understand with atomic resolution the interaction mechanism and hotspot residues at the RBD/ACE2 interface using long-timescale molecular dynamics (MD) simulation. An alanine-scanning mutagenesis in the RBM of nCOV-2019 helped to identify the key residues in the interaction, which could be used as potential pharmacophores for future drug development. Furthermore, we performed molecular simulations on the seven most common mutations found from surveillance of RBD mutations N439K, T478I, V483A, G476S, S494P, V483F and A475V. From an evolutionary perspective this study shows the residues in which the virus might further evolve to be even more dangerous to human health. nCOV-2019 shares 76% sequence similarity with SARS-2002 spike protein, 73% sequence identity for RBD and 50% for the RBM. 1 Bat coronavirus RaTG13 seems to be the closest relative of nCOV-2019 sharing about 93% sequence identity in the spike protein. 6 The 25 The mutations selected are listed in Table S1 along with their location in RBD. The crystal structure of nCOV-2019 in complex with hACE2 (pdb id:6M0J) 28 as well as SARS-COV complex with human ACE2 (pdb id: 6ACJ) 29 were obtained from RCSB (www.rcsb.org). 30 The 5000 steps of energy minimizations were done using the steepest descent algorithm. In all steps the LINCS algorithm was used to constraint all bonds containing hydrogen atoms and a time step of 2 fs was used as the integration time step. 33 Equilibration of all systems were performed in three steps. In the first step, 100,000 steps of simulation were performed using a velocity-rescaling thermostat to maintain the temperature at 310K with a 0.1 ps coupling constant in NVT ensemble under periodic boundary conditions and harmonic restraints on the backbone and sidechain atoms of the complex. 34 The velocity rescaling thermostat was used in all other steps of simulation. In the next step, we performed 300,000 steps in the isothermalisobaric NPT ensemble at temperature of 310K and pressure of 1bar using a Berendsen barostat. 35 This was done by decreasing the force constant of the restraint on backbone and side chain atoms of the complex from 1000 to 100 and finally to 10 మ . Berendsen barostat was only used for the equilibration step due to usefulness in rapidly correcting density. In the next step the restraints were removed, and the systems were subjected to 1,000,000 steps of MD simulation under NPT ensemble. In the production run, harmonic restraints were removed and all the systems were simulated using a NPT ensemble where the pressure was maintained at 1bar using the Parrinello-Rahman barostat 36 with a compressibility of and a coupling constant of 0.5ps. It is important to note that all the Berendsen barostat was only used for the equilibration step as it was shown that this barostat can cause unrealistic temperature gradients. 37 The production run lasted for 500ns for SARS-COV and nCOV-2019 complexes and 300ns for all the mutants using with a 2fs timestep and the particle-mesh Ewald (PME) 36 for long range electrostatic interactions using GROMACS 2018.3 package. 38 All mutant systems were constructed as described before and all complexes ran for 300ns of production run. The principal components were used to calculate and plot the approximate free energy landscape (aFEL). We refer to the free energy landscape produced by this approach to be approximate in that the ensemble with respect to the first few PC's (lowest frequency quasiharmonic modes) is not close to convergence, but the analysis can still provide valuable information and insight. g_sham, g_covar and g_anaeig functions in GROMACS 41 were used to obtain principal components and aFEL. In each aFEL the deep valleys represent the most stable conformations separated by some intermediate states. The dynamic cross-correlation maps (DCCM) were obtained using MD_TASK package to identify the correlated motions of RBD residues. 42 In DCCM the cross-correlation matrix by setting a value of 80 and 2 for solvent and solute dielectric constants. The non-polar free energy is simply estimated from solvent accessible surface area (SASA) of the solute from equation 5. To compute the RMSD of systems, the rotational and translational movements were removed by first fitting the C α atoms of the RBD to the crystal structure and then computing the RMSD with respect to C α atoms of RBD in each system. In most of the variants, the RMSD is stable during the 300ns simulation. However, a few mutations show some RMSD variance. In mutation Y489A, the RMSD increases from The first two eigenvectors were used to calculate and plot the aFEL as a function of first two principal components using the last 200ns of the simulation for mutant systems. aFEL for a few mutants are shown in Figure 6 and the rest of them are shown in Figure S4 . The binding energetics between ACE2 and the RBD of SARS-COV, nCOV-2019 and all its mutant complexes were investigated by the MMPBSA method. 44 for nCOV-2019. The binding free energy for nCOV-2019 and SARS-COV was decomposed into a perresidue based binding affinity to find the residues that contribute strongly to the binding and complex formation (Figure 9 ). Most of the investigated residues in the RBM of nCOV-2019 had a favorable contribution to total binding energy. Binding free energy decomposition to its individual components for all mutants is represented in contributes the most to this low binding energies for these mutants. The contribution of RBM residues to binding with ACE2 for nCOV-2019 were mapped to the RBD structure and is shown in Figure 11B . Natural mutants exhibited similar or higher binding affinities compared to wild-type nCOV-2019. Importantly, mutation T478I which is one of the most occurred mutations in England based on GISAID database 25 Table 1 contribution of interface residues to structure in RBD of nCOV-2019. The RBD domain is purple and the ACE2 is yellow. The RBD in contact with AC2 is rendered in a surface format with more red being a favorable contribution to binding (more negative) and blue unfavorable contribution (positive). In this work, we preformed MD simulations to unveil the detailed molecular mechanism . D460 in SARS-COV is located in a region of high negative charge from residues E35, E37 and D38 on ACE2. Electrostatic repulsion between D480 on SARS-COV and the acidic residues on ACE2 is the reason for highly negative contribution of this residue to binding of SARS-COV to ACE2. Mutation to S494 in this location removes this highly negative contribution. To our knowledge this is first detailed molecular simulation study on the effect of mutations on binding of nCOV-2019 to ACE2. Previous computational studies have found that nCOV-2019 binds to ACE2 with a total binding affinity which was about 3 0 stronger than SARS-COV and is in fair agreement with the results here. 52 The critical role of interface residues and residues are computationally investigated here and in other articles and the results of all the studies indicate the importance of these residues for the stability of the complex and finding hotspot residues for the interaction with receptor ACE2. 47, [52] [53] [54] It is interesting to note the role of shown there is a correlation between higher binding affinity to receptor and higher infection rate by coronavirus. [55] [56] [57] [58] High binding affinity for some mutants such as T478I could be the reason for higher human-to-human transmission rate in regions where these mutations are found. It is also alerting that mutations at other residues such as G446A, G449A, Y449A and Y489A increase the binding affinity considerably and should be monitored. Mutations of nCOV-2019 RBD that do not change the binding affinity and complex stability, could have implications for antibody design purposes since they could act antibody escape mutants. Escape from monoclonal antibodies are observed for mutations of SARS-COV in 2002 and these mutations should be considered for any antibody design endeavors against consider these escape mutations. In conclusion, this study unraveled key molecular traits underlying the higher affinity of nCOV-2019 for ACE2 compared to SARS-COV and unveiled critical residues for the interaction higher affinity than wild-type. Other occurring mutations N439K and E484A are found to increase the electrostatic interaction of RBD with ACE2. It is also alerting that some of the alanine substitutions at residues G446, G447 and Y489 substantially increased the binding affinity that may lead to a strongly RBD attachment to ACE2 and influence the infection virulence. On the other hand, most mutations are found not to impact the binding affinity of RBD with ACE2 in nCOV-2019 which could have implications for vaccine design endeavors as these mutations could act as antibody escape mutants. Receptor recognition is the first line of attack for coronavirus and this study gives novel insights to key structural features of interface residues for advancement of effective therapeutic strategies to stop the coronavirus pandemic. The authors would like to dedicate this article to the doctors and nurses who sacrificed their time, health and even their lives to fight COVID-19, particularly those in Iran and the United States. JBK would also like to dedicate this work to family friend Joe Kaplan (Silver Spring, MD) who passed away due to COVID-19 on April 22, 2020. A Novel Coronavirus from Patients with Pneumonia in China A Pneumonia Outbreak Associated with a New Coronavirus of Probable Bat Origin Structure, Function, and Evolution of Coronavirus Spike Proteins Structural and Functional Basis of SARS-CoV-2 Entry by Using Human ACE2 Receptor Recognition by the Novel Coronavirus from Wuhan: An Analysis Based on Decade-Long Structural Studies of SARS Coronavirus Cryo-EM Structure of the 2019-NCoV Spike in the Prefusion Conformation. Science The Coronavirus Spike Protein Is a Class I Virus Fusion Protein: Structural and Functional Characterization of the Fusion Core Complex Structural Insights into the Middle East Respiratory Syndrome Coronavirus 4a Protein and Its DsRNA Binding Mechanism Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein Potent Binding of 2019 Novel Coronavirus Spike Protein by a SARS Coronavirus-Specific Human Monoclonal Antibody Receptor Recognition Mechanisms of Coronaviruses: A Decade of Structural Studies Receptor and Viral Determinants of SARS-Coronavirus Adaptation to Human ACE2 Structural Analysis of Major Species Barriers between Humans and Palm Civets for Severe Acute Respiratory Syndrome Coronavirus Infections Receptor Recognition and Cross-Species Infections of SARS Coronavirus Mechanisms of Host Receptor Adaptation by Severe Acute Respiratory Syndrome Coronavirus Structural Analysis of Major Species Barriers between Humans and Palm Civets for Severe Acute Respiratory Syndrome Coronavirus Infections The SARS Coronavirus S Glycoprotein Receptor Binding Domain: Fine Mapping and Functional Characterization Development and Characterization of a Severe Acute Respiratory Syndrome-Associated Coronavirus-Neutralizing Human Monoclonal Antibody That Provides Effective Immunoprophylaxis in Mice Neutralizing Human Monoclonal Antibodies to Severe Acute Respiratory Syndrome Coronavirus: Target, Mechanism of Action, and Therapeutic Potential Human Monoclonal Antibody Combination against SARS Coronavirus: Synergy and Coverage of Escape Mutants Vaccine Efficacy in Senescent Mice Challenged with Recombinant SARS-CoV Bearing Epidemic and Zoonotic Spike Variants Structural Basis for Potent Cross-Neutralizing Human Monoclonal Antibody Protection against Lethal Human and Zoonotic Severe Acute Respiratory Syndrome Coronavirus Challenge Escape from Human Monoclonal Antibody Neutralization Affects In Vitro and In Vivo Fitness of Severe Acute Respiratory Syndrome Coronavirus Disease and Diplomacy: GISAID's Innovative Contribution to Global Health Structural Basis for Receptor Recognition by the Novel Coronavirus from Wuhan MCSM-PPI2: predicting the effects of mutations on protein The SARS Coronavirus S Glycoprotein Receptor Binding Domain: Fine Mapping and Functional Characterization Structure of the SARS-CoV-2 Spike Receptor-Binding Domain Bound to the ACE2 Receptor The Protein Data Bank Gromacs: High Performance Molecular Simulations through Multi-Level Parallelism from Laptops to Supercomputers Improved Side-Chain Torsion Potentials for the Amber Ff99SB Protein Force Field Optimization of the Additive CHARMM All-Atom Protein Force Field Targeting Chem. Phys Journal of Computational Chemistry Polymorphic Transitions in Single Crystals: A New Molecular Dynamics Method Constant Pressure Molecular Dynamics Simulation: The Langevin Piston Method ARTICLES YOU MAY BE INTERESTED IN Harmonic Analysis of Large Systems. I. Methodology Principal Component Analysis: A Method for Determining the Essential Dynamics of Proteins Gromacs: High Performance Molecular Simulations through Multi-Level Parallelism from Laptops to Supercomputers A software suite for analyzing molecular dynamics trajectories Assessing the Performance of the G_mmpbsa Tools to Simulate the Inhibition of Oseltamivir to Influenza Virus Neuraminidase by Molecular Mechanics Poisson-Boltzmann Surface Area Methods Recent Developments and Applications of the MMPBSA Method Virtual Screening Using Molecular Simulations Insight into the Interactive Residues between Two Domains of Human Somatic Angiotensin-Converting Enzyme and Angiotensin II by MM-PBSA Calculation and Steered Molecular Dynamics Simulation Enhanced Receptor Binding of SARS-CoV-2 through Networks of Hydrogen-Bonding and Hydrophobic Interactions Is the Rigidity of SARS-CoV-2 Spike Receptor-Binding Motif the Hallmark for Its Enhanced Infectivity? Insights from All-Atom Simulations Characterization of the Receptor-Binding Domain (RBD) of 2019 Novel Coronavirus: Implication for Development of RBD Protein as a Viral Attachment Inhibitor and Vaccine Structural and Functional Basis of SARS-CoV-2 Entry by Using Human ACE2 Phylogenetic Network Analysis of SARS-CoV-2 Genomes Is the Rigidity of SARS-CoV-2 Spike Receptor-Binding Motif the Hallmark for Its Enhanced Infectivity? An Answer from All-Atoms Simulations The SARS-CoV-2 Exerts a Distinctive Strategy for Interacting with the ACE2 Human Receptor Computational Simulations Reveal the Binding Dynamics between Human ACE2 and the Receptor Binding Domain of SARS-CoV-2 Spike Protein Receptor Recognition by the Novel Coronavirus from Wuhan: An Analysis Based on Decade-Long Structural Studies of SARS Coronavirus Receptor and Viral Determinants of SARS-Coronavirus Adaptation to Human ACE2 Efficient Replication of Severe Acute Respiratory Syndrome Coronavirus in Mouse Cells Is Limited by Murine Angiotensin-Converting Enzyme 2 SARS-CoV-2 Spike Glycoprotein Receptor Binding Domain Is Subject to Negative Selection with Predicted Positive Selection Mutations