key: cord-326441-w8iyh59u authors: Bhattacharjee, Sayan; Bhattacharyya, Rajanya; Sengupta, Jayati title: Transmission of allosteric response within the homotrimer of SARS-CoV-2 spike upon recognition of ACE2 receptor by the receptor-binding domain date: 2020-09-06 journal: bioRxiv DOI: 10.1101/2020.09.06.284901 sha: doc_id: 326441 cord_uid: w8iyh59u The pathogenesis of novel SARS-CoV-2 virus initiates through recognition of ACE2 receptor (Angiotensin-converting enzyme 2) of the host cells by the receptor-binding domain (RBD) located at spikes of the virus. Following receptor-recognition, proteolytic cleavage between S1 and S2 subunits of the spike protein occurs with subsequent release of fusion peptide. Here, we report our study on allosteric communication within RBD that propagates the signal from ACE2-binding site towards allosteric site for the post-binding activation of proteolytic cleavage. Using MD simulations, we have demonstrated allosteric crosstalk within RBD in apo- and receptor-bound states where dynamic correlated motions and electrostatic energy perturbations contribute. While allostery, based on correlated motions, dominates inherent distal communication in apo-RBD, electrostatic energy perturbations determine favorable crosstalk within RBD upon binding to ACE2. Notably, allosteric path is constituted with evolutionarily conserved residues pointing towards their biological relevance. As revealed from recent structures, in the trimeric arrangement of spike, RBD of one copy interacts with S2 domain of another copy. Interestingly, the allosteric site identified is in direct contact (H-bonded) with a region in RBD that corresponds to the interacting region of RBD of one copy with S2 of another copy in trimeric constitution. Apparently, inter-monomer allosteric communication orchestrates concerted action of the trimer. Based on our results, we propose the allosteric loop of RBD as a potential drug target. The recent outbreak of a novel coronavirus, the SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2), a close relative of the previously-known SARS-CoVs, has become an ongoing global public-health threat. Following its emergence, it has created an pandemic situation within no time owing to its alarming high rate of transmission 1 . Cell entry in particular, is an important early step of cross-species transmission. Spike glycoprotein of CoVs plays the role of attachment to the Angiotensin-converting enzyme-2 (ACE2) receptor of the host cells and mediates viral entry. The association with the host-cell receptor is regulated by the receptor-binding domain (RBD) of the spike protein followed by cleavage using a host protease, which releases the spike fusion peptide, and thereby promotes viral entry 2 . SARS-CoV-2 is a single-stranded RNA virus with four major constituent proteins: spike protein (S), membrane protein (M), envelope protein (E), and nucleo-capsid protein (N) 3 . Lung is the key target organ of the virus and, as the name implies, the infection induces pneumonia accompanied by acute respiratory distress. Recent high resolution structures of the trans-membrane spike protein of SARS CoV-2 show 4 intertwined arrangement of three monomers. Each monomer consists of sub-domains S1 and S2. The receptor-binding domain (RBD) that interacts with ACE2 receptor of the host belongs to the S1 subunit, while the S2 subunit anchors on the viral membrane. Interestingly, RBD of one copy directly interacts with the S2 domain of another copy in the homotrimer ( Figure 1 ). However, the receptor binding motif (RBM) in the RBD that interacts with the receptor has not been fully resolved in the recent spike trimer structures. A following crystallographic study 5 has resolved the interactions of SARS-CoV-2 spike RBD with the ACE2 receptor. Clearly, successful viral infection is reliant on the effective complex formation between the host receptor and the protruded spikes on the viral surface, which makes it an obvious drug target. However, in addition to the contact residues, studies have indicated potential role of noncontact residues and allostery in the spike protein-receptor interactions 6 . To better understand the etiology of the infection, we have focused on the RBD of the spike protein and its binding to the ACE2 receptor 6a, 7 , information on which may provide important clues for designing drugs to combat the virus. Our study aims to understand the hidden allosteric cross-talks in RBD upon recognition of the ACE2 receptor and their possible role in the activity of the spike protein trimer. The allosteric crosstalk between the ACE2-binding residues of RBD with its distal residues (away from the binding site) was evaluated using MD simulation followed by extensive analyses of both dynamics and energy terms at atomic level. MD simulation has long been implicated as a suitable procedure to explore the allosteric paths of information exchange in between amino acid residues of a protein by elucidating intermediate protein conformations that often occur in pico or nano second time scale which are often difficult to capture with other techniques 8 . The conformational dynamics and change in energetic within RBD before and after binding to the host receptor revealed the hidden communication that shapes the crosstalk pathway leading to recognition and subsequent internalization of the virus. Our study offers the identified allosteric sites as additional, potential drug targets. Insights gained from our results on electrostatic contributions in viral infection suggest that modification or inhibition of the allosteric communication would block proper functioning of the viral spike protein and consequently hinder viral infection. Therefore, it may be considered as possible intervention pathway for infection prevention. Conformational dynamics of free and receptor-bound RBD. To probe deeper into the dynamicity of free and ACE2-bound RBD of the COVID spike protein, Cα atoms based root mean square fluctuations (RMSF) and Order parameter (S 2 ) considering the N-H vectors 9 of RBD, were calculated from the corresponding trajectories. The overall bound state RMSF revealed a lower value relative to the free RBD, implicating the higher rigidity of RBD after binding to ACE2 compared to its apo-form. This observation suggests the ACE2-directed stabilization of the RBD in the bound state by selection of particular conformation from the fast time scale (ps-ns) regime. In this regard, the residues exhibiting remarkably differentiated RMSF values between the two states are clearly regions displaying some motifs having higher plasticity in free RBD (an RMSF cut-off of >0.25 nm was set to define regions of high flexibility): the binding loop (containing residues 474 to 486), and a distal loop (consisting residues 358 to 376). It is apparent that the distal loop can be considered as an allosteric site (away from the ACE2 binding site of RBD). Remarkably, this allosteric loop is in direct contact through H-bond to a region of the RBD that coincides with the region of one monomer that interacts with the S2 domain of another monomeric copy in the spike protein trimer 4a, 10 ( Figure 1 ). Significantly, when the dynamicity of the apo-form was compared with respect to the ACE2bound RBD, the fluctuation of the allosteric loop decreased considerably along with the ACE2 binding loop (Figure 2A respectively. Thus, this observation also suggests the decrement in overall RBD backbone entropy factor (S 2 is directly related to conformational entropy 12 ) and hence formation of a stable complex. Allosteric involvement in proteins is one of the most important areas in the field of allosteric drug development and it was proposed that allosteric communications were achieved either by correlated motions of the amino acids or by fundamental non-bonded energy contributions 11, 13 . A number of noticeable differences in correlation were found when DCCM (a residue-wise correlation matrix-based approach where the correlated and anti-correlated motions are scored between 1 and -1 considering strong: |Cij| = 1.0-0.7; moderate: |Cij| = 0.7-0.5; and weak: |Cij| = 0.3 -0.5) of both apo-and ACE2-bound RBD was compared. Free RBD consisted of several strongly correlated and moderately anti-correlated motions within the amino acids, which were either missing or became weakly correlated in the ACE2-bound RBD ( Figure 3A -B). Interestingly, the binding loop (residues from 474 to 486) and the allosteric loop (residues from 358 to 376) showed a moderate anti-correlated motion. In other words, these loops move away from each other in a correlated manner. It may be noted here, both the allosteric loops are in correlation which reestablishes the presence of H-bond between them ( Figure 1B ) as revealed by crystal structure described above. This kind of anti-correlated motion, which could be attributed to the structural flexibility of apo-RBD, is much weakened in the ACE2-bound RBD owing to its conformational rigidity in bound state. Cross-correlation maps were used to identify the regions that move in or out of phase during the simulations 11 . The elements of the matrix (Cij) were obtained from their position vector (r) as shown in Eq. 1: Where i and j correspond to any two atoms, residues, or domains; ri and rj are position vectors of i and j; and the angle brackets denote an ensemble average. Inter-atomic cross-correlation fluctuations between any two pairs of atoms (or residues) can be calculated by using this expression and can be represented graphically by the DCCM. The value of Cij can vary from -1 (strongly anti-correlated motion) to +1 (strongly correlated motion). Backbone N-H vectors were selected to calculate S 2 over the period of trajectory, which represent dynamicity of protein, with a value 1 indicating complete rigidity and a value towards 0 represents enhanced dynamicity. whereμ1, μ2, and μ3 are the x, y, and z components of the relevant bond vector scaled to unit magnitude, μ, respectively 9 . Angular brackets indicate averaging over the snapshots. The residue wise non-bonded interaction energy between free RBD and its bound state with ACE2 was described as: The non-bonded interactions (∆ RBD−ACE2 ) include both electrostatic (∆ RBD−ACE2 ) and van der Waals (∆ RBD−ACE2 ) interactions and were modeled using a Coulomb and Lennard-Jones (LJ) potential function, respectively. Hierarchical clustering was implemented for each correlation network to produce cumulative nodal clusters, or communities those are strongly intra-connectedyet slightly inter-connected, using a related betweenness clustering algorithm to that developed by Girvan.Nevertheless, as is standard for unweighted networks, rather than selecting the partitions with the highest modularity ranking, we selected the partition nearest to the highest modularity score that consisted of the smallest number of overall communities. This excluded the typical circumstance where several small communities with modularity values of similarly high scores were created. The density of connections per node which was assessed by the node centralities were calculated as described by the equation: Where represents the centrality of the node i, is the ijth entry of the adjacent matrix A, is actually a constant, and G represents all nodes. ≠0 if node i and j are linked, and it will be equal to − , where wij is the edge weight. For every i (i G) is equivalent to defining the eigenvalues and eigenvectors of matrix A. Considering a pair of nodes optimal (shortest) and suboptimal (near however farther than optimized neo) linking node paths were established employing the previously mentioned algorithm 24 . Path analysis from electrostatic energy contributions. The perturbation in pair-wise electrostatic interactions∆ , − : The contributions due to LJ and electrostatic (Coulomb) non-bonded interactions to nonbonded energy,were calculated separately, but the LJ terms were found to be numerically much smaller than the respective electrostatic ones, so we have focused on the electrostatic interactions. The interaction energy between two residues i and j is the sum of the non-bonded interaction energies already defined in a force-field where: Energy networks were built by considering the amino-acid residues as nodes. A weighted edge was made between any pair of residues i and j by considering the interaction energy as the weight. Energy-Hubs are defined as nodes that have a higher degree or connectivity in the network. Betweenness-Centrality was computed using the following equation: Where BC(v) is betweenness-centrality of residue ν.n is the number of residues within network, σij(v) is the number of shortest paths between residue i and j that pass through residue ν and σij is the total number of shortest paths from i to j. Dijkstra's algorithm 25 was applied to calculate the shortest path between residues i and j. Binding free energy. MMPB-SA tool for GROMACS was used to calculate the binding free energy between ACE2 and RBD 26 . SB and JS conceived the project. SB designed and performed all the computational work. SB, RB and JS analyzed the data and wrote the paper. This work was supported by SERB, DST (India) sponsored project, and CSIR-Indian Institute of Chemical Biology, Kolkata, India. We acknowledge the Central Instrument Facility (CIF) of CSIR-Indian Institute of Chemical Biology, and CSIR-4Pi for super computer facility for supporting our computational work. SB and RB acknowledge UGC and CSIR, India, respectively for awarding senior research fellowship. Pandemic potential of 2019-nCoV Angiotensin-converting enzyme 2 (ACE2) as a SARS-CoV-2 receptor: molecular mechanisms and potential therapeutic target Genome Composition and Divergence of the Novel Coronavirus (2019-nCoV) Originating in China Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor Functional assessment of cell entry and receptor usage for SARS-CoV-2 and other lineage B betacoronaviruses A pneumonia outbreak associated with a new coronavirus of probable bat origin The role of molecular simulations in the development of inhibitors of amyloid beta-peptide aggregation for the treatment of Alzheimer's disease Protein side-chain dynamics and residual conformational entropy Proteolytic activation of the SARS-coronavirus spike protein: cutting enzymes at the cutting edge of antiviral research A study of communication pathways in methionyl-tRNA synthetase by molecular dynamics simulations and structure network analysis On the relationship between NMRderived amide order parameters and protein backbone entropy changes Interaction energy based protein structure networks Structure analysis of the receptor binding of 2019-nCoV. Biochemical and biophysical research Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor Improved side-chain torsion potentials for the Amber ff99SB protein force field A modified TIP3P water potential for simulation with Ewald summation Combining conformational flexibility and continuum electrostatics for calculating pK(a)s in proteins PROPKA3: Consistent Treatment of Internal and Surface Residues in Empirical pKa Predictions Canonical sampling through velocity rescaling. The Journal of chemical physics Polymorphic transitions in single crystals: A new molecular dynamics method A smooth particle mesh Ewald method Finding the K Shortest Loopless Paths in a Network A note on two problems in connexion with graphs g_mmpbsa--a GROMACS tool for high-throughput MM-PBSA calculations The authors declare no conflict of interest.