key: cord-0842074-hmzf94fr authors: Nguyen, Kien; Chakraborty, Srirupa; Mansbach, Rachael A.; Korber, Bette; Gnanakaran, S. title: Exploring the role of glycans in the interaction of SARS-CoV-2 RBD and human receptor ACE2 date: 2021-03-31 journal: bioRxiv DOI: 10.1101/2021.03.30.437783 sha: 539cd941a8455bedae21d155d5137b502a7eb397 doc_id: 842074 cord_uid: hmzf94fr COVID-19 is a highly infectious respiratory disease caused by the novel coronavirus SARS-CoV-2. It has become a global pandemic and its frequent mutations may pose new challenges for vaccine design. During viral infection, the Spike RBD of SARS-CoV-2 binds the human host cell receptor ACE2, enabling the virus to enter the host cell. Both the Spike and ACE2 are densely glycosylated, and it is unclear how distinctive glycan types may modulate the interaction of RBD and ACE2. Detailed understanding of these determinants is key for the development of novel therapeutic strategies. To this end, we perform extensive all-atom simulations of the (i) RBD-ACE2 complex without glycans, (ii) RBD-ACE2 with oligomannose MAN9 glycans in ACE2, and (iii) RBD-ACE2 with complex FA2 glycans in ACE2. These simulations identify the key residues at the RBD-ACE2 interface that form contacts with higher probabilities, thus providing a quantitative evaluation that complements recent structural studies. Notably, we find that this RBD-ACE2 contact signature is not altered by the presence of different glycoforms, suggesting that RBD-ACE2 interaction is robust. Applying our simulated results, we illustrate how the recently prevalent N501Y mutation may alter specific interactions with host ACE2 that facilitate the virus-host binding. Furthermore, our simulations reveal how the glycan on Asn90 of ACE2 can play a distinct role in the binding and unbinding of RBD. Finally, an energetics analysis shows that MAN9 glycans on ACE2 decrease RBD-ACE2 affinity, while FA2 glycans lead to enhanced binding of the complex. Together, our results provide a more comprehensive picture of the detailed interplay between virus and human receptor, which is much needed for the discovery of effective treatments that aim at modulating the physical-chemical properties of this virus. To discuss our results in view of recent experimental structures, we used the contact definition 130 described above and identified all RBD-ACE2 contacts that are present in PDB IDs: 6M0J [5] , 131 6M17 [6] , and 6VW1 [7] . We find that these structures together implicate a total of 42 con- (Table S1) . Notably, all 25 persistent contacts captured by the simulations 136 represent a subset of those 42 experimentally-reported interactions. Thus, our analysis demon-137 strates that while recent structures have revealed the possible interactions, the present calculations 138 quantify their relative involvement, which can aid in estimating the potential impact of mutations. 139 We note that previous shorter simulations (∼ 5 µs) [8] implicated a significantly higher number of 140 persistent contacts than identified by our long simulations (∼ 225µs). Such discrepancy highlights 141 the importance of extensive sampling as performed here to obtain accurate statistics. Using large 142 sets of simulation data, we have provided a more precise evaluation of RBD-ACE2 contacts, and 143 have additionally shown that these interactions are not perturbed by the presence of MAN9 or FA2 glycans. Below, we demonstrate how our contact data may be applied to evaluate potential con-145 sequences of specific mutations, such as N501Y in the recently prevalent B.1.1.7 strain from the 146 UK. 147 Simulated interactions help assess the effects of mutations, such as N501Y in RBD 148 The detailed contact interactions, as revealed by our simulations (Figure 2a with increased affinity to host receptor [3] , but the molecular factors responsible for this is unclear. 153 Since the N501Y mutation site is at the RBD-ACE2 binding interface (Figure 3a ), we will examine 154 how this mutation may modulate the virus-host interaction, a factor that can determine infectivity. 155 For this, we performed in-silico mutagenesis of N501Y in the RBD-ACE2 complex, using the 156 FoldX software [19] . Utilizing the interaction statistics from Figure 2a as basis, together with a 157 structural and chemical-physical analysis, we argue below how the new N501Y strain facilitates 158 the binding of RBD to ACE2, which may contribute to enhanced infectivity. 159 To describe the consequences of N501Y, we first identify all ACE2 residues that frequently 160 interact with N501, the mutation site in RBD. From the simulations, N501 forms contacts with 161 Y41 and K353 of host ACE2 in 74% and 93% of the sampled configurations, respectively (see 162 Figure 2a and Table S1 ). The significant interactions provide reason to specifically focus on these 163 two host residues and assess how their contact formation with the N501Y mutation may alter 164 stability ( Figure 3b ). For this, we note that N501Y would enable the formation of stabilizing cation-165 π interaction with K353 of ACE2. In addition, the longer side chain of N501Y (relative to N501) 166 may facilitate the intermolecular hydrogen-bonding between the OH group of N501Y (tyrosine) 167 and ACE2 residues, including K353 (Figure 3b ). It has been shown that hydrogen bonds by tyrosine 168 OH groups can be a significant contributor to stability [20] . Hence, a tyrosine (N501Y) side chain 169 that allows for more favorable hydrogen bonds between molecules would further facilitate binding. 170 Moreover, the N501Y mutation introduces additional stabilizing contributions: i.e. the aromatic-171 aromatic interaction between N501Y and Y41 (i.e. π-stacking; Figure 3b ). Notably, N501Y would 172 lead to enhanced hydrophobic effects in the inside of the binding surface. This mutation would therefore create a more protein core-like environment, which stabilizes the RBD-ACE2 coupling. to what extent glycans of the host may interact with viral RBD. Understanding precisely which 182 glycans form close contacts with RBD will provide a more comprehensive view of the factors 183 determining affinity and may suggest additional strategies to modulate binding. To describe RBD-glycan interaction, we calculated the probabilities of RBD residues forming 185 contacts with any glycan that is attached to ACE2 (Figure 4 ). This analysis was performed for the 186 simulation sets that either included MAN9 (cf. Figure 1b) or FA2 glycans (cf. Figure 1c ) on ACE2. In both MAN9 and FA2 simulations (Figures 4a,b) , ACE2 glycans on Asn53, Asn90, Asn103, 188 and Asn322 form contacts with the RBD. Of these four glycans, the ones on Asn53, Asn103, and 189 Asn322 have lower contact probabilities (i.e. P ≤ 30%), suggesting that their interaction with 190 RBD is more transient despite their proximity to the binding surface. In stark contrast, the remain-191 ing glycan on Asn90 of ACE2 contacts the RBD with distinctly higher probabilities (P ≥ 70%), The prominent role of the Asn90 glycan as elucidated by our long, unrestrained simulations 200 is consistent with previous atomic force microscopy (AFM) measurements and steered molecular 201 dynamics (SMD) simulations (∼ 270 ns) [16] . In that study, AFM and SMD analyses suggest that 202 the Asn90 glycan can affect the association and disassociation of RBD and ACE2. As shown in 203 Figure 4 , the frequent contacts between Asn90 glycan and RBD implies that there are significant 204 steric effects associated with this glycan. Since RBD-ACE2 interface interactions are not perturbed 205 by glycans (as discussed above), the sterics of Asn90 glycan would hinder the unbinding of RBD, 206 hence stabilizing the RBD-ACE2 complex. This result helps explain why, in AFM unbinding ex-207 periments [16] , separating RBD and ACE2 requires higher forces when ACE2 glycans are not 208 removed. For the case when ACE2 is not bound to RBD, the excluded volume of the Asn90 glycan 209 is likely to interfere with the binding interface, thereby impeding the association of RBD. Consis-210 tent with this interpretation, mutation studies have shown that removal of the Asn90 glycan leads 211 to increased RBD binding events [17, 18] . Together, our simulations provide a mechanistic basis 212 for how the sterics of Asn90 glycan can impede the unbinding and binding of RBD. In previous SMD simulations [16] , which reported on glycan-RBD interactions, all glycans of 214 ACE2 were modeled as N-glycan core pentasaccharide, which is a minimum structure for all N-215 glycans. Thus, the results from Ref. [16] may not apply to specific glycan types that have distinct 216 structures or chemical-physical properties. Here, we specifically included MAN9 or FA2 glycans 217 in ACE2 in our simulations and have shown that they are associated with comparable glycan-218 RBD contact interactions (Figure 4a,b) . While the contacts made with RBD may be similar for 219 MAN9 and FA2, different chemical-physical properties of the glycans can lead to distinct binding 220 energetics of the RBD-ACE2 complex, as will be discussed in a later section of this manuscript. Interactions between Asn343 glycan of RBD and glycans of ACE2 may contribute to affinity 222 In addition to protein-protein and protein-glycan interactions, glycan-glycan interactions can 223 also be a contributor to RBD-ACE2 binding, and thus, infectivity. On the Spike RBD, we mod-224 eled the glycan at Asn343 as a complex FA2 glycoform, in accordance with previous experimental 225 studies [9, 21] . In our simulations, the Asn343 glycan of RBD forms contacts with the Asn53 and 226 Asn322 glycans of ACE2 in 20% and 40% of the sampled configurations (regardless of ACE2 227 glycan type, see Figure S3 ), suggesting that these interactions may be a determinant of the stabil-228 ity of the complex. Consistent with this notion, SARS-CoV-2 pseudovirus essays have shown that 229 removal of the RBD-Asn343 glycan leads to a 20-fold decrease in infectivity [11] . The functional 230 relevance of the RBD-Asn343 glycan may be reflected by the fact that this glycan is highly con-231 served in current GISAID SARS-CoV-2 sequences. Specifically, it is lost in only 3 out of 313, 826 232 sequences, according to cov.lanl.gov Spike alignment (as of January 16, 2021) . Interestingly, the 233 RBD-Asn343 glycan forms more frequent contacts only with the two glycans of ACE2 mentioned 234 above (i.e. the ones at Asn53 and Asn322), but does not form significant interactions with any pro-235 tein residues of ACE2 (less than 10% contact probabilities, see Figure S3 ). This suggests that the 236 RBD-Asn343 glycan may contribute to infectivity, as implicated by [11] , predominantly through 237 interactions with the Asn53 and Asn322 glycans of ACE2 ( Figure S3 ). calculation, the configurational entropy was estimated using a quasi-harmonic approach. We found 249 that the relative changes in entropy are similar for the MAN9 and FA2 simulations, and therefore, 250 did not include them in the binding energy calculations (see SI Methods and Figure S4a ). Hence, 251 the MM-PBSA energies discussed below can be used to approximate relative changes in binding 252 energetics, allowing us to infer how each glycan type can influence RBD-ACE2 affinity. type. It has been suggested that the ACE2 binding surface is overall negatively charged, while the 297 corresponding RBD interface is positive [24] . Here, we further determined that ACE2 and RBD binding [25, 26] . To develop effective strategies for treating COVID-19, one requires thorough insights into the 309 factors that determine the binding between viral RBD and human ACE2. To this end, recent struc-310 tural analyses [5] [6] [7] and mutational experiments [3, 4] have alluded to possible RBD-ACE2 in-311 teractions, as well as the role that glycans may play during this process. To extend these initial the Asn90 glycan is present [16] . On the other hand, when ACE2 is not in complex with RBD, the 343 significant steric presence of Asn90 glycan indicates that it would obstruct the binding site, thereby 344 hindering the RBD from associating with ACE2. This notion agrees with mutational experiments 345 demonstrating that removal of the Asn90 glycan leads to increased RBD binding events [17, 18] . 346 Here, by uncovering the prominent role of Asn90 glycan in forming contacts with RBD, our sim- [25, 29] . Sialation has also been known to improve thermal stability and solubility [30, 31] , 357 as well as better recognition of "self" versus "non-self" by Siglecs [32] , which are all favored 358 features for effective immunogens. Together, our simulations, along with these supporting experi- In all models used for simulations, the SARS-CoV-2 RBD domain is defined by residues T333-376 G526, and the ACE2 receptor by S19-D615. To avoid artificial charges at the protein ends, we 377 introduced N-terminal acetylated and C-terminal N-methylamide capping groups. The ACE2 struc-378 ture contains a zinc ion that is coordinated by H374, H378, E402, and one water molecule [5, 36] . Figure 1a ; see Table S1 for list of contacts). To elucidate the role of glycans, we repeated this analysis for simulations that either include (b) MAN9 (cf. Figure 1b) or (c) FA2 glycans (cf. Figure 1c) Structural representation of the RBD-ACE2 binding interface highlighting the residues that form persistent contacts (cyan in ACE2 and ice blue in RBD). Persistent contacts are those that form with at least 60% probability (see Table S1 ). Figure 2a) . Based on this, we assess how the N501Y mutation may alter the interactions with these ACE2 residues. As shown by dashed lines and their labels, N501Y would introduce additional stabilizing interactions with Y41 and K353 of ACE2, which will increase the binding affinity between RBD and ACE2. The ACE2 receptor has 7 potential N-glycosylation sites (PNGS), N53, N90, N103, N322, N432, N546, and N690. Of these, the first six glycans are present in the ACE2 residue range used in our simulations (i.e. S19-D615). It remains to be seen exactly how different glycan types at these sites affect ACE2 association with the viral RBD. It is known that post-translational glycan modifications are strongly dependent on expression cell lines and their glycosylation enzyme repertoire [1, 2] . Unfortunately, all currently available ACE2 studies were done using recombinant proteins expressed in non-native cells. This prevents a definite determination of native glycosylation pattern on the ACE2 receptor and their role in RBD binding. Literature suggests that DC-SIGN and L-SIGN lectins act as enhancer factors that facilitate ACE2 mediated virus infection [3] . These specifically recognize high-mannose glycans [4] , indicating that at least those glycans on ACE2 interacting with these lectins occur in oligomannose form. On the other hand, Zhao et al. [5] had previously applied sequential exo-glycosidase digestion to identify mainly biantennary N-linked glycans with sialylation and core fucosylation. Recently, Shajahan et al. [6] performed site specific mass spectrometry analysis of human ACE2 to indicate predominantly complex type glycosylation, with 60% biantennary, 85% fucosylated, and about half of them as sialated structures. Moreover, negatively charged sialic acids extensively found on complex glycans have been reported to play critical roles in viral Spike interaction [7] . A thorough understanding of the effects of glycosylation is thus necessary. Since each PNGS can have a varying distribution of glycan occupancies [6] , we modeled two divergent forms of N-glycans on the different ACE2 sites, namely the unprocessed 9-mannose (MAN9) oligomer and the enzymatically processed fucosylated 2-antennae type complex glycan (FA2) with commonly expected 2-3 linked [6] sialic acid tips. Since FA2 glycan type has been shown to be the major glycoform at the Spike glycosylation site 343 [8, 9] , this was selected as the glycan choice for RBD for all glycan-included simulations. 40 initial configurations were modeled for the MAN9 and FA2 simulations (i.e. 20 for each simulation set; cf. main text Figure 1b,c) . These initial configurations were prepared based on different RBD-ACE2 configurations taken from preliminary glycan-free trajectories. Glycan structures were built at the PNGS, with random orientations, using the ALLOSMOD package [10] of MODELLER [11] . This was succeeded by short simulated annealing with the protein backbone restrained to relax the glycosylated systems at different conformations, with the CHARMM36m forcefield [12] , following the glycoprotein modeling pipeline developed previously by our group [13, 14] . The CHARMM pdb and psf files were converted to AMBER format using the CHAMBER command available in the PARMED module of AMBERTOOLS 16 [15, 16] . Following the steps described, 20 different glycoprotein configurations were obtained for each of the MAN9 and FA2 glycosylated ACE2 systems, which were used for the 40 individual glycosylated trajectories of all-atom explicit-solvent simulations performed with the AMBER 16 software [15] . The binding energy between RBD and ACE2 was approximated using the Molecular Mechanics Poisson-Boltzmann Surface Area (MM-PBSA) method [17, 18] . To apply this method, we used the MMPBSA.py script [19] within AMBERTOOLS 16. MM-PBSA estimates the binding energy (∆G bind ) from the molecular mechanical energy (∆E MM ), solvation free energy (∆G sol ) and conformational entropy (∆S) as: T is the temperature; ∆E int is the internal energy from the sum of bond, angle, and dihedral terms; ∆E vdw is the van der Waals energy; ∆E ele is the electrostatic energy; ∆G PB is the electrostatic solvation free energy computed by the Poisson-Boltzmann (PB) method [20] ; and ∆G SA is nonpolar solvation free energy proportional to solvent accessible surface area and cavitation terms. ∆G SA implicitly includes the solvent entropy approximation by virtue of parameterization [21, 22] . Because the PB calculations are computationally very costly, 4 sets of randomly selected 200 snapshots were used from the complete ensembles for these calculations in order to obtain robust sampling and standard errors. calculate relative affinity changes and their agreement with experimental values [23] . The conformational entropy change (∆S) is not included in the present MM-PBSA analysis since entropy calculations are typically error-prone [24, 25] and have convergence difficulties [26, 27] . It has been shown that the inclusion of entropic terms from quasi-harmonic approximation provided no meaningful improvement in the agreement between the predicted and experimental energies, whereas other methods of conformational entropy calculations such as harmonic approximation entropies reduced the correlation [23] . Here, the quasi-harmonic conformational entropy was calculated from the eigenvectors of complete covariance matrix in GROMACS v5.1.2 [28] ( Figure S4a ). Since the relative changes in conformational entropy are similar between the MAN9 and FA2 simulations ( Figure S4a ), we did not included them in our binding energy calculations. Electrostatic potential calculation was performed using the Adaptive Poisson Boltzmann Solver [29] . Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glyco-444 protein SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is SUPPORTING FIGURES AND TABLES Schematic representation of N-glycans. Oligomannose (MAN9) and complex glycan (FA2) as used in the simulations, represented in Symbol Nomenclature for Glycans (SNFG) schematic. The intersugar connectivities are shown Impact of host cell line choice on glycan profile Appropriate glycosylation of recombinant proteins for human use: Implications of choice of expression system Specific asparagine-linked glycosylation sites are critical for dc-sign-and l-sign-mediated severe acute respiratory syndrome coronavirus entry Structural basis for selective recognition of oligosaccharides by dc-sign and dc-signr Inhibition of endoplasmic reticulum-resident glucosidases impairs severe acute respiratory syndrome coronavirus and human coronavirus nl63 spike protein-mediated entry by altering the glycan processing of angiotensin iconverting enzyme 2 Comprehensive characterization of N-and O-glycosylation of SARS-CoV-2 human receptor angiotensin converting enzyme 2 Distinct roles for sialoside and protein receptors in coronavirus infection Site-specific glycan analysis of the SARS-CoV-2 spike Deducing the Nand O-glycosylation profile of the spike protein of novel coronavirus SARS-CoV All-atom ensemble modeling to analyze smallangle x-ray scattering of glycosylated proteins Comparative protein structure modeling using modeller CHARMM36m: An improved force field for folded and intrinsically disordered proteins Visualization of the hiv-1 env glycan shield across scales Quantification of the resilience and vulnerability of hiv-1 native glycan shield at atomistic detail The amber biomolecular simulation programs Continuum solvent studies of the stability of DNA, RNA, and phosphoramidate-DNA helices Calculating structures and free energies of complex molecules: Combining molecular mechanics and continuum models MMPBSA.py: An efficient program for end-state free energy calculations Classical electrostatics in biology and chemistry MM-PBSA captures key role of intercalating water molecules at a protein-Protein interface Accurate predictions of nonpolar solvation free energies require explicit consideration of binding-site hydration Assessing the Performance of MM/PBSA, MM/GBSA, and QM-MM/GBSA Approaches on Protein/Carbohydrate Complexes: Effect of Implicit Solvent Models, QM Methods, and Entropic Contributions Dissecting Protein Configurational Entropy into Conformational and Vibrational Contributions Use of MM-PBSA in reproducing the binding free energies to HIV-1 RT of TIBO derivatives and predicting the binding mode to HIV-1 RT of efavirenz by docking and MM-PBSA Distinct glycan topology for avian and human sialopentasaccharide receptor analogues upon binding different hemagglutinins: A molecular dynamics perspective Molecular dynamics analysis of antibody recognition and escape by human h1n1 influenza hemagglutinin Gromacs: Fast, flexible, and free Electrostatics of nanosystems: Application to microtubules and the ribosome Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor Structural basis for the recognition of SARS-CoV-2 by full-length human ACE2 Structural basis of receptor recognition by SARS-CoV-2