key: cord-0701002-so70e6ut
authors: Oliveira, A. Sofia F.; Shoemark, Deborah K.; Ibarra, Amaurys Avila; Davidson, Andrew D.; Berger, Imre; Schaffitzel, Christiane; Mulholland, Adrian J.
title: The fatty acid site is coupled to functional motifs in the SARS-CoV-2 spike protein and modulates spike allosteric behaviour
date: 2021-06-09
journal: bioRxiv
DOI: 10.1101/2021.06.07.447341
sha: 5ca617138cffebc9a69770ce97d455f5995099a8
doc_id: 701002
cord_uid: so70e6ut

The SARS-CoV-2 spike protein is the first contact point between the SARS-CoV-2 virus and host cells and mediates membrane fusion. Recently, a fatty acid binding site was identified in the spike (Toelzer et al. Science 2020). The presence of linoleic acid at this site modulates binding of the spike to the human ACE2 receptor, stabilizing a locked conformation of the protein. Here, dynamical-nonequilibrium molecular dynamics simulations reveal that this fatty acid site is coupled to functionally relevant regions of the spike, some of them far from the fatty acid binding pocket. Removal of a ligand from the fatty acid binding site significantly affects the dynamics of distant, functionally important regions of the spike, including the receptor-binding motif, furin cleavage site and fusion-peptide-adjacent regions. The results also show significant differences in behaviour between clinical variants of the spike: e.g. the D614G mutation shows a significantly different conformational response for some structural motifs relevant for binding and fusion. The simulations identify structural networks through which changes at the fatty acid binding site are transmitted within the protein. These communication networks significantly involve positions that are prone to mutation, indicating that observed genetic variation in the spike may alter its response to linoleate binding and associated allosteric communication.
Figure 1. Cryo-EM structure of the ectodomain of the SARS-CoV-2 spike trimer with linoleic acid (LA) bound to the fatty acid-binding sites (17). (A) Three-dimensional structure of the complex of the locked (all RBMs occluded) ectodomain of the SARS-CoV-2 spike trimer with linoleic acid (PDB code: 6ZB5) (17). The spike protein is a homotrimer (16): each monomer is shown in a different colour, namely green, orange and blue. LA molecules are highlighted with spheres. Each fatty acid (FA) binding site is located at the interface between two neighbouring monomers, and is formed by residues from two adjacent receptor-binding domains. (B) Detailed view of the FA binding site: this pocket is lined by hydrophobic and aromatic residues and the LA acidic headgroup is close to R408 and Q409.

Each monomer is formed of three domains: a large ectodomain, a transmembrane anchor and a short cytoplasmic tail (16). The ectodomain comprises two subunits: S1 is responsible for binding to ACE2 (6, 16), and S2 for viral-host membrane fusion (16, 18). The SARS-CoV-2 spike contains two proteolytic cleavage sites (16): one 'furin protease recognition' site at the S1/S2 boundary, thought to activate the protein (19), and a second in the S2 subunit (S2') that releases the fusion peptide (16, 18). The SARS-CoV-2 spike contains three free fatty acid (FA) binding sites, each located at the interface between two neighbouring receptor-binding domains (Figures 1A and S3) (17). The FA binding sites are lined by aromatic and hydrophobic residues (Figure 1B) and a positively charged residue from a neighbouring monomer, namely R408, which acts as an anchor for the FA carboxylate headgroup (17). The open spike conformation, with at least one RBD pointing upwards, is needed to interact with ACE2 receptors on the human host cell. It was shown by surface plasmon resonance that the presence of the FA linoleic acid (LA) reduces binding of the spike to ACE2 (17).
In agreement, LA stabilises the locked spike conformation, in which the RBM is occluded and cannot bind to the human ACE2 receptor (17), but there is no obvious connection between the FA sites and other structural motifs relevant for membrane fusion, or with antigenic epitopes. MD simulations showed persistent and stable interactions between LA and the spike trimer (17, 20). These simulations also revealed that LA rigidifies the FA binding site, and these effects extend to the N-terminal domain (20). The cryo-EM structure of the spike from pangolin coronavirus (which is closely related to SARS-CoV-2) shows that the spike also binds LA in an equivalent FA pocket (21). An equivalent FA binding site was also found on the Novavax SARS-CoV-2 construct expressed and purified from insect cells (22).

Simulations, particularly atomistic molecular dynamics (MD) simulations, have provided crucial atomic-level insight into the structure, dynamics and interactions of the SARS-CoV-2 spike (13, 17, 20, 23-28). Here, we apply dynamical-nonequilibrium MD simulations (29-31) to investigate the response of the SARS-CoV-2 spike to LA removal. We have shown this approach to be effective in identifying structural communication pathways in a variety of proteins, e.g. in identifying a general mechanism of interdomain signal propagation in nicotinic acetylcholine receptors (32, 33) and mapping the networks connecting the allosteric and catalytic sites in two clinically relevant β-lactamase enzymes (34). This approach starts from equilibrium simulations of the system in question, which generate configurations for multiple dynamical-nonequilibrium simulations in which the effect of a perturbation can be studied. Running a large number of nonequilibrium simulations allows the statistical significance of the observed structural response to be determined.
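The sampling step described above can be illustrated with a minimal Python sketch. It assumes that starting configurations are taken at evenly spaced intervals from each equilibrium trajectory after an initial segment has been discarded. The function name `pick_start_frames`, the frame count, the discarded fraction and the number of starts per trajectory are hypothetical choices made for illustration, not values taken from this work.

```python
import numpy as np


def pick_start_frames(n_frames, n_starts, skip_fraction=0.25):
    """Return evenly spaced frame indices from one equilibrium trajectory.

    n_frames:      number of saved frames in the equilibrium trajectory.
    n_starts:      number of nonequilibrium runs to launch from it.
    skip_fraction: leading part of the trajectory that is not sampled
                   (hypothetical value, chosen only for this example).
    """
    first = int(n_frames * skip_fraction)
    if n_starts > n_frames - first:
        raise ValueError("not enough frames after the discarded segment")
    # Evenly spaced positions between the first usable and the last frame.
    idx = np.linspace(first, n_frames - 1, n_starts)
    return np.round(idx).astype(int)


# Hypothetical setup: three equilibrium trajectories of 2000 frames each,
# 30 starting configurations drawn from every trajectory.
starts = {name: pick_start_frames(2000, 30) for name in ("eq1", "eq2", "eq3")}
total = sum(len(frames) for frames in starts.values())
print(total, starts["eq1"][:3])
```

Each selected frame would then serve twice: once as the continuation of the unperturbed run and once as the starting point of the perturbed run, so that the two can be compared at matching times.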
Dynamic response of the wild-type spike

A model of the locked wild-type spike was created from the cryo-EM structure (PDB code: 6ZB5) of the SARS-CoV-2 spike protein bound to three linoleate molecules (17). Missing loops were built to generate the wild-type sequence according to the UniProt accession number P0DTC2 for the unglycosylated ectodomain of the spike bound with LA (for details, see Supplementary Material). The locked structure had 42 disulphides per trimer that remained intact and faithfully retained the structure and overall fold of the cryo-EM structure over the equilibrium simulation time (17, 20). Three equilibrium MD simulations (Figure S1), 200 ns each, were performed for the locked form of the unglycosylated and uncleaved (no cleavage at the S1/S2 interface) ectodomain of the spike bound with LA and used as starting points for 90 dynamical-nonequilibrium simulations (Figure S2). Here we have used models of the uncleaved spike ectodomains in order to detect any potential effects on structurally distant sites influenced by ligand in the FA sites in the intact spike. In these nonequilibrium simulations, all LA molecules were (instantaneously) annihilated. This triggers a response of the protein, as it adapts to LA removal (top panel in Figure S2). This annihilation is carried out for multiple configurations sampled from equilibrium MD, and the comparison between the equilibrium and short dynamical-nonequilibrium MD trajectories identifies the structural response of the protein. Running multiple (in this case, 90) dynamical-nonequilibrium simulations reduces the noise associated with the structural response of the protein and allows for the determination of the statistical significance of the observed response. Nonequilibrium simulations of this type are emerging as an effective tool to study signal transmission and identify communication networks within proteins (32-36).
Here, the direct comparison between the equilibrium LA-bound and nonequilibrium apo spike simulations using the Kubo-Onsager approach (29-31) (bottom panel in Figure S2), and the average of the results over all the 90 replicates, allows for identification of the temporal sequence of conformational changes associated with the response of the spike to LA removal (Figures 2 and S4), and also the determination of their statistical significance (Figure S5).

The structure that we simulate here corresponds to the unglycosylated wild-type spike (UniProt accession number P0DTC2), not cleaved at the 'furin protease recognition' site. It was built based on the cryo-EM structure that originally revealed the FA binding site (PDB code: 6ZB5) (17). Although a few glycans (e.g. at positions N165, N234, and N343) have been shown to be involved in the spike infection mechanism by altering the dynamics of receptor binding domain opening (23, 37), and the glycan shield plays a vital role in the biological function of the spike, the internal networks and response of the protein scaffold identified here are not likely to be qualitatively altered by the glycans, which predominantly cover the exterior of the spike. As Casalino et al. have shown, glycan dynamics are fast relative to the dynamics of the protein (23). Note that the perturbation introduced here (LA annihilation) is not intended to mimic the physical process of LA (un)binding, but rather to promote a rapid response and force signal transmission within the protein, thus explicitly mapping the mechanical and dynamic coupling between the structural elements involved in this response. Note also that, due to the non-physical nature of the perturbation, the timescales observed for the protein's response do not represent the physical timescales of conformational change; however, the responses of similar systems (e.g. wild-type and D614G spike) can be meaningfully compared.
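As a rough illustration of the comparison described above, the sketch below computes, for each residue and each time point, the distance between its position in the perturbed run and in the paired unperturbed run, then averages over replicates and reports a standard error of the mean. It assumes that Cα coordinates have already been extracted into NumPy arrays in a common reference frame. The choice of Cα atoms, the array layout, the function name `mean_response` and the synthetic input are assumptions made for this example, not details confirmed by the text.

```python
import numpy as np


def mean_response(eq_xyz, neq_xyz):
    """Replicate-averaged positional deviation between paired trajectories.

    eq_xyz, neq_xyz: arrays of shape (n_replicas, n_times, n_residues, 3)
        with coordinates from the unperturbed and perturbed runs, sampled
        at the same times after the perturbation (assumed layout).
    Returns (mean, sem), each of shape (n_times, n_residues).
    """
    eq_xyz = np.asarray(eq_xyz, dtype=float)
    neq_xyz = np.asarray(neq_xyz, dtype=float)
    if eq_xyz.shape != neq_xyz.shape:
        raise ValueError("paired trajectories must have identical shapes")
    # Distance between matched atoms at matched times, per replica.
    dev = np.linalg.norm(neq_xyz - eq_xyz, axis=-1)
    n_rep = dev.shape[0]
    mean = dev.mean(axis=0)
    # Standard error over replicas, used here as a simple significance guide.
    sem = dev.std(axis=0, ddof=1) / np.sqrt(n_rep)
    return mean, sem


# Synthetic data only: 90 replicas, 5 time points, 10 residues.
rng = np.random.default_rng(0)
eq = rng.normal(size=(90, 5, 10, 3))
neq = eq.copy()
# Residue index 3 drifts along x by an amount that grows with time.
neq[:, :, 3, 0] += np.array([0.0, 0.1, 0.2, 0.3, 0.4])
mean, sem = mean_response(eq, neq)
print(mean[:, 3])
```

Because each perturbed run is subtracted from the unperturbed run that started from the same configuration, fluctuations common to both cancel at short times, which is why averaging over many pairs isolates the response to the perturbation.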
The RBMs respond rapidly to LA removal (Figures 3 and S9 -S10), due to their close proximity to the FA sites. After 0.1 ns, significant structural rearrangements are already apparent in the RBMs, mainly in the A475-C488 segment (Figures 3 and S9-S10 ). Subsequently, a gradual increase in deviations is observed for A475-C488. The RBM lies between the β4 and β7 strands of the RBD and contains most of the residues that directly interact with ACE2 (38, 39) . This motif is one of the most variable regions of SARS-CoV-2 spike (40) and a major target for neutralising antibodies (41) (42) (43) . The NTDs also show a fast and significant response to LA removal, in particular H146-E156 and L249-G257 (Figures 4 and S11-S12). The NTD of the spike is a surface-exposed domain structurally linked to the RBD of a neighbouring monomer (17, 38) . Although directly coupled to the RBD, the NTD does not bind to ACE2 (5, 6) and its function in SARS-CoV-2 infection remains unclear. The spike NTDs of other related coronaviruses have been suggested to play a role in infection (44) (45) (46) and are known epitopes for neutralising antibodies (46, 47) . Human antibodies targeting the NTD of the SARS-CoV-2 spike have been isolated from convalescent COVID-19 patients (e.g. (47) (48) (49) ) and this region was shown to be a super-antigenic site (50) . A cryo-EM structure of the complex between the spike and the 4A8 monoclonal antibody shows that the NTD loops L141-E156 and R246-A260 (two of the regions that show the largest responses to LA removal in Figures 4 and S11-S12) directly mediate the interaction between the proteins (48) . Both of these loops are candidates for vaccine and therapeutic developments (48) . The conformational changes in the H146-E156 and L249-G257 segments are further transmitted, over the following 5 ns, to other parts of the NTD, namely S71-R78, N122-N125 and F175-F186. 
The N122-N125 segment is a conserved NxxN sequence motif present in the NTD of spikes from several coronaviruses, and its function remains unknown (51). The F175-F186 region is located immediately before a recently identified epitope for human antibodies (49). The S71-R78 segment is part of the GTNGTKR insertion shared by the SARS-CoV-2 and bat-CoV RaTG13 spikes but not the SARS-CoV spike (51). This motif, which is also found in structural proteins of several other viruses, and proteins from other organisms, has been suggested to allow the SARS-CoV-2 spike to bind to other receptors besides ACE2 (51). The coupling identified here between the FA site and specific regions of the NTD is remarkable and highlights the complex allosteric connections within the spike, with distant sites apparently able to modulate the response of the NTDs.

Both the furin cleavage site and V622-L629 region, which are more than 40 Å away from the FA site, respond notably to the removal of LA (Figures 5 and S13-S14). Both regions respond rapidly, with a significant conformational response observed almost immediately after LA removal. The furin site is located at the boundary between the S1 and S2 subunits (16, 17) and furin cleavage is thought to be important for the activation of the spike (19). This site contains a polybasic PRRA insertion not found in other SARS-CoV-related coronaviruses (52). Cell-based assays show that deletion of the PRRA motif affects virus infectivity (19, 52-56). Note that in the simulations presented here, the furin site (located between R685 and S686) is not cleaved (see Supplementary Material). The furin site and V622-L629 regions are among the spike regions most affected by LA removal and show increasingly large deviations (larger than most other loop regions of the protein) over the simulations.
The conformational changes in these regions propagate to segments immediately adjacent to the fusion peptide, namely the downstream FPPR and the upstream D808-S813. The FPPR is a ~25-residue segment located in S2 immediately downstream of the fusion peptide which has been suggested to play an essential role in the structural transitions between pre- and post-fusion conformations of the spike (18). The D808-S813 region is located upstream of the fusion peptide, immediately preceding the S2' protease recognition and cleavage site (R815) (54). Both proteolytic sites in the SARS-CoV-2 spike are known epitopes for neutralising antibodies (57, 58). The close connection between the furin site, V622-L629, and the regions adjacent to the FP, identified here for the intact, wild-type spike is remarkable. Due to this crosstalk, mutations in or close to the furin site or V622-L629 are likely to affect signal transmission to the FP-surrounding regions, i.e. the FPPR and the S2' cleavage site. This is worthy of experimental investigation.

We also performed equilibrium and dynamical-nonequilibrium simulations of the D614G mutant spike. The D614G mutation is now dominant in SARS-CoV-2 lineages circulating worldwide (59). In the original wild-type spike, D614 is located at the interface between monomers, with its sidechain directly interacting with residues across the subunit interface (55). RMSF profiles for the wild-type and D614G apo spikes are similar (Figure S15). However, unlike the wild-type with or without LA, one replicate of the D614G with LA bound exhibits enhanced fluctuation in the middle of the RBM corresponding to exposed loop residues Q474-N487 (Figure S15). Though the RBD in the closed conformation remains inaccessible for binding to ACE2, residues Q474-N487 of the RBM (shown in magenta in the insert in Figure S15F) may still provide a target for neutralizing antibodies in the closed conformation, depending on the degree of glycan shielding (65).
Our equilibrium MD simulations of the unglycosylated, uncleaved wild-type and D614G LA-bound spikes suggest that the D614G mutation may enhance RBM mobility. It is possible that an increase in flexibility in the RBM could influence the efficiency of glycosylation in this region (66), which could either increase or decrease the degree of glycan shielding, affecting epitope recognition. The trans-interface interactions of the carboxylate of D614 in the wild-type systems involve four potential candidate residues, K854, K835, Q836 and T859. In our simulations, T859 came Figure S16 ). An analogous analysis was performed on the D614G mutant to establish whether K854 makes alternative hydrogen-bond or salt-bridge contacts across the 3 subunit interfaces (averaged over 3 × 200 ns replicates) in the absence of a partnering D614 carboxylate. In the D614G spike, K854 fails to find any alternative salt-bridge and only occasionally comes within hydrogen-bonding distance of residues Q613 and N317 (Figure S18). This supports the inference drawn from cryo-EM structures of the head region of the D614G spike that this mutation disrupts the inter-monomer salt-bridge and hydrogen bond networks in this region, which may cause reduced stability of the trimer. This corresponds to the observation that the D614G mutant was mostly in an open conformation on the EM grids and suggests that loss of the D614-K854 interaction somehow destabilises the closed conformation (e.g. (67, 68)).

Dynamical-nonequilibrium simulations for the D614G variant were also performed to test whether the D614G mutation affects the response of the spike to LA. These simulations used as a starting point the distribution of conformations taken from the equilibrium simulations of the locked form of the unglycosylated and uncleaved D614G spike with LA bound. The same perturbation as for the wild-type spike was applied to the system, namely LA removal.
The Kubo-Onsager approach (29-31) was again used to extract the response of the system (Figure S19) and determine the statistical significance of the observed responses (Figure S20). In the D614G variant, there is notably less symmetry across the monomers in the response of the spike to LA removal, compared to the wild-type (Figures 6 and S21-S28). For instance, the amplitude of the structural response of the V622-L629 and furin site regions in monomer C (Figure S27) of the D614G is substantially smaller than in monomers B and A (Figures 6 and S28). The conformational responses of the wild-type and D614G spikes can be directly compared because the same perturbation was used for both in the dynamical-nonequilibrium simulations. The conformational response of the NTDs and RBDs to LA removal is generally similar in the wild-type and D614G spike (Figures S21-S26) with small variations in the amplitude of the structural rearrangements of some functional motifs, e.g. RBMs. However, the D614G mutation significantly affects inter-monomer communication, with reduction of signal transmission from the furin site and V622-L629 of one monomer to the FPPR of another (Figures 6 and S27-S28) compared to the wild-type (Figures 5 and S13-S14). In the D614G spike, only minor deviations of the FPPR are observed. Furthermore, the region located upstream of the fusion peptide, namely D808-S813, also shows different rates of signal propagation between the wild-type and D614G proteins (Figures 6 and S27-S28). The differences identified here may relate to functionally important differences between the wild-type and D614G spikes. The results here show that the D614G mutation alters the allosteric networks connecting the FA site to the regions surrounding the FP, particularly the FPPR. There is reduced communication between the monomers in the D614G spike. As noted above, the response of the D614G spike to LA is also less symmetrical than the wild-type.
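The per-monomer comparison discussed above can be sketched in a few lines of Python. The example assumes that a replicate-averaged deviation array of shape (time points, residues) is already available for each monomer, together with the residue numbers of its columns. The function name `region_amplitude` and all numerical values are hypothetical: the input is synthetic and is not data or results from this work.

```python
import numpy as np


def region_amplitude(mean_dev, residue_ids, first, last):
    """Average deviation over residues first..last (inclusive) at each time.

    mean_dev:    array of shape (n_times, n_residues) for one monomer.
    residue_ids: residue number of each column in mean_dev.
    """
    residue_ids = np.asarray(residue_ids)
    mask = (residue_ids >= first) & (residue_ids <= last)
    if not mask.any():
        raise ValueError("requested region is not present in residue_ids")
    return np.asarray(mean_dev, dtype=float)[:, mask].mean(axis=1)


# Synthetic per-monomer responses: residues 600-699, three time points.
residue_ids = np.arange(600, 700)
ramp = np.array([0.0, 0.5, 1.0])[:, None]
responses = {}
for monomer, peak in zip("ABC", (1.0, 1.0, 0.4)):
    dev = np.zeros((3, residue_ids.size))
    dev[:, 22:30] = peak * ramp  # columns 22-29 hold residues 622-629
    responses[monomer] = dev

# Region-averaged amplitude at the last time point, one value per monomer.
final = {
    monomer: region_amplitude(dev, residue_ids, 622, 629)[-1]
    for monomer, dev in responses.items()
}
print(final)
```

Reducing each region to one amplitude per monomer and time point makes departures from three-fold symmetry easy to tabulate. Any such difference should still be judged against the replicate-to-replicate spread before it is interpreted.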
Simulations of the D614G spike show that this mutation affects communication between the FA site and the FPPR and the S2' cleavage site. The D614G mutant shows reduced response of the FPPR and also a slower rate of signal propagation to the S2' cleavage site when compared to the wild-type protein (Movie 2). These results indicate that the D614G mutation affects the allosteric behaviour and the response to linoleic acid of the spike, which may be related to the changes in viral fitness associated with this mutation (69). The results here further highlight the potential of dynamical-nonequilibrium simulations for identifying pathways of allosteric communication (32-34) and suggest that this approach may be useful in analysing mutations and differences in functionally important dynamical behaviour, and possibly different effects of linoleic acid, between SARS-CoV-2 spike variants of clinical relevance.

) time under an award for COVID-19 research, the Bristol UNCOVER Group and the University of Bristol, for their support. I.B. acknowledges support from the EPSRC Future Vaccine Manufacturing and Research Hub (EP/R013764/1). C.S.
and I.B

An interactive web-based dashboard to track COVID-19 in real time
A pneumonia outbreak associated with a new coronavirus of probable bat origin
Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
Structural basis for the recognition of SARS-CoV-2 by full-length human ACE2
Structural and functional basis of SARS-CoV-2 entry by using human ACE2
The epidemiology and pathogenesis of coronavirus disease (COVID-19) outbreak
What we know so far: COVID-19 current clinical knowledge and research
Outcomes of cardiovascular magnetic resonance imaging in patients recently recovered from coronavirus disease 2019 (COVID-19)
Endothelial cell infection and endotheliitis in COVID-19
Multiorgan and renal tropism of SARS-CoV-2
Microvascular Injury in the Brains of Patients with Covid-19
Neuropilin-1 is a host factor for SARS-CoV-2 infection
A potential interaction between the SARS-CoV-2 spike protein and nicotinic acetylcholine receptors
The coronavirus spike protein is a class I virus fusion protein: structural and functional characterization of the fusion core complex
Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein
Free fatty acid binding pocket in the locked structure of SARS-CoV-2 spike protein
Distinct conformational states of SARS-CoV-2 spike protein
Characterisation of the transcriptome and proteome of SARS-CoV-2 reveals a cell passage induced in-frame deletion of the furin-like cleavage site from the spike glycoprotein
Molecular simulations suggest vitamins, retinoids and steroids as ligands of the free fatty acid pocket of the SARS-CoV-2 spike protein
Bat and pangolin coronavirus spike glycoprotein structures provide insights into SARS-CoV-2 evolution
Structural analysis of full-length SARS-CoV-2 spike protein from an advanced vaccine candidate
Beyond shielding: the roles of glycans in the SARS-CoV-2 spike protein
AI-driven multiscale simulations illuminate mechanisms of SARS-CoV-2 spike dynamics
A multiscale coarse-grained model of the SARS-CoV-2 virion
The flexibility of ACE2 in the context of SARS-CoV-2 infection
Biomolecular Simulations in the Time of COVID19, and After
Distant residues modulate conformational opening in SARS-CoV-2 spike protein
Thought-experiments by molecular dynamics
Computer simulation in material science
Non-equilibrium by molecular dynamics: a dynamical approach
Identification of the initial steps in signal transduction in the α4β2 nicotinic receptor: insights from equilibrium and nonequilibrium simulations
A general mechanism for signal propagation in the nicotinic acetylcholine receptor family
Allosteric communication in class A β-lactamases occurs via cooperative coupling of loop dynamics
Structural consequences of ATP hydrolysis on the ABC transporter NBD dimer: molecular dynamics studies of HlyB
F508del disturbs the dynamics of the nucleotide binding domains of CFTR before and after ATP hydrolysis
A glycan gate controls opening of the SARS-CoV-2 spike protein
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor
The circulating SARS-CoV-2 spike variant N439K maintains fitness while evading antibody-mediated immunity
Convergent antibody responses to SARS-CoV-2 in convalescent individuals
Isolation of potent SARS-CoV-2 neutralizing antibodies and protection from disease in a small animal model
Broad neutralization of SARS-related viruses by human monoclonal antibodies
Mouse hepatitis virus receptor as a determinant of the mouse susceptibility to MHV infection
Bat-to-human: spike features determining 'host jump' of coronaviruses SARS-CoV, MERS-CoV, and beyond
Structural definition of a neutralization epitope on the N-terminal domain of MERS-CoV spike glycoprotein
The N-terminal domain of spike glycoprotein mediates SARS-CoV-2 infection by associating with L-SIGN and DC-SIGN
A neutralizing human antibody binds to the N-terminal domain of the Spike protein of SARS-CoV-2
SARS-CoV-2 proteome microarray for mapping COVID-19 antibody interactions at amino acid resolution
Potent SARS-CoV-2 neutralizing antibodies directed against spike N-terminal domain target a single supersite
Role of the GTNGTKR motif in the N-terminal receptor-binding domain of the SARS-CoV-2 spike protein
A multibasic cleavage site in the spike protein of SARS-CoV-2 is essential for infection of human lung cells
Enhanced isolation of SARS-CoV-2 by TMPRSS2-expressing cells
SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor
Structure and mechanism of SARS-CoV-2 Spike N679-V687 deletion variant elucidate cell-type specific evolution of viral fitness
The furin cleavage site in the SARS-CoV-2 spike protein is required for transmission in ferrets
Identification of immunodominant linear epitopes from SARS-CoV-2 patient plasma
Two linear epitopes on the SARS-CoV-2 spike protein that elicit neutralising antibodies in COVID-19 patients
The coronavirus is mutating - does it matter?
The emerging spectrum of COVID-19 neurology: clinical, radiological and laboratory findings
The Spike D614G mutation increases SARS-CoV-2 infection of multiple human cell types
SARS-CoV-2 spike-protein D614G mutation increases virion spike density and infectivity
Tracking changes in SARS-CoV-2 spike: evidence that D614G increases infectivity of the COVID-19 virus
Evaluating the effects of SARS-CoV-2 spike mutation D614G on transmissibility and pathogenicity
Key residues of the receptor binding motif in the spike protein of SARS-CoV-2 that interact with ACE2 and neutralizing antibodies
Site-specific glycan analysis of the SARS-CoV-2 spike
Structural and functional analysis of the D614G SARS-CoV-2 spike protein variant
D614G mutation alters SARS-CoV-2 spike conformation and enhances protease cleavage at the S1/S2 junction
Spike mutation D614G alters SARS-CoV-2 fitness

The authors declare competing interests. CS and IB report shareholding in Halo Therapeutics Ltd related to this Correspondence.