key: cord-0681880-5ednhp6q
authors: Morere, Jeremy; Hognon, Cécilia; Miclot, Tom; Jiang, Tao; Dumont, Elise; Barone, Giampaolo; Bignon, Emmanuelle; Monari, Antonio
title: How fragile we are. Influence of STimulator of INterferon Genes, STING, variants on pathogen recognition and immune response efficiency
date: 2021-07-12
journal: bioRxiv
DOI: 10.1101/2021.07.12.452045
sha: 029e1359e40c87909b9363f5e1b2e8d029410230
doc_id: 681880
cord_uid: 5ednhp6q

The STimulator of INterferon Genes (STING) protein is a cornerstone of the human immune response. Its activation by cGAMP in the presence of cytosolic DNA stimulates the production of type I interferons and inflammatory cytokines, which are crucial for protecting cells from infections. The STING signaling pathway can also influence both tumor-suppressive and tumor-promoting mechanisms, making it an appealing target for drug design. In the human population, several STING variants exist and exhibit dramatic differences in their activity, impacting the efficiency of the host defense against infections. Understanding the differential molecular mechanisms exhibited by these variants is of utmost importance, notably for personalized medicine treatments against diseases such as viral infections (COVID-19, Dengue…), cancers, or auto-inflammatory diseases. Using microsecond-scale molecular modeling simulations, post-processed by contact analysis and Machine Learning techniques, we reveal the dynamical behavior of four STING variants (wild type, G230A, R293Q, and G230A-R293Q) and rationalize the variability of efficiency observed experimentally. Our results show that the decrease of STING activity is linked to a stiffening of key structural features of the binding cavity, together with changes of the interaction patterns within the protein.

The defenses of evolved organisms, including humans, against pathogenic infection rely on finely tuned biological machineries involving several cellular signaling mechanisms.
The cyclic Guanosine MonoPhosphate-Adenosine MonoPhosphate Synthase-STimulator of INterferon Genes (cGAS-STING) pathway is a key player, acting as a cytosolic DNA or RNA probe. After sensing the presence of exogenous genetic material, it triggers the immune response through the production of type I interferon and cytokines. 1, 2 Indeed, the recognition of aberrant nucleic acid fragments in the cellular cytosol, such as those secreted by bacteria or resulting from viral infection, stimulates the cGAS enzyme, which produces cyclic guanosine adenosine monophosphate (cGAMP). Subsequently, cGAMP is sequestered by STING, inducing its activation and the final production of type I interferon and pro-inflammatory cytokines. These processes also promote downstream inflammatory signaling for the protection of uninfected cells and stimulate the adaptive immune response. 3 As a consequence, STING is known to play a crucial and sometimes contrasting role in different biological responses, including antiviral defense, 3,4 the mediation of tumor-suppressive and tumor-promoting mechanisms, 5-7 autophagy, 8, 9 skin wound healing 10 and the development of auto-inflammatory diseases. 11, 12 Its delayed activation might also be involved in severe COVID-19 outcomes. 13, 14 In this context, it has also been underlined that the overstimulation of the cGAS-STING pathway leads to an inflammatory-like cytokine response, which is also strongly correlated with severe forms of the SARS-CoV-2 infection. Hence, the modulation of the cGAS-STING pathway and the related regulatory proteins provides suitable targets for the development of a wide variety of potential anticancer, anti-pathogen, and anti-inflammatory drugs and vaccines. 15-18 As a matter of fact, the structure of the human STING protein has been entirely resolved, and mechanistic hypotheses about its activation have been sketched so far.
19-21 STING is a transmembrane protein, which is mainly localized in the endoplasmic reticulum (ER) and is composed of two equivalent monomers. From a structural point of view, one can distinguish an N-terminal transmembrane domain, having a high density of α-helices, a cytoplasm-exposed C-terminal domain, containing the cGAMP binding site, and a short linker region connecting the two domains -see Figure 1A. Upon recognition and binding of cGAMP, the C-terminal domain undergoes an important structural reorganization, which ultimately results in the polymerization of different STING monomers, hence in the activation of the immune response. The cGAMP binding site consists of a pocket in the C-terminal domain which is surrounded by overhanging tails (lid regions), forming flexible random coils in the apo form that stiffen into β-sheets in the presence of the ligand. Experimental and theoretical studies have stressed the importance of Arg232, Arg238, Tyr167, Ser241, Thr263, and Thr267 for stabilizing the ligand within the cavity. 20, 22 From a biochemical point of view, STING polymerization, crucial for its full activation, takes place through the exposure of two cysteine residues in the linker domain, which leads to the formation of a disulfide bond bridging the two monomers. The dimerization efficiency strongly depends on the solvent accessibility of these residues, which are embedded in a rather flexible protein region that may nonetheless assume an α-helix arrangement. Their solvent exposure is also modulated by the shielding effects caused by two disordered C-terminal tails, whose conformation can be strongly affected by the ligand-induced structural transition, further justifying the cGAMP-induced activation. 24 Indeed, these combined differences induce variability in exogenous DNA or RNA sensing and consequently in the response to pathogen infections.
Notably, the HAQ and R232H genotypes are associated with poor outcome in patients suffering from cervical cancer. 25 Individuals carrying the HAQ polymorphism are more likely to contract Legionnaires' disease, 26 are probably more susceptible to infections, and are less responsive to DNA vaccines. 27 Interestingly, the loss of function of the HAQ variant might be mostly attributed to the R71H and R293Q substitutions, while the G230A polymorphism would help maintain a partial response to bacterial cyclic dinucleotides. 28 On the contrary, the R293Q substitution might provide enhanced protection against aging-associated diseases. 29 However, gain-of-function variants might also contribute to the development of auto-inflammatory diseases. On the basis of all these considerations, we used state-of-the-art all-atom molecular dy-

As mentioned in the introduction, we have chosen a reduced model involving only the C-terminal domain of STING and the disordered linker region. Hence, we have excluded the whole N-terminal transmembrane domain from our model. Although this choice is a drastic simplification, it yields a reduced-size system for which the statistical sampling will be deeper. It also allows us to concentrate on the effects of cGAMP binding and of the mutations, while neglecting the transmembrane effects. It is worth mentioning, though, that while our model is suited to explore the rigidification of STING and the differential effects of the mutants, it lacks an important element, i.e. the N-terminal disordered tails that protrude outside the lipid membrane and may interact with the linker region, contributing to the modulation of the accessibility of the dimerization site. While we are aware of some of the biases induced by our choice, it is important to underline that the disordered chains could be included only in the presence of the full system, hence limiting the statistical sampling.
Furthermore, their disordered nature would be particularly challenging to capture with conventional force fields, which could lead to nonphysical over-structuring of those domains. We should also stress once more that in this contribution we mainly want to understand the effects of the STING mutations on the binding capability and on the linker domain structure.

For the apo systems, the starting structures were generated based on the C-terminal and linker domains from the cryoEM structure of the full-length human STING (PDB ID 6NT5 19 ). The missing loops were reconstructed using SwissModel. 36 The starting models for the systems with cGAMP were generated by homology to the cryoEM structure of the chicken STING harboring cGAMP (PDB ID 6NT7 19 ), again with SwissModel. From these starting structures, the variants were built by mutating in silico residues 230 and/or 293. Force field parameters for the cGAMP ligand were generated using the antechamber module of AMBER18 37 for the derivation of RESP charges 38 and the attribution of GAFF parameters 39 -see parameters in SI. Standard STING residues were modeled using the ff14SB Amber force field. 40 The system was soaked in a cubic TIP3P water box with a 15 Å buffer, and potassium counter-ions were added to ensure a neutral total charge, resulting in systems of ∼135,000 atoms. MD simulations were carried out using NAMD3 41 for the dominant genotype of the human STING (wild type, WT) in its apo form and in the cGAMP-bound state. The Hydrogen Mass Repartitioning (HMR) method was used to allow a 4 fs time step for the integration of the equations of motion. To prepare the system, 10,000 minimization steps were first performed, imposing positional constraints on the protein backbone. The minimization was followed by a 12 ns equilibration at 300 K, during which the constraints were progressively released.
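As an aside on the Hydrogen Mass Repartitioning step mentioned above, the idea can be illustrated with a minimal Python sketch: mass is moved from each heavy atom to the hydrogens bonded to it, so that the total mass is unchanged while the fastest vibrations are slowed down. The target hydrogen mass (3.024 amu), the function name `repartition`, and the toy methane topology are assumptions made for illustration only; the text does not state the repartitioning factor that was actually used.

```python
# Illustrative sketch of hydrogen mass repartitioning (HMR) on a toy molecule.
# H_TARGET_MASS is an assumed value (3 x 1.008 amu); the paper does not give it.
H_TARGET_MASS = 3.024


def repartition(masses, elements, bonds, h_target=H_TARGET_MASS):
    """Return new masses where each hydrogen weighs h_target and the added
    mass is removed from the heavy atom it is bonded to."""
    new = list(masses)
    for i, j in bonds:
        # Find which partner of the bond is the hydrogen, if any.
        if elements[i] == "H" and elements[j] != "H":
            h, heavy = i, j
        elif elements[j] == "H" and elements[i] != "H":
            h, heavy = j, i
        else:
            continue  # heavy-heavy (or H-H) bond: nothing to transfer
        delta = h_target - masses[h]
        new[h] += delta
        new[heavy] -= delta
    return new


# Toy methane (hypothetical input): one carbon bonded to four hydrogens.
elements = ["C", "H", "H", "H", "H"]
masses = [12.011, 1.008, 1.008, 1.008, 1.008]
bonds = [(0, 1), (0, 2), (0, 3), (0, 4)]
new_masses = repartition(masses, elements, bonds)
```

Because the total mass of each bonded group is conserved, equilibrium thermodynamic properties are expected to be preserved while a longer integration time step becomes usable.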
The temperature was kept constant using the Langevin thermostat with a 1.0 ps⁻¹ collision frequency, and electrostatic interactions were treated using the Particle Mesh Ewald (PME) protocol. 42 After equilibration, the conformational ensemble was sampled along a 500 ns production run and structures were saved every 40 ps. The same protocol was used for sampling three mutated states, involving the G230A (A-STING), the R293Q (Q-STING) and the G230A/R293Q (AQ-STING) mutations, respectively. The starting protein structures were built manually by performing the point mutations from the WT system. Note that, within the limits of our truncated system, AQ-STING can be considered representative of the widespread, loss-of-function-inducing HAQ genotype.

The cpptraj module of AMBER18 37 was used to calculate distances, angles and root mean square deviations (RMSD) and to perform the clustering analysis. The latter was carried out according to deviations of the protein backbone, and structures were clustered into 5 groups. The opening angle of the protein was computed as the angle involving the center of mass of the S162 residues lying at the bottom of the binding cavity, and the residues forming the β-sheet of the upper lobe of each STING monomer. The propensity of arginines to dive into the cavity was computed with respect to their distance to the center of mass of the S162 residues. Changes in contacts upon mutation of STING were computed using the GetContacts software (https://getcontacts.github.io/). Frequencies of contacts were calculated for each pair of residues, and the most different patterns (75% threshold) identified among the STING variants were plotted as heatmaps using the ggplot2 package of R. 43 Representations of the STING structure and projections of the contact perturbations were rendered with VMD. 44

The perturbation of the free energy of binding upon mutation was assessed by Thermodynamic Integration.
The soft core potential method was used to progressively and alchemically mutate G230 to A or R293 to Q. The system being dimeric, each polymorphism implies two mutations in the system. To deal with this, we computed the ∆G of binding by evaluating the thermodynamic cycle for one mutation on the first monomer, then a second thermodynamic cycle adding the mutation on the second monomer. Free energy calculations on the AQ double mutant were carried out starting from the A system, in two steps as well. A 10,000-step minimization, a 60 ps thermalization and 1 ns production runs were performed with pmemd 37 along 11 windows with lambda values varying from 0.0 to 1.0, and the convergence was further checked.

Molecular dynamics simulation can provide important insight into the chemical and physical behavior of proteins; however, the high dimensionality of the data obtained sometimes makes it difficult to grasp the essence of the behavior of

Globally, the binding of cGAMP induces a very stable interaction network in the binding pocket. The latter involves, in addition to the R238 cation-π interactions already described, Y167 π-stacking with the cGAMP purines. Furthermore, hydrogen bonds between R238 and the ligand's phosphates and between the nucleobases and E260 and T263 also emerge -see Figure 2. These residues have been previously proposed to take part in cGAMP recognition, 46 and we also retrieve the previously reported amino acid T267 in the second sphere of interaction, together with Y163 and Y240 -see Figure S3. The R232 residues invoked in the literature 22 are instead located on the external face of the lid and stay relatively far from cGAMP all along the trajectory in the WT but, as will be discussed in the following, they play a more important role in the A and AQ variants.
Interestingly, the S162 residue of both monomers, which lies at the bottom of the cavity and whose mutation to T or A destabilizes the cGAMP:STING complex, 46 is also involved in the second sphere of interaction in our simulations. In order to further probe the perturbation of the physical and structural properties resulting from the binding of cGAMP, we used a Machine Learning protocol based on principal component analysis (PCA) to post-process the MD trajectory and determine the flexibility profile of the protein. We have successfully used this methodology on other DNA and protein systems. 47-49 The comparison of the WT STING flexibility profile in the apo and cGAMP-bound states underlines the stiffening of the lid region (residues 225-240), coupled to its structuring into a stable β-sheet as a result of the arginines diving towards the ligand -see Figure 2. Interestingly, one can instead distinguish an enhanced flexibility in the cytosol-transmembrane linker region, opposite to the binding pocket and harboring the cysteine residues that are involved in the disulfide bridge formation leading to the subsequent multimerization and to STING activation. The assessment of the cysteine exposure to the solvent also shows an increase in the number of water molecules around these residues in the bound state, suggesting that the residues are more accessible, hence more prone to encounter their reactive partners -see Figure S4. Nevertheless, as we used a truncated model in our simulations, this conclusion should be taken with some caution, since we cannot fully conclude that the same behavior would occur in the full-length structure. Yet it stands as an interesting hypothesis on the allosteric regulation of STING activation that deserves to be investigated in further studies.
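The cysteine solvent exposure discussed above is assessed through the number of water molecules around the residues. A minimal Python sketch of such a count is given below, assuming that a water is counted when its oxygen lies within a fixed distance of the cysteine sulfur. The 3.5 Å cutoff, the function names and the toy coordinates are illustrative assumptions and are not taken from the paper.

```python
import math


def count_waters_near(site, water_oxygens, cutoff=3.5):
    """Count water oxygens within `cutoff` (in Å, assumed value) of `site`."""
    return sum(1 for oxygen in water_oxygens if math.dist(site, oxygen) <= cutoff)


def mean_hydration(frames, cutoff=3.5):
    """Average the per-frame water count over a trajectory.

    `frames` is a list of (site_xyz, list_of_water_oxygen_xyz) tuples.
    """
    counts = [count_waters_near(site, waters, cutoff) for site, waters in frames]
    return sum(counts) / len(counts)


# Two hypothetical frames: a cysteine sulfur at the origin and a few waters.
frames = [
    ((0.0, 0.0, 0.0), [(1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (5.0, 0.0, 0.0)]),
    ((0.0, 0.0, 0.0), [(0.0, 2.0, 0.0), (0.0, 0.0, 10.0)]),
]
average_waters = mean_hydration(frames)
```

Comparing this average between the apo and cGAMP-bound trajectories would give the kind of exposure difference described in the text; in practice the count would be run on coordinates extracted from the MD frames rather than on hand-written tuples.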
suggest that higher melting temperatures for the G230A cGAMP:STING complex might be related to a structural stabilization of the lid region, the ligand affinity of this variant being similar to the one observed for the WT. 46 In line with these observations, the binding free energy change upon G230A mutation predicted by Thermodynamic Integration calculations is only 1.20 ± 0.25 kcal/mol -see Table 1.

Table 1: Relative free energy of binding (∆∆G) in kcal/mol upon mutation of the WT STING into A, Q, and AQ models as computed by Thermodynamic Integration calculations.

Mutation       ∆∆G ± error (kcal/mol)
G230A          1.20 ± 0.25
R293Q          1.11 ± 0.65
G230A-R293Q    9.40 ± 0.68

By contrast, the strongly enhanced structuring of the lid region upon cGAMP binding is clearly shown by the flexibility profile -see Figure 4. The lid stiffening impacts the binding site organization around cGAMP. On the guanosine side of the ligand, the G230A mutation hampers the interaction of R238 with cGAMP. R238 is pushed further from the purine than in the WT, yet it is close enough to interact with the phosphate through hydrogen bonds. Interestingly, the nucleobase position is mainly maintained by π-stacking with Y167 and a hydrogen bond with E260. Contrary to the WT, R232 dives towards the ligand to persistently interact with the phosphate group. T263 also interacts with the latter, instead of with the nucleobase as in the WT structure. On the adenosine side of cGAMP, one retrieves the nucleobase π-stacking with Y167 and the hydrogen bond with T263, as well as the interaction between R238 and the phosphate, although R232 is again further from the nucleobase than in WT-STING, preventing cation-π interactions -see Figure 3. Altogether, the influence of the G230A mutation on STING function can be associated with the favoring of an open conformation in the apo state, which is ideal to ensure recognition and binding of cGAMP.
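For readers unfamiliar with how values such as those in Table 1 are obtained, the following Python sketch illustrates the general Thermodynamic Integration bookkeeping under stated assumptions: the ensemble average of dU/dλ is integrated over the 11 λ windows (here with a simple trapezoidal rule, which is an assumption, as the integration scheme is not specified in the text), and the relative binding free energy is the difference between the alchemical transformation in the complex and in the apo protein. All dU/dλ values below are synthetic placeholders, not data from the paper.

```python
def trapezoid(xs, ys):
    """Integrate y(x) with the trapezoidal rule."""
    return sum(
        0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)
    )


def ddg_binding(dudl_complex, dudl_apo, lambdas):
    """Relative binding free energy from one thermodynamic cycle:
    ddG = dG(mutation in complex) - dG(mutation in apo protein)."""
    return trapezoid(lambdas, dudl_complex) - trapezoid(lambdas, dudl_apo)


# 11 windows from lambda = 0.0 to 1.0, as in the protocol described in the text.
lambdas = [i / 10 for i in range(11)]

# Synthetic <dU/dlambda> profiles in kcal/mol (placeholders for illustration).
dudl_complex = [10.0 * lam + 2.0 for lam in lambdas]
dudl_apo = [10.0 * lam + 1.0 for lam in lambdas]

# First cycle: mutation on the first monomer.
step_1 = ddg_binding(dudl_complex, dudl_apo, lambdas)
# Second cycle: mutation added on the second monomer (same toy data reused).
step_2 = ddg_binding(dudl_complex, dudl_apo, lambdas)
# The two sequential cycles add up to the effect of the full polymorphism.
total_ddg = step_1 + step_2
```

With these linear toy profiles each cycle contributes 1.0 kcal/mol, so the two-step total is 2.0 kcal/mol; real profiles are not linear, which is why convergence with respect to the number of windows needs to be checked.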
Although the interaction network in the binding pocket is altered, the ligand is still stabilized, and the same increased flexibility and cysteine exposure in the linker region is observed -see Figure S4. Hence, one can conclude that the presence of the G230A mutation should favor global STING activation.

Unlike the previous case, the R293Q mutation is located further from the binding pocket. Here, the Q-STING binding site harbors cGAMP in a similar fashion to the WT -see Figure S7. Both R238 residues interact through cation-π interactions and hydrogen bonds with the nucleobases and the phosphates of cGAMP. E260, Y167 and T263 also participate in the interaction network within the cavity, and the R232 residues remain in the second sphere of interaction, yet pointing towards the bulk -see Figure S3. Interestingly, in the apo state, both R238 residues form very stable hydrogen bonds with E260 on the facing α-helices, which promotes a closed conformation characterized by an opening angle of 116.0 ± 0.2°. The more moderate opening of the binding pocket might disfavor cGAMP access and hence its recognition -see Figure 5. Besides, the contact map underlines, for Q-STING, much more frequent interactions within the lid region itself, but also with the surroundings, spanning residues 225 to 240 in both monomers. Interestingly, these contacts are more pronounced than for the other variants and exhibit a contrasting pattern compared to WT-STING -see Figure S5. Conversely, cGAMP binding still induces a considerable enhancement of the flexibility of the linker regions, hence a more solvent-exposed cysteine. As a consequence, and in particular due to the promotion of a more closed conformation in the apo state, which may strongly perturb the recognition of cGAMP, this mutation should be correlated with a loss of activity of STING.
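The opening angle quoted above follows the definition given in the methods. One possible reading of that definition, sketched below in Python, places the vertex of the angle at the center of mass of the two S162 residues and draws the two arms towards the centers of mass of the upper-lobe β-sheet of each monomer. This interpretation, the function names and the toy coordinates are assumptions for illustration; the actual analysis was performed with cpptraj.

```python
import math


def center_of_mass(coords, masses):
    """Mass-weighted average position of a set of atoms."""
    total = sum(masses)
    return tuple(
        sum(m * c[k] for c, m in zip(coords, masses)) / total for k in range(3)
    )


def angle_deg(vertex, a, b):
    """Angle a-vertex-b in degrees."""
    va = [a[k] - vertex[k] for k in range(3)]
    vb = [b[k] - vertex[k] for k in range(3)]
    dot = sum(x * y for x, y in zip(va, vb))
    norm_a = math.sqrt(sum(x * x for x in va))
    norm_b = math.sqrt(sum(x * x for x in vb))
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_angle))


# Hypothetical single-frame coordinates (Å) with equal masses for simplicity.
s162_atoms = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]   # S162 of both monomers
lobe_a_atoms = [(-4.0, 5.0, 0.0), (-6.0, 5.0, 0.0)]  # upper-lobe sheet, monomer A
lobe_b_atoms = [(4.0, 5.0, 0.0), (6.0, 5.0, 0.0)]    # upper-lobe sheet, monomer B

vertex = center_of_mass(s162_atoms, [1.0, 1.0])
arm_a = center_of_mass(lobe_a_atoms, [1.0, 1.0])
arm_b = center_of_mass(lobe_b_atoms, [1.0, 1.0])
opening_angle = angle_deg(vertex, arm_a, arm_b)
```

Averaging this quantity over all frames of a trajectory, with its statistical error, would give a value comparable in form to the one reported for the apo Q-STING.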
The AQ model, which may be directly related to the loss-of-function HAQ STING genotype, presents an organization of its binding site much like that of the A variant. In the apo form, Y167 is close to E260 and T263, while R238 remains in the cavity and R232 is in the bulk. Upon ligand binding, R232 dives towards the cavity and interacts with the phosphate -see Figure 6. Contrary to the A variant, the π-stacking interaction with the guanosine is not disrupted, and R238 does not enter the cavity but rather interacts on the side. We retrieve here the interaction between cGAMP and E260, Y167 and T263. Importantly however,

As concerns the role of mutations, contrasting effects have been evidenced depending on the specific mutation. However, we may recognize a remodeling of the protein rigidity

STING is an endoplasmic reticulum adaptor that facilitates innate immune signalling
Cyclic GMP-AMP synthase is a cytosolic DNA sensor that activates the type I interferon pathway
STING: infection, inflammation and cancer
AMP-activated kinase (AMPK) promotes innate immunity and antiviral defense through modulation of stimulator of interferon genes (STING) signaling
STING activation of tumor endothelial cells initiates spontaneous and therapeutic antitumor immunity
Activating cGAS-STING pathway for the optimal effect of cancer immunotherapy
DNA sensing by the cGAS-STING pathway in health and disease
STING directly activates autophagy to tune the innate immune response
The Evolution of STING Signaling and Its Involvement in Cancer
Activation of STING signaling accelerates skin wound healing
Activated STING in a vascular and pulmonary syndrome
STING activation by translocation from the ER is associated with infection and autoinflammatory disease
COVID-19 as a STING disorder with delayed over-secretion of interferon-beta
Lymphocyte changes in severe COVID-19: delayed over-activation of STING?
Antitumor activity of a systemic STING-activating non-nucleotide cGAMP mimetic
A diamidobenzimidazole STING agonist protects against SARS-CoV-2 infection
Prolonged activation of innate immune pathways by a polyvalent STING agonist
An orally available non-nucleotide STING agonist with antitumor activity
Cryo-EM structures of STING reveal its mechanism of activation by cyclic GMP-AMP
STING polymer structure reveals mechanisms for activation, hyperactivation, and inhibition
Structure of STING bound to cyclic di-GMP reveals the mechanism of cyclic dinucleotide recognition by the immune system
Distinct Dynamic and Conformational Features of Human STING in Response to 2′3′-cGAMP and c-di-GMP
TMEM173 variants and potential importance to human biology and disease
A novel STING1 variant causes a recessive form of STING-associated vasculopathy with onset in infancy (SAVI)
Association of homozygous variants of STING1 with outcome in human cervical cancer
The common HAQ STING variant impairs cGAS-dependent antibacterial responses and is associated with susceptibility to Legionnaires' disease in humans
Identification and characterization of a loss-of-function human MPYS variant
Single nucleotide polymorphisms of human STING can affect innate immune response to cyclic dinucleotides
Puzianowska-Kuznicka, M.
STING SNP R293Q is associated with a decreased risk of aging-related diseases
Ligand Strain and Its Conformational Complexity Is a Major Factor in the Binding of Cyclic Dinucleotides to STING Protein
Dynamic structural differences between human and mouse STING lead to differing sensitivity to DMXAA
Single mutations reshape the structural correlation network of the DMXAA-human STING complex
Binding-pocket and lid-region substitutions render human STING sensitive to the species-specific drug DMXAA
CDNs-STING Interaction Mechanism Investigations and Instructions on Design of CDN-Derivatives
Computational study on new natural compound agonists of stimulator of interferon genes (STING)
SWISS-MODEL: homology modelling of protein structures and complexes
A well-behaved electrostatic potential based method using charge restraints for deriving atomic charges: the RESP model
Development and testing of a general amber force field
Simmerling, C. ff14SB: improving the accuracy of protein side chain and backbone parameters from ff99SB
Scalable molecular dynamics on CPU and GPU architectures with NAMD
A smooth particle mesh Ewald method
Elegant Graphics for Data Analysis
VMD: visual molecular dynamics
Molecular Insights from Conformational Ensembles via Machine Learning
Protein-Ligand Interactions in the STING Binding Site Probed by Rationally Designed Single-Point Mutations: Experiment and Theory
Nucleosomal embedding reshapes the dynamics of abasic sites
A Dynamic View of the Interaction of Histone Tails with Clustered Abasic Sites in a Nucleosome Core Particle
Recognition of a tandem lesion by DNA bacterial formamidopyrimidine glycosylases explored combining molecular dynamics and machine learning.
Computational and structural biotechnology journal
A novel transcript isoform of STING that sequesters cGAMP and dominantly inhibits innate nucleic acid sensing
TRIM32 protein modulates type I interferon induction and cellular antiviral response by targeting MITA/STING protein for K63-linked ubiquitination
Cyclic dinucleotides trigger ULK1 (ATG1) phosphorylation of STING to prevent sustained innate immune signaling
STING palmitoylation as a therapeutic target
Ploss, A. Species-specific disruption of STING-dependent antiviral cellular defenses by the Zika virus NS2B3 protease
DNA-induced 2′3′-cGAMP enhances haplotype-specific human STING cleavage by dengue protease

The authors are grateful to the GENCI HPC for computational resources on the IDRIS Jean