key: cord-0903778-19i22dsk authors: Chaturvedi, Parth; Han, Yanxiao; Král, Petr; Vuković, Lela title: Adaptive Evolution of Peptide Inhibitors for Mutating SARS‐CoV‐2 date: 2020-10-08 journal: Adv Theory Simul DOI: 10.1002/adts.202000156 sha: 1fe2bfc43877027d04a785f35e05aeed77166eb4 doc_id: 903778 cord_uid: 19i22dsk The SARS‐CoV‐2 virus is currently causing a worldwide pandemic with dramatic societal consequences for the humankind. In the past decades, disease outbreaks due to such zoonotic pathogens have appeared with an accelerated rate, which calls for an urgent development of adaptive (smart) therapeutics. Here, a computational strategy is developed to adaptively evolve peptides that could selectively inhibit mutating S protein receptor binding domains (RBDs) of different SARS‐CoV‐2 viral strains from binding to their human host receptor, angiotensin‐converting enzyme 2 (ACE2). Starting from suitable peptide templates, based on selected ACE2 segments (natural RBD binder), the templates are gradually modified by random mutations, while retaining those mutations that maximize their RBD‐binding free energies. In this adaptive evolution, atomistic molecular dynamics simulations of the template‐RBD complexes are iteratively perturbed by the peptide mutations, which are retained under favorable Monte Carlo decisions. The computational search will provide libraries of optimized therapeutics capable of reducing the SARS‐CoV‐2 infection on a global scale. The very fast spreading of a severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in the human population has already led to hundreds of thousands of fatalities and dire socioeconomic effects worldwide. Therefore, many concepts and strategies were initiated to identify drug targets, [1] develop effective therapeutics and vaccines against SARS-CoV-2, [2, 3] including the optional use of therapeutics developed for other purposes. [4] [5] [6] [7] Some of the strategies have focused on direct targeting of the spike (S) protein, [8, 9] located on the outer surface of the SARS-CoV-2 virion, which initiates the cell entry process. This process Figure 1 . Mutations of SARS-CoV-2. a) Time-dependent mutation tree of SARS-CoV-2 colored according to geographical regions of origin (through June 2020). [21] b) Twenty-five single mutations identified in RBD of the S protein. c) Five amino acid mutations on RBD in contact with the ACE2 receptor. d) Binding free energies are evaluated with the MMGB-SA method for the ACE2-RBD complexes, including the originally reported RBD (wild type, labeled as WT) and the five mutant RBDs listed in panel (c). amino acids that form a part of the ACE2-binding surface (A475V, G476S, S477I, V483A, and V503F), as highlighted in Figure 1c . In Figure 1d , we present the RBD-ACE2 host-receptor binding free energies, ΔG MMGB-SA , obtained in the presence of these five mutations. The 5 RBD:ACE2 complexes are simulated for 30 ns, and their average binding energies are obtained in the last 15 ns. The originally reported RBD and the S477I RBD have the strongest binding to the human ACE2, ΔG MMGB-SA ≈ −50 kcal mol −1 , while the other systems bind with ΔG MMGB-SA ≈ −(40-50) kcal mol −1 . In order to block all these RBD variants, the peptide inhibitors should have comparable or lower ΔG MMGB-SA values. Two ACE2-based peptide structures, shown in Figure 2a , are selected as templates for the first generation peptide inhibitors of the S protein. [8] The smaller template-1 includes single truncated 1 helix of ACE2 (amino acids 21-43), and the larger template-2 includes two 1 2 helices of ACE2 (amino acids 19-83). In the adaptive evolution search for optimized therapeutics, the selected ACE2-extracted peptide templates will be gradually mutated to increase their binding strength to RBD. In recent mutagenesis experiments, the whole ACE2 with single mutations in regions directly contacting RBD were examined for their binding to the original S protein. [22] To perform preliminary testing of our adaptive evolution search of peptide therapeutics, we first selected the most fit mutants from these experiments, and implemented their mutations in our templates-1,2. We simulated 22 peptides, that is, the original templates and their 10 single mutants, complexed with the original S protein RBD. Their free energies of binding, ΔG MMGB-SA , were evaluated in 100 ns simulations and presented in Figure 2b . Template-1 binds to RBD with ΔG MMGB-SA ≈ −19 kcal mol −1 , while its mutants have higher affinities giving ΔG MMGB-SA ≈ −(24-35) kcal mol −1 . In all cases, template-1 significantly changes its conformation in the bound configuration, as the helix loses the curvature observed when within ACE2, and the hydrogen bonding between Glu35 (template-1) and Gln493 (RBD), enabled by the helix curvature, is broken. In contrast, template-2 has more direct contacts with RBD than the shorter template-1 variants, so it binds to it more strongly, ΔG MMGB-SA ≈ −36 kcal mol −1 . However, only two template-2 mutants (H34A and K31W) have higher affinities to RBD than the original template-2, having ΔG MMGB-SA ≈ −45 kcal mol −1 . These simulations also revealed that peptides with a stronger binding covered larger RBD sections ( Figure S1 , Supporting Information), and reduced the RBD exposure to other potential binding partners. These results show that the experimental results obtained for mutated ACE2 [22] (Figure S2 , Supporting Information) can provide a good guidance in the mutation of template-1, but the same mutations are less effective in the larger template-2. The above results have clearly demonstrated that suitable peptide templates with appropriate mutations can acquire a strong binding to specific targets. To optimize such peptides Modeling of peptide-RBD complexes. a) Complexes of S protein RBD (blue) and two peptide templates (red). Locations of five S protein mutations examined in the present work are marked by blue spheres. Amino acid residues changed in singly mutated peptides are marked by yellow spheres. b) Free energies of binding, ΔG MMGB-SA , between the originally reported S protein RBD and the wild type or singly mutated ACE2-based peptides. Locations of mutated peptide amino acids are highlighted in panel a. c) Snapshots of initial and optimized template-1 peptides binding to the original RBD. The sequence of the optimized peptide was obtained after 100 mutation attempts, with 10 ns long MD simulation after each mutation. The final peptide with the optimized sequence was further relaxed in a 175 ns MD simulation. The initial peptide is shown as a red helix, with amino acids that are subsequently mutated shown in thin faded yellow licorice. The optimized peptide is shown as an orange helix, with mutated amino acids shown in thick yellow licorice. d) Adaptive evolution of template-1. The plot shows the binding free energies, ΔG MMGB-SA , between the peptide and the original RBD, presented as a function of the performed mutation, where the peptide:RBD complexes are relaxed for 10 ns after each mutation attempt. e) The time evolution of ΔG MMGB-SA between the final optimized peptide and the original RBD. The average value, obtained from the last 75 ns of the trajectory (gray), is ΔG MMGB-SA = −57 kcal mol −1 . The faded green line shows the data points calculated every 0.1 ns, and the dark green line shows the running average. f) Initial and optimized sequences of template-1 peptides. The final peptides were optimized for binding to the original and mutant RBDs, with peptide-RBD complexes relaxed in 10 ns MD simulations after each attempted mutation. against specific viral strains, we have developed combined mutation/selection (evolution) computational algorithms which can guide a multi-step adaptive evolution of the peptides: 1) Random mutations are introduced into random positions of the peptide templates. 2) Short MD simulation trajectory of the mutated-peptide:RBD complex are run and followed by a selection or rejection of the mutation via Monte Carlo (MC) sampling using a Metropolis criterion applied to the change of the free energy of peptide-RBD binding, ΔG MMGB-SA (Section 2). 3) The mutation/selection process is iteratively repeated until the binding affinity of peptides to the target S protein RBD is satisfactory (Section 2). 4) Additional evolution of the molecules might be considered after the MC decisions to allow for a better internal relaxation of the molecules. Due to a partly stochastic nature of MD simulations, the randomness in mutations, and the MC selection, different peptides can be obtained in separate trajectories that correspond to separate local minima of the free energy surface. These peptides form a pool (ensemble) of potential therapeutics evolved for a selected viral strain, which can be further enriched by considering multiple viral strains. In Figures 2c-e, Figure S3 , Supporting Information, and Table S1 , Supporting Information, coupling of template-1 to RBD was optimized in the adaptive evolution, where 10 ns MD simulation trajectories were generated after each trial mutation of template-1. Of the 100 mutations attempted, 13 mutations were accepted, and 11 amino acids were changed (individual residues can be mutated more than once). Figure 2d reveals the progression of binding free energies with the mutations of template-1, starting from ΔG MMGB-SA = −19 kcal mol −1 . As detailed in Figure 2c , during the adaptive evolution, the mutating helical peptide lost its bending (this change is independent of mutations) and multiple initial contacts with RBD. At the same time, it shifted with respect to its initial position and formed many new contacts. Peptide residue E37 formed a salt bridge with the original RBD, while residues Q24, Y26 (mutated), Q30 (mutated), S41 (mutated), and R42 (mutated) formed hydrogen bonds of varying stability with the original RBD. The resulting peptide bound to RBD with ΔG MMGB-SA = −70 kcal mol −1 at the end of thirteen 10 nslong trajectories (associated with individual accepted mutations). As shown in Figure 2e , additional 175 ns relaxation of this peptide resulted in a slightly less favorable ΔG MMGB-SA ≈ −57 kcal mol −1 . Therefore, adaptive evolution requires sufficiently long relaxation times for a good stabilization of the whole system. Short relaxation times may result in incomplete peptide adjustments and free energies that can be misleadingly favorable. Moreover, a faster MC convergence could be achieved by considering the whole free energy changes rather than the peptide-RBD binding free energies. However, internal reorganizations of molecules inevitably take part in long trajectories, so the difference in binding energies alone might be sufficient for the MC decision, as long as additional relaxation is allowed between individual MC steps (point 4 in the method). Next, we adaptively evolved template-1 coupled to three separate singly-mutated RBDs, chosen from Figure 1d . For simplicity, 100 mutations were attempted, followed by 10 ns simulations after each attempt. The adaptive evolution gave peptides with ΔG MMGB-SA ≈ −(45 to 70) kcal mol −1 , as summarized in Figure 3a-c and Figure 2f . Peptides targeting A475V and G476S RBDs each had five accepted mutations, respectively, while peptides targeting S477I RBD had 19 accepted mutations. In the A475V RBD case, one of the early accepted mutations lead to breaking of the helix secondary structure, and thus to a different peptide-RBD binding mode. This shows that individual alpha helices without additional stabilization, such as by side branching, [23] might be too simplistic therapeutics. In Figure 3d , the adaptive evolution was performed with a more stable template-2 ( 1 2 ), but random mutations were only introduced into the 1 helix, which was in direct contact with the original RBD. After 12 accepted mutations and 10 changed amino acids (listed in Table S2 , Supporting Information), the binding strength increased from ΔG MMGB-SA = −36 kcal mol −1 to −60 kcal mol −1 . Therefore, the adaptively evolved template-2 can compete with the whole ACE2, having ΔG MMGB-SA ≈ −50 kcal mol −1 (Figure 1c ). Both the initial and optimized template-2 preserve the curvature of the 1 helix, despite the mutation of E35 (to Y35), which is observed to interact with Q493 (RBD). This feature preserves the binding pattern observed in ACE2:S protein RBD complex. [1, 24] However, the salt bridge between D30 (initial peptide) and K417 (RBD) (shown in Figure S4 , Supporting Information for template-2 optimized with 1 ns simulations after mutations) is lost in the peptide optimized with 10 ns simulations after mutations. In summary, we have demonstrated that ACE2-based peptide templates can be adaptively evolved by computation using mutation/selection processes to form optimized inhibitors for a strong and competitive S protein RBD binding. The developed approach can be used to design peptide inhibitors based on templates extracted from different ACE2 polymorphs, including those from other species, [25] and other proteins binding to viral pathogens. The optimized inhibitors obtained in different evolution runs can be collected to form libraries of suitable therapeutics for different RBD variants. Cocktails (ensembles) of peptide therapeutics could be delivered by different means to provide a broadspectrum protection against different SARS-CoV-2 strains. The simulated peptides template-1 (amino acids 21-43 of ACE2) and template-2 (amino acids 19-83 of ACE2) were separately bound to the S protein RBD. Template-1 size is typical of peptides that can be chemically synthesized easily and quickly for experimental testing. Since the simulations and ref. [8] demonstrated that the extracted ACE2-based helix of template-1 loses the curvature and shape complementarity to the RBD binding site, a larger template-2 was also selected for sequence optimization. Template-2 contains two helices that can preserve helix curvature and shape complementarity to the RBD binding site. All structures were directly based on the crystal structure of the human ACE2 protein bound to the SARS-CoV-2 Spike protein RBD (pdbID: 6LZG). [10] The mutations in peptides and RBD were introduced using the psfgen plugin in VMD. [26] The systems were simulated using NAMD2.13 [27] and the CHARMM36 protein force field. [28] The simulations were conducted with the Langevin dynamics ( Lang = 1 ps −1 ) in the NpT ensemble, at a temperature of T = 310 K and a pressure of p = 1 bar. The particle-mesh Ewald (PME) method was used to evaluate Coulomb coupling, with periodic boundary conditions applied. [29] The time step was set to 2 fs. The long range van der Waals and Coulombic coupling were evaluated every 1 and 2 time steps, respectively. After 2000 steps of minimization, the solvent molecules were equilibrated for 1 ns, while the complexes were restrained using harmonic forces with a spring constant of 1 kcal (mol Å) −1 . Next, the systems were equilibrated in 100 ns production MD runs with no restraints. MMGB-SA Calculations: The molecular mechanics generalized Born-surface area (MMGB-SA) method [30, 31] was used to estimate the relative binding free energies between peptides (or ACE2) and RBDs. The free energies were estimated from separate MMGB-SA calculations for three systems (peptide/ACE2, RBD, and the whole complex) in configurations extracted from the MD trajectories of the whole complex in the explicit solvent. The MMGB-SA free energies of the extracted configurations of the three systems were calculated as where E MM , G solv-p , G solv-np , and ΔS conf are the sum of bonded and Lennard-Jones energy terms, the polar contribution to the solvation energy, the nonpolar contribution, and the conformational entropy, respectively. The E MM , G solv-p , and G solv-np terms were calculated using the NAMD 2.13 package [27] generalized Born implicit solvent model, [32] with a dielectric constant of the solvent of 78.5. The G solv-np term for each system configuration was calculated in NAMD as a linear function of the solvent-accessible surface area (SASA), determined using a probe radius of 1.4 Å, as G solv-np = SASA, where = 0.00542 kcal mol −1 Å −2 is the surface tension. The ΔS conf term was neglected, since the entropic contribution differences nearly cancel when considering protein-protein binding of single residue mutants. [33, 34] Moreover, the entropy term, which is often calculated with a large computational cost and low prediction accuracy, is likely to be similar for the studied systems, which differ in single or several mutations. Since the G tot values are obtained for peptide configurations extracted from the trajectories of complexes, G tot does not include the free energies of peptide reorganization; the correct free energies of binding should consider configurations of separately relaxed systems. The approximate binding free energies of the studied complexes were calculated as ⟨ΔG MMGB-SA ⟩ = ⟨G tot (P∕ACE2 − RBD) − G tot (P∕ACE2) − G tot (RBD)⟩, where P/ACE2 represents peptides or complete ACE2, and the ⟨ averaging ⟩ is performed over configurations within the second half of the calculated trajectories. Adaptive Evolution Algorithm: A mutation/selection algorithm was developed and used to iteratively increase the affinity of binding between peptide templates and the S protein RBD. The algorithm involved sequences of steps combining molecular dynamics simulations and MC decisions using the Metropolis criterion, [35, 36] which resulted in MC sampling of the peptide sequence space. Initially, the selected peptide template (directly extracted from ACE2) complexed with RBD was equilibrated for 100 ns. Then, the free energy of binding of the peptide:RBD complex, ΔG before MMGB-SA , was evaluated. Next, a random mutation was introduced at a random position in the peptide, followed by a short 1-10 ns equilibration in MD simulations of the complex, and evaluation of the ΔG after MMGB-SA free energy of binding of the complex. Finally, the mutation was accepted if where r is a random number in the (0,1) interval. If the mutation is accepted, then the new peptide becomes the new reference peptide and its ΔG MMGB-SA becomes the reference value ΔG before MMGB-SA for the next attempted mutation. In each run of the algorithm, 100-200 mutations were attempted on peptide templates, as stated in the results. Proc. Natl. Acad. Sci. USA 2020