key: cord-342557-a7q8vp8m authors: Chowdhury, Surid Mohammad; Talukder, Shafi Ahmad; Khan, Akib Mahmud; Afrin, Nadia; Ali, Md Ackas; Islam, Rajib; Parves, Rimon; Al Mamun, Abdulla; Sufian, Md. Abu; Hossain, Md Nayeem; Hossain, Mohammed Akhter; Halim, Mohammad A. title: Antiviral Peptides as Promising Therapeutics against SARS-CoV-2 date: 2020-10-23 journal: J Phys Chem B DOI: 10.1021/acs.jpcb.0c05621 sha: doc_id: 342557 cord_uid: a7q8vp8m [Image: see text] Over 50 peptides, which were known to inhibit SARS-CoV-1, were computationally screened against the receptor-binding domain (RBD) of the spike protein of SARS-CoV-2. Based on the binding affinity and interaction, 15 peptides were selected, which showed higher affinity compared to the α-helix of the human ACE2 receptor. Molecular dynamics simulation demonstrated that two peptides, S2P25 and S2P26, were the most promising candidates, which could potentially block the entry of SARS-CoV-2. Tyr489 and Tyr505 residues present in the “finger-like” projections of the RBD were found to be critical for peptide interaction. Hydrogen bonding and hydrophobic interactions played important roles in prompting peptide–protein binding and interaction. Structure–activity relationship indicated that peptides containing aromatic (Tyr and Phe), nonpolar (Pro, Gly, Leu, and Ala), and polar (Asn, Gln, and Cys) residues were the most significant contributors. These findings can facilitate the rational design of selective peptide inhibitors targeting the spike protein of SARS-CoV-2. A new type of coronavirus was first detected in December 2019 at Wuhan city, the capital of the Hubei province of China. This virus is designated as severe acute respiratory syndrome-related coronavirus-2 (SARS-CoV-2). 1 The pneumonia-like disease caused by the virus is globally known as COVID-19. Apart from China, COVID-19 has spread to 213 countries and killed over 441,000 people in total as of today (17 June, 2020) . With the case count and death toll rising each day, there is an urgent need for antiviral drugs or vaccines against SARS-CoV-2. The SARS-CoV-2 is a positive-sense single-stranded RNA virus. It is a member of the same family belonging to SARS-CoV and Middle East respiratory syndrome (MERS-CoV). The SARS-CoV-2 virion has a diameter of 50−200 nm. 2 Like other coronaviruses, SARS-CoV-2 has four structural and many nonstructural proteins. The structural proteins are called spike (S), envelope (E), membrane (M), and nucleocapsid (N) proteins. S, E, and M proteins perform together to form the viral envelope. 3 The spike protein has a crown-like (corona) appearance. The spike (S) protein allows the virus to be attached into the host surface by interacting with human angiotensin-converting enzyme-2 (hACE2) receptors present in the upper and lower respiratory system. 4, 5 hACE2 receptors are expressed in many organs including the lung, small intestine, testis, and kidney. ACE2, which acts as an exopeptidase, catalyzes the conversion of angiotensin I to angiotensin I−IX and angiotensin II to angiotensin I−VII. 6−8 Cryo-electron microscopy analysis has indicated that unlike other coronaviruses, the S protein of SARS-CoV-2 has 10−20 times greater affinity to the hACE2 receptors, resulting in greater transmissibility than others. 9,10 Upon performing sequence alignment and homology modeling, it is evident that the S protein of SARS-CoV and SARS-CoV-2 share 76% sequence identity. 10, 11 The S protein comprises S1 and S2 domains. The S1 domain is responsible for binding to ACE2 receptors via its receptorbinding domain (RBD), whereas the S2 domain performs the fusion, enabling viral genome entry. 12 Electron microscope imaging revealed that the S glycoprotein forms a clove-shaped spike with three S1 heads and a S2 trimeric stalk. 13 The RBD has greater variability. Six amino acids (L455, F486, Q493, S494, N501, and Y505) in the RBD are extensively responsible for the efficient binding. 12 The S protein of SARS-CoV-2 is glycosylated containing 22 predicted N-glycosylation in the sequence, having one site less than the SARS-CoV-1 at N370. 13 The RBD of SARS-CoV-2 shares 72% sequence identity with that of SARS-CoV at the protein level. 13 The interaction between the S protein and the ACE2 receptor is the critical route of entry for the virus. Therefore, the S protein is a potential target for drug or vaccine development. Small molecules or peptides can be designed as therapeutics that will disrupt the interaction between the S protein and the ACE receptor; however, small molecules are not ideal for targeting large protein−protein interactions (PPIs). Peptides, on the other hand, can disrupt the PPIs effectively as they possess a larger surface compared to small molecules and thus specifically bind to the interface-binding region. 14 In this context, a team from the Massachusetts Institute of Technology (MIT) developed 23-mer peptide against the spike protein. 15 A research group from the University of Illinois at Chicago designed four ACE2-based peptide inhibitors of SARS-CoV-2. 2 While this early stage of peptide inhibitor development shows great promise, only few ACE-2-based peptides were screened and proposed. In this work, we, therefore, computationally screened 51 antiviral peptides that were known to work against SARS-CoV-1, targeting the RBD of the S protein of SARS-CoV-2. Peptides that showed higher S protein-binding affinity compared to the α-helix (AH) of the ACE2 peptidase were further analyzed with molecular dynamics (MD) simulation and the structure− activity relationship (SAR) in order to achieve a high-affinity binder for the S protein. ■ METHODS Molecular Docking. A total of 51 peptides were selected from the antiviral peptide database AVPdb, 16 which were experimentally verified to be effective against the SARS-CoV-1. All the peptides were modeled by the CABS-Fold. 17 The crystal structure 6M0J of the RBD was retrieved from the RCSB Protein Data Bank (PDB). 18 Peptides were docked to the RBD using Patchdock, 19 and initial 1000 peptide−RBD complexes obtained from PatchDock were then refined by FireDock. 20 Peptides were further docked using ClusPro 21 and HADDOCK 2.4, 22 with an aim to reach a consensus score. Peptides that exhibited better binding scores in all three docking modes were subsequently analyzed by MD simulation. MD Simulations. A 150 ns MD simulation was conducted for Apo-RBD, AH-RBD, S2P1-RBD, S2P3-RBD, S2P25-RBD, S2P26-RBD, S2P28-RBD, and S2P30-RBD complexes to evaluate the peptide−protein conformational dynamics and interaction. MD simulation was performed three times for each case. YASARA Dynamics software was used, and AMBER 14 force field was considered for all calculations. 23, 24 Water molecules (0.998 g/cm 3 density) were added, and the system was neutralized by adding NaCl salt at 0.9% concentration at 310 K temperature. The particle-mesh Ewald method 25 was used for long-range electrostatic interaction calculation. A Berendsen thermostat was used to control the simulation temperature. Periodic boundary condition was employed for performing the simulation, and the cell size was 20 Å larger than the protein−peptide complex in all cases. A simulated annealing method was used for the initial energy minimization process of each simulation system, using the steepest gradient approach (5000 cycles). A 1.25 fs time step was used for the overall simulations. Finally, 150 ns MD simulation was performed for each system, and the snapshots were saved at every 100 ps. Bond distance, bond angle, dihedral angles, solvent-accessible surface area (SASA), Coulombic and van der Waals (vdW) interactions, root-mean-square-deviation (rmsd), root-mean-square-fluctuation (RMSF), and values for backbone, alpha carbon, and heavy atoms were analyzed from MD simulation. MD snapshots were collected to evaluate the interactions in peptide−protein complexes over 150 ns. A total of 150 MD snapshots were selected for the binding free energy calculations by PRODIGY server, 26 which measures the free energy based on intermolecular contacts and properties derived from the noninterface surface. Different multivariate energy factors were analyzed by employing the principle component analysis (PCA) method to understand the structural and energetic changes of proteins in the presence of the peptide during MD simulation. The structural and energy information including bond distances, bond angles, dihedral angles, planarity, vdW energies, and electrostatic energies were considered. PCA analysis can disclose the hidden structural and energy profile among different groups. 27, 28 The last 50 ns of the MD trajectory data for both Apo-RBD and peptide−protein complexes were considered for PCA analysis. Data were preprocessed using centering and scaling prior to this analysis. The multivariate factors were arranged in the X matrix and reduced into a product of two new matrices by using the following equation. Here, T k is the matrix of scores, which signifies the relation of samples with each other, P k is the matrix of loadings, carrying information about the relation of variables to each other, k is the number of factors into the model, and E is the unmodeled variance. For performing all the calculations, R 29,30 -based inhouse-developed codes were used. Peptide SAR Analysis. Peptide SAR was performed considering the best 15 peptides. Relevant peptide properties including acidic (A), basic (B), aromatic (AR), polar (P), nonpolar (NP) amino acids, net charge at pH 7, molecular weight, and approximate volume (Table S5 ) are calculated using the ProtParam tool. 30 Initially, stepwise multiple linear regression (MLR) was performed considering these properties as variables to predict the calculated binding affinity of the test peptides with the RBD of the SARS CoV-2 spike protein. Subsequently, PCA was performed taking the five most important peptide properties to cluster the test peptides in a biplot to explore the structural variance. Peptide-Binding Affinity and Interaction. The amino acid sequence, length, and in vitro inhibition efficiency (these data are collected from the peptide database AVPdb 16 ) against SARS-CoV-1 of 51 peptides are summarized in Table S1 . All 51 peptides were docked to the RBD of the SARS CoV-2 spike protein using PatchDock. The binding pockets of the RBD were specified during the molecular docking. The best 1000 peptide−protein complexes obtained from PatchDock were submitted to FireDock for subsequent refinement. Only the complexes that showed the higher binding affinity and the expected binding interaction were chosen as the best candidates ( Table 1 ). The AH of the ACE2 peptidase domain (PD) is considered as a control peptide. The binding affinity of all peptides is tabulated in Out of 51 peptides, 27 peptides showed satisfactory binding interaction when docked using Cluspro2.0 (Table S2) . Moreover, strong binding affinity was also observed for S2P3 and S2P26, which agreed with FireDock results. Although S2P5, S2P10, S2P35, and S2P49 exhibited better affinity, they shifted away from the binding pocket. In HADDOCK results, nine peptides displayed more favorable docking scores, namely, S2P1, S2P3, S2P9, S2P25, S2P26, S2P28, S2P30, S2P34, and S2P43. However, none of these peptides exceeded the control AH in terms of binding affinity (Table S2) . Various residues including Glu484, Tyr449, and Tyr505 present in the ACE2 binding site of the RBD were involved in noncovalent interaction with the antiviral peptides ( Figure 1a) . Notably, Glu484 and Tyr449 exhibited multiple interactions with several antiviral peptides, indicating that these residues might be crucial for attachment with peptides ( Figure 1c) . Other important residues that also interacted with the antiviral peptides are Gln493, Leu455, Tyr453, and Tyr489. Hydrogen bonding played a crucial role in peptide−RBD interaction, contributing 53% of all interactions (Figure 1b) . Besides hydrogen bonding, hydrophobic interactions contributed to 42%, while electrostatic interactions were involved in only 5% of the total interactions. MD Simulation. MD simulations of Apo-RBD and complexes of AH, S2P1, S2P3, S2P25, S2P26, S2P28, and S2P30 were performed. S2P1 and S2P3 showed significant changes in rmsd (Figure 2a) . When MD snapshots were analyzed, it became clear that such changes in rmsd were not due to the change in protein conformation, rather it could be attributed to the movement of the peptide. Both peptides, S2P1 and S2P3, were found to be deviated from their binding interface, although S2P1 was more deviated than S2P3. The AH-RBD complex remained stable over the simulation period, as indicated by its rmsd profile and respective snapshots (Figure 3a ). Although S2P25, S2P26, S2P28, and S2P30 complexes exhibited a slightly greater rmsd, their respective MD snapshots (Figure 3b ,c) revealed that these peptides occupied the binding interface and remained as stable complexes throughout the simulation period. The S2P28-RBD complex exhibited lower radius of gyration (Rg) values compared to the AH-RBD complex, indicating that this peptide induced the compactness in the RBD upon binding (Figure 2b ). Overall, a trend of reduction in the SASA was detected for all complexes (Figure 2c) . Nonetheless, the most prominent downgrading in the SASA was observed in the case of S2P28-RBD complexes, thus confirming the induction of (Figure 2d) . A high RMSF value illustrated the S2P1 displacement from its binding site. However, high fluctuations were observed in regions spanning residue numbers 474−487 and 517−521 in the RBD for all complexes. This is not unexpected because these regions correspond to loops which lack any definite geometry. Binding free energies of AH-RBD, S2P25-RBD, S2P26-RBD, S2P28-RBD, and S2P30-RBD complexes were calculated, for which AH showed an average binding energy of −11.13 ± 0.03 kcal/ mol which was the highest compared to other peptides ( Figure 4a ). The average binding affinities of S2P25 and S2P26 were found to be better than those of S2P28 and S2P30. Overall, The Journal of Physical Chemistry B pubs.acs.org/JPCB Article MD simulation suggests that S2P25 and S2P26 could be our potential candidates. A PCA model including eight training sets (Apo-RBD and seven peptide−RBD complexes) is generated to understand structural and energy changes in the peptide−protein complexes relative to Apo-RBD during MD simulation. Here, the first two PCs explain 93.1% of variance, where PC1 explains 68% and PC2 explains 25.1% of variance. In the score plot of PC1 and PC2 (Figure 4c ), Apo-RBD shows a major rightward shift relative to all the peptide−protein complexes along PC1. This clustering pattern is usual because majority of the variables, that is, Coulomb energy, angle, bond distance, and vdW energies (Figure 4d ) have largely influenced the variance along PC1. The RBD-S2P28 complex is at the farthest Interaction of S2P26, S2P25, and AH with the RBD Obtained from MD Simulation. In the S2P26-RBD complex, Tyr505 and Tyr489 in the RBD exhibited remarkable interactions over the 150 ns simulation period (Figure 5a ), whereas Asn487, Leu455, Tyr473, Gln493, and Phe456 also interacted frequently. Most notable residues found in the S2P26 peptide were Pro7, Tyr21, Cys4, and Tyr5 (Figure 5b) . Interaction between S2P26 and the RBD was elevated by both hydrogen bonding and hydrophobic interactions (Figure 5c ). Tyr505 in the RBD formed hydrophobic interaction and hydrogen bonds with Gly20 and Tyr21 in S2P26 (Figure 5d ). However, Tyr489 showed hydrophobic interaction only with Tyr5 and Pro7 in the S2P26 peptide. Hydrophobic interaction (49%) and hydrogen bonding (47%) contributed to most of the interactions between S2P26 and the RBD. In the S2P25-RBD complex, Ala475 and Arg403 present in the RBD were involved in multiple interactions during the simulation period ( Figure S1a ). Other residues such as Tyr489, Tyr473, Tyr505, and Phe456 were also detected. In the S2P25 peptide, Tyr29, Cys10, His13, Cys12, and Phe1 residues were involved in such interactions ( Figure S1b ). Figure S1c illustrates that the interactions between S2P25 and the RBD were mostly governed by hydrogen bonds covering around 52% of the total interactions. Arg403 in the RBD showed major interaction with Glu18 in S2P25 through electrostatic and hydrogen bonding ( Figure S1d ). In the AH-RBD complex, Arg403 and Lys417 residues in the RBD showed significant interactions during the 150 ns MD simulation. The RBD residue Arg403 interacted with the Glu18 in AH through electrostatic and hydrogen bonding, whereas Lys417 interacted with Asp11 and His15 by electrostatic, hydrophobic, and hydrogen bonding ( Figure S2 ). MD simulation results suggested that the Arg403 and Lys417 in the RBD were essential for strong binding of ACE2 with the RBD of the spike protein. Besides these residues, Asn501, Gln493, Tyr505, Tyr489, Gln498, Gly496, and Leu455 were involved in interaction with AH residues including Glu18, His15, Gln23, Lys12, Gln5, Tyr22, and Asp11 ( Figure S2 ). Overall, Tyr489 and Tyr505 residues in the RBD commonly participated in all stable peptide−protein complexes ( Figure 4b ). These residues are present in the "finger-like" projections of the RBD (involved in ACE2 receptor binding), which suggests that these projections and specifically these residues are crucial in peptide−RBD binding. Structure−Activity Relationship. MLR analysis is executed with the most relevant peptide properties to sort out the significant predictors of the binding affinity of the test peptides (Table S3) . Aromatic, nonpolar, and polar residues are found to be the most significant predictors, which explains the observed dominance of hydrogen and hydrophobic interactions (95% in total) of the peptide−RBD complexes. In other words, tyrosine and polar residues stabilize the peptide−protein complexes by hydrogen bonding interactions, whereas nonpolar and other aromatic residues stabilize the complexes by hydrophobic interactions. This regression model also holds true in the peptide−RBD dynamic interactions in 150 ns MD simulation as 10 high-frequency residues of the best-performing peptides (S2P25 and S2P26) in contact with the RBD are predominantly aromatic, nonpolar, or polar (Table S4 ). In addition, the clustering behavior of the top 15 peptides based on the most significant five peptide properties are analyzed to get an insight into their structural variance. In the generated biplot, the clustering pattern of S2P1, S2P3, S2P25, S2P26, and AH replicated the energy score plot of their complexes with the RBD, except that S2P26 is close to AH ( Figure 6 ). Besides, nonpolar, polar, and aromatic residues play a significant role in the clustering pattern of the test peptides as nonpolar and polar residues are heavily loaded onto PC1 and the aromatic residue onto PC2, which altogether explains 76.34% structural variance. Designing and developing high-affinity antiviral peptides represent a promising therapeutic strategy for COVID-19 treatment. Encompassing the extended protein contact interface, high-affinity antiviral peptides can strongly inhibit the RBD of the spike protein, thus blocking the SARS-CoV-2 from entering cells and subsequent replication. Although ACE2based peptide inhibitors are suggested, our results demonstrated that over 15 peptides can show better binding affinity than the AH of ACE2. Details from MD simulation indicate that S2P25 and S2P26 could be the most promising antiviral peptide for SARS-CoV-2. Some critical residues of both the RBD and peptides are also observed by analyzing the residuespecific contact maps of these peptides. SAR reveals that by combing aromatic, polar, and nonpolar residues, one can further optimize these peptides to improve their binding affinity for the S protein. We anticipate that these peptides can serve as the next-generation antiviral therapeutics for the treatment of the COVID-19 disease. In addition, these antiviral peptides can be conjugated to gold nanoparticles that are expected to act as potent nanoinhibitors enhancing the antiviral activity. Our study provides valuable information for the rational design and development of peptide inhibitors against SARS-CoV-2 that can show high in vitro and in vivo efficacy. The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jpcb.0c05621. Sequence, length, inhibition efficiency, binding affinity, stepwise MLR analysis, interaction residues, and distribution of noncovalent interactions (PDF) Coronaviridae Study Group of the International Committee on Taxonomy of Viruses. The Species Severe Acute Respiratory Syndrome-Related Coronavirus: Classifying 2019-NCoV and Naming It SARS-CoV-2 Computational Design of ACE2-Based Peptide Inhibitors of SARS-CoV-2 Analysis of Therapeutic Targets for SARS-CoV-2 and Discovery of Potential Drugs by Computational Methods The spike protein of SARS-CoV -a target for vaccine and therapeutic development Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor Structure, Function, and Evolution of Coronavirus Spike Proteins Angiotensin-Converting Enzyme 2 Is a Functional Receptor for the SARS Coronavirus Expression and Functional Characterization Rare Driver Mutations in Head and Neck Squamous Cell Carcinomas Converge on NOTCH Signaling Genomic Characterisation and Epidemiology of 2019 Novel Coronavirus: Implications for Virus Origins and Receptor Binding Learning from the Past: Possible Urgent Prevention and Treatment Options for Severe Acute Respiratory Infections Caused by 2019-nCoV The Proximal Origin of SARS-CoV-2 Structure Analysis of the Receptor Binding of 2019-NCoV Future Directions for Peptide Therapeutics Development. Drug Discovery Today The First-in-Class Peptide Binder to the SARS-CoV-2 Spike Protein AVPdb: A Database of Experimentally Validated Antiviral Peptides Targeting Medically Important Viruses CABS-Fold: Server for the de Novo and Consensus-Based Prediction of Protein Structure Structural Basis of Receptor Recognition by SARS-CoV-2 PatchDock and SymmDock: Servers for Rigid and Symmetric Docking Fast Interaction Refinement in Molecular Docking The ClusPro web server for proteinprotein docking The HADDOCK2.2 Web Server: User-Friendly Integrative Modeling of Biomolecular Complexes Making Optimal Use of Empirical Energy Functions: Force-Field Parameterization in Crystal Space Ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from Ff99SB Particle Mesh Ewald: An N· log(N) Method for Ewald Sums in Large Systems PRODIGY: A Web Server for Predicting the Binding Affinity of Protein−Protein Complexes A Molecular Modeling Approach to Identify Effective Antiviral Phytochemicals against the Main Protease of SARS-CoV-2 Prediction of Deleterious Non-Synonymous SNPs of Human STK11 Gene by Combining Algorithms Protein Identification and Analysis Tools on the ExPASy Server. The Proteomics Protocols Handbook The authors declare no competing financial interest. We are grateful to our donors who supported to build a computational platform (http://grc-bd.org/donate/). The authors like to acknowledge the World Academy of Science (TWAS) to purchase the high-performance computer for performing MD simulation. The Journal of Physical Chemistry B pubs.acs.org/JPCB Article