key: cord-0770249-wsj732d9 authors: Aghaee, Elham; Ghodrati, Marzieh; Ghasemi, Jahan B. title: In silico exploration of novel protease inhibitors against coronavirus 2019 (COVID-19) date: 2021-01-12 journal: Inform Med Unlocked DOI: 10.1016/j.imu.2021.100516 sha: becf35435a143cd5fdea88caefc8d1d453a73b91 doc_id: 770249 cord_uid: wsj732d9 The spread of SARS-CoV-2 has affected human health globally. Hence, it is necessary to rapidly find the drug-candidates that can be used to treat the infection. Since the main protease (M(pro)) is the key protein in the virus's life cycle, M(pro) is served as one of the critical targets of antiviral treatment. We employed virtual screening tools to search for new inhibitors to accelerate the drug discovery process. The hit compounds were subsequently docked into the active site of SARS-CoV-2 main protease and ranked by their binding energy. Furthermore, in-silico ADME studies were performed to probe for adoption with the standard ranges. Finally, molecular dynamics simulations were applied to study the protein-drug complex's fluctuation over time in an aqueous medium. This study indicates that the interaction energy of the top ten retrieved compounds with COVID-19 main protease is much higher than the interaction energy of some currently in use protease drugs such as ML188, nelfinavir, lopinavir, ritonavir, and α-ketoamide. Among the discovered compounds, Pubchem44326934 showed druglike properties and was further analyzed by MD and MM/PBSA approaches. Besides, the constant binding free energy over MD trajectories suggests a probable drug possessing antiviral properties. MD simulations demonstrate that GLU166 and GLN189 are the most important residues of M(pro,) which interact with inhibitors. The new coronavirus disease is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It is highly contagious and has forced many local and government officials to step in to slow down the rate of infection since the virus was first detected in the December of 2019 in Wuhan, China [1] . The pandemic caused by this virus gave rise to devastating health threats and has created a new front-line for discovering effective drugs and new vaccines. Coronaviruses (CoVs) are a single-stranded positive-sense RNA genome coated with a membrane envelope. This membrane is covered with spike J o u r n a l P r e -p r o o f glycoprotein, giving a crown shape to the virus. The length of RNA genomes is 26-32 kilobases containing 6 to 12 open reading frames (ORFs) [2, 3] . The first twothirds of the whole genome of coronavirus encodes two polyproteins that are divided into 15 or 16 nonstructural proteins denoted as nsp1 to nsp16. The rest of the ORFs holds the genetic codes of four essential structural proteins: envelop (E), membrane (M), nucleocapsid (N), and spike (S) proteins. These proteins play significant roles, from the virus's survival to viral development and viral attack capabilities. [4] [5] [6] [7] . For the virus to enter the cell and cause infection, the S protein has to interact with the host cellular receptors -specifically, the angiotensin-converting enzyme 2 (ACE2) receptors -and have the host translational machinery hijacked [8] [9] [10] [11] . When done, the host cell proteins get exposed to the virus. The ORF lab translates a polyprotein in which this protein breaks into sixteen nsp proteins, namely nsp1 -nsp16 [12, 13] . Among 16 nsp proteins, nsp5 or Main protease (M pro ) is a key coronavirus enzyme for viral replication and transcription and makes it one of the most popular drug targets [14] [15] [16] [17] . M pro inhibitors bind to M pro and prevent this viral replication and block the proteolytic cleavage of protein precursors essential for the infection's inception. Research-based on homology modeling, molecular docking, and binding free energy calculations explored by Xu et al. identified that nelfinavir has a potential inhibitory against COVID-19 main protease [18] . Lopinavir and ritonavir are other protease inhibitors currently being used for treating HIV patients. Since the sequence of SARS-CoV-2 main protease is similar to other CoV proteases such as HIV, China's national health commission has proposed using these drugs for COVID-19, despite there is no official approval of these drugs to treat COVID-19. Kumar and colleagues screened the FDA-approved antiviral drugs through J o u r n a l P r e -p r o o f molecular docking and performed molecular dynamics simulation of the top three selected drugs (lopinavir-ritonavir, tipranavir, and raltegravir) with the main protease of SARS-CoV [19] . They declared that these drugs show the best interaction with M pro and result in a stable complex. Alpha-ketoamides are other inhibitors of the coronavirus main protease. Zhang et al. designed and synthesized α-ketoamides as broad-spectrum inhibitors of the main proteases of coronaviruses. They modified their former best inhibitor to increase the half-life of inhibitor in plasma, increase the solubility, and decrease the inhibitor's binding to plasma proteins. Then they reported the X-ray structure of SARS-CoV-2 M pro in complex with α-ketoamide [20] . By using computer-aided drug design, Jin and colleagues identified the N3 inhibitor and then determined the crystal structure of COVID-19 M pro in complex with this inhibitor. Subsequently, they performed structure-based virtual screening of over 10,000 compounds (approved drugs, drug candidates in clinical trials, and other pharmacologically active compounds as inhibitors of M pro ). They also evaluated the in vitro inhibitory of the top six selected compounds in cell-based assays. They subsequently claimed that these compounds could inhibit the COVID-19 M pro with IC 50 values ranging from 0.67 to 21.4 µM [14] . This work aims to introduce novel Mpro inhibitors by using pharmacophore modeling, virtual screening, molecular docking, and molecular dynamics simulation. A pharmacophore model is a combination of electronic and steric features essential for interacting with a particular receptor to activate or block its biological activity. Virtual screening is a method of discovering new active molecules and is widely used in drug discovery. In this method, large databases of small molecules are screened to find structures with a high affinity toward the target receptor. Molecular docking is a computational J o u r n a l P r e -p r o o f method of identifying the essential interactions between a drug and its receptor. Molecular dynamics (MD) is a computer simulation method for investigating atoms and molecules' physical movements. It can be used for studying the structure of a biomolecular system. For a while, the atoms and molecules are free to move and interact, so a dynamic system is observed. [21] [22] [23] . For performing a quick search of large compound databases, the Pharmit web service (http://pharmit.csb.pitt.edu) was utilized to screen the databases using pharmacophores, molecular shape, and energy minimization. The investigated compounds were transferred into Discovery Studio 4.1 and docked into the pocket of M pro using CDOCKER (CHRMm-based DOCKER) algorithm and ranked by their CDOCKER energy. However, in the docking methods, a rigid protein is considered, and water molecules were deleted. These deficiencies reduce the accuracy of the prediction of docking methods. Molecular dynamics (MD) simulations resolve protein flexibility and solvent problems [24, 25] . But the MD computational calculations are very time-consuming, so these simulations cannot screen large databases, and just a few ligand-protein systems can be studied. Consequently, at first, a fast docking method is performed to search large databases, and later on, more accurate but expensive MD simulations are applied for just a few molecules [26] [27] [28] . Eventually, in silico ADME studies were carried out on deriving molecules to investigate the pharmacokinetic parameters of the discovered compounds. The computational studies were carried out on an Intel Xeon (R) CPU E5-2650 v2 @ 2.60GHz × 32 core, 32GB RAM system. The GPU unit used for molecular modeling and dynamic simulations was Nvidia GeForce GTX (4GB), running on Linux Ubuntu 16.04 LTS. Pharmit website (http://pharmit.csb.pitt.edu) was employed to construct the pharmacophore model using the crystal structure of COVID-19 main protease in complex with the N3 inhibitor (pdb: 6LU7) (http://www.pdb.org). A pharmacophore defines the necessary features of interaction. The coordinates of features were determined by calculating the average of all atoms' coordinates in their SMARTS chemical expression. The Pharmit, searches the selected databases using the Pharmer search method. This search method is similar but different from geometric hashing and the generalized Hough methods (two object recognition methods). An extensive explanation of Pharmer algorithm was given in the article written by David Ryan Koes and Carlos J. Camacho [29] . In this study, 7 features were used to construct the model: four donors, one acceptor, and two hydrophobic features. The shape constraints were also applied to ensure no heavy atom center within the receptor (exclusive shape) and at least one heavy atom center of the hits places within the inclusive shape (ligand). The exclusive shape eliminates molecules following the pharmacophore but has considerable steric conflicts with the receptor [30] . In order to validate the pharmacophore model, a set of decoys were built based on N3 and active inhibitors obtained from previous researches [14, 31] using the DUD.E web tool (http://dude.docking.org/) [32] . which accounts for the flexibility of the target protein's side chains [35] . For evaluation of the trustworthiness of the docking outcomes and the study of the changes of the protein-drug complex over time in an aqueous medium, molecular dynamics simulations (MD) have proceeded with the GROMACS 5.1.2 simulation package [36] . GROMOS96 53a6 force field [37] was applied to simulate the protein-ligand system. The PRODRG 2.5 server was utilized to generate the molecular topology files [38] . The ligand-protein complex structure J o u r n a l P r e -p r o o f was solvated with simple point charge (SPC) water molecules in a dodecahedral box [39] . The ligand-protein complex was centralized in the box with a minimum distance of 1 nm between the complex and the edge of the box. Four Na + ions were added to neutralize the charge of the system. To minimize the system, the steepest descent integrator for 50,000 steps, up to a tolerance of 10 kJ/mol without any constraints, was performed. Afterwards, the system was The g_mmpbsa application was proceeded to calculate the free energy between M pro and the discovered drug. g_mmpbsa is a console application which is executed from terminal/console by command options similar to other GROMACS module [42, 43] . This application was downloaded from https://rashmikumari.github.io/g_mmpbsa/ website. The four crucial subjects in pharmacokinetics are absorption, distribution, metabolism, and excretion (ADME). The compound possessing the best binding energy with the receptor might not be the best drug. A good drug should entirely and quickly absorb from the gastrointestinal tract, explicitly distributed to its target, metabolized in a manner that does not immediately stop its activity, and finally got out without resulting any harm [44] . A relationship between physiological parameters and chemical structures exists; thus, chemical descriptors can be used to calculate pharmacokinetic properties. The blood-brain barrier (BBB) was used to predict the blood-brain penetration after oral consumption. The cytochrome P450 2D6 (CYP2D6) inhibition was measured using 2D chemical structure as input. CYP2D6 conducts the metabolism of a wide range of compounds in the liver. So, inhibition of CYP2D6 activity by a drug creates the problem of a drug interaction. Thus, it is essential to measure the CYP2D6 inhibition as a part of the drug development process. The hepatotoxicity model was performed to predict the potential toxicity of compounds. The plasma protein binding model (PPB) was conducted to predict whether a molecule is probable to bound tightly (≥90% bound) to the plasma carrier proteins. CDOCKER score is the negative value of CDOCKER_ENERGY that a higher value is related to a higher affinity of binding. Thus, the energy can be applied like a score. This score contains internal ligand and receptor-ligand interaction energies. The top-ranked compounds are listed in Table 1 . Table 1 . CDOCKER energy, Gold score, ADME prediction of molecules obtained from virtual screening using a SwissADME website, and b ADMET protocol of Discovery Studio software. Furthermore, docking of some protease inhibitor drugs such as ML188, nelfinavir, lopinavir, ritonavir, and α-ketoamide was conducted for comparison purposes. The 2D interactions of these protease inhibitors are presented in Fig. 5 . It is clearly showed that the majority of interactions are electrostatic and colored in violet. Remdesivir interacts with the viral RNA-dependent RNA polymerase, but some articles stated that remdesivir is a protease inhibitor [46, 47] , so we decided to dock remdesivir into the main protease. The resulting CDOCKER energy and J o u r n a l P r e -p r o o f GOLD score value of remdesivir is very low, demonstrates that remdesivir is not a protease inhibitor. It is interesting to see that our discovered compounds show much more binding energy with COVID-19 M pro (CDOCKER energy of the discovered molecule range from 73.99 to 100.09 kcal/mol, while the CDOCKER energy of the drugs currently in use range from 21.04 to 61.21 kcal/mol). These findings suggest the newly discovered compounds as more potent inhibitors ( Table 2 ). Table 2 . Prediction of ADME properties of some currently in use protease inhibitors of covid19 by a SwissADME website, and b ADMET protocol of Discovery Studio software. For the investigation of the pharmacokinetic parameters of retrieved compounds, logP o/w for octanol/water, aqueous solubility (log S), human oral absorption in the gastrointestinal tract (GI), blood-brain barrier (BBB), skinpermeability coefficient (log Kp), CYP2D6 inhibitor, Hepatotoxicity, and plasma protein binding (PPB) were computed (Table 1 ). An extreme hydrophobic drug reveals low solubility in the gut and solvate in fat globules [48] . Based on solubility level, all compounds have druglikness properties. As shown in Table 1 The resulting hepatotoxicity model predicts the potential toxicity of some of the compounds (Table 1) . Moreover, a drug's efficiency can be affected by binding to the plasma protein as only the free fraction exhibits pharmacological effects. The plasma protein binding model (PPB) predicts that non of the molecules can bound tightly to the plasma carrier proteins. In addition, most orally administered drugs are relatively small molecules with molecular weight less than 500Da. As can be observed, only two compounds have a molecular mass of less than 500 (g/mol). According to the resulting ADME parameters, only Pubchem44326934 has high gastrointestinal absorption (GI), molecular mass less than 500Da, no toxicity, and has druglike properties (based on Lipinski's rule of 5). However, the discovered compounds can be modified to show more druglike properties. J o u r n a l P r e -p r o o f The binding stability of the M pro -drug complex was analyzed through molecular dynamics simulation. The structure of the M pro -drug complex was obtained from molecular docking and was simulated for 100 ns using the Fig. 9 shows five water-mediated H-bonds. Residues with electrostatic and Van der Waals interactions are colored in violet and green, respectively. Two hydrogen bonds can be observed between Glu166 and ligand, while a π-sigma interaction presents between Pubchem44326934 and His41. The MM/PBSA approach was applied to calculate the free energy between M pro and the discovered drug. Fig. 10 In this work, more potent coronavirus protease inhibitors were identified. Based on the crystal structure of COVID-19 M pro with N3 inhibitor, pharmacophore-based virtual screening was conducted. The resulting compounds were docked into the protease pocket and ranked by their interaction energy. In silico ADME studies were performed on docking outcomes to see whether the compounds are suitable to be considered as a drug or not. Pubchem44326934 showed the druglike properties and was further analyzed by MD and MM/PBSA approaches, confirmed that it could be regarded as a new potent COVID-19 M pro inhibitor drug. Future structural-functional studies are recommended to improve the ADME properties of the screened compounds. However, the discovered compounds should be tested, and their in vitro potential should be determined for further validation of the results. Table 2 . Prediction of ADME properties of some potential protease inhibitors of covid19 using a SwissADME website, and b ADMET protocol of Discovery Studio software. Identification of potential molecules against COVID-19 main protease through structure-guided virtual screening approach Systemic in silico screening in drug discovery for Coronavirus Disease (COVID-19) with an online interactive web server Molecular epidemiology, evolution and phylogeny of SARS coronavirus Molecular biology of severe acute respiratory syndrome coronavirus Coronavirus pathogenesis Coronavirus genome structure and replication SARS coronavirus accessory proteins Structural basis for the recognition of the SARS-CoV-2 by full-length human ACE2 Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor The secret life of ACE2 as a receptor for the SARS virus SARS-CoV ORF1b-encoded nonstructural proteins 12-16: replicative enzymes as antiviral targets SARS coronavirus replicase proteins in pathogenesis Structure of M pro from COVID-19 virus and discovery of its inhibitors A mechanistic view of enzyme inhibition and peptide hydrolysis in the active site of the SARS-CoV 3C-like peptidase Binding mechanism of coronavirus main proteinase with ligands and its implication to drug design against SARS Autoprocessing mechanism of severe acute respiratory syndrome coronavirus 3C-like protease (SARS-CoV 3CLpro) from its polyproteins nelfinavir was predicted to be a potential inhibitor of 2019-nCov main protease by an integrative approach combining homology modelling, molecular docking and binding free energy calculation In silico prediction of potential inhibitors for the main protease of SARS-CoV-2 using molecular docking and dynamics simulation based drugrepurposing Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved α-ketoamide inhibitors Pharmacophore Identification, Molecular Docking, Virtual Screening, and In Silico ADME Studies of Non-Nucleoside Reverse Transcriptase Inhibitors N-Heterocyclic (4-Phenylpiperazin-1-yl)methanones Derived from Phenoxazine and Phenothiazine as Highly Potent Inhibitors of Tubulin Polymerization Combined docking, molecular dynamics simulations and spectroscopic studies for the rational design of a dipeptide ligand for affinity chromatography separation of human serum albumin Combining docking and molecular dynamic simulations in drug design 3D-QSAR, docking and molecular dynamics for factor Xa inhibitors as anticoagulant agents A combination of docking/dynamics simulations and pharmacophoric modeling to discover new dual c-Src/Abl kinase inhibitors Computational studies of the binding mechanism of calmodulin with chrysin Computational sampling of a cryptic drug binding site in a protein receptor: explicit solvent molecular dynamics and inhibitor docking to p38 map kinase Pharmer: Efficient and Exact Pharmacophore Search Pharmit: interactive exploration of chemical space Ketoamides as Broad-Spectrum Inhibitors of Coronavirus and Enterovirus Replication: Structure-Based Design, Synthesis, and Activity Assessment Directory of Useful Decoys, Enhanced (DUD-E): Better Ligands and Decoys for Better Benchmarking Detailed Analysis of Grid-Based Molecular Docking: A Case Study of CDOCKER -A CHARMm-Based MD Docking Algorithm Development of accurate binding affinity predictions of novel rennin inhibitors through molecular docking studies GROMACS: fast, flexible, and free A biomolecular force field based on the free enthalpy of hydration and solvation: the GROMOS forcefield parameter sets 53A5 and 53A6 PRODRG -a tool for high-throughput crystallography of protein-ligand complexes Interaction models for water in relation to protein hydration, In: Pullman B (ed) Intermolecular forces. The Jerusalem Symposia on Quantum Chemistry and Biochemistry LINCS: a linear constraint solver for molecular simulations Particle Mesh Ewald-an N.Log(N) method for Ewald sums in large systems g_mmpbsa -A GROMACS tool for high-throughput MM-PBSA calculations Electrostatics of nanosystems: application to microtubules and the ribosome ADMET-turning chemicals into drugs GS-5734) as a therapeutic option of 2019-nCOV main protease -in silico approach Binding site analysis of potential protease inhibitors of COVID-19 using AutoDock Polar Molecular Surface Properties Predict the Intestinal Absorption of Drugs in Humans