key: cord-311415-wwwqqvca authors: Alamri, Mubarak A.; Tahir ul Qamar, Muhammad; Mirza, Muhammad Usman; Bhadane, Rajendra; Alqahtani, Safar M.; Muneer, Iqra; Froeyen, Matheus; Salo-Ahen, Outi M. H. title: Pharmacoinformatics and molecular dynamics simulation studies reveal potential covalent and FDA-approved inhibitors of SARS-CoV-2 main protease 3CL(pro) date: 2020-06-24 journal: J Biomol Struct Dyn DOI: 10.1080/07391102.2020.1782768 sha: doc_id: 311415 cord_uid: wwwqqvca The SARS-CoV-2 was confirmed to cause the global pandemic of coronavirus disease 2019 (COVID-19). The 3-chymotrypsin-like protease (3CLpro), an essential enzyme for viral replication, is a valid target to combat SARS-CoV and MERS-CoV. In this work, we present a structure-based study to identify potential covalent inhibitors containing a variety of chemical warheads. The targeted Asinex Focused Covalent (AFCL) library was screened based on different reaction types and potential covalent inhibitors were identified. In addition, we screened FDA-approved protease inhibitors to find candidates to be repurposed against SARS-CoV-2 3CLpro. A number of compounds with significant covalent docking scores were identified. These compounds were able to establish a covalent bond (C–S) with the reactive thiol group of Cys145 and to form favorable interactions with residues lining the substrate-binding site. Moreover, paritaprevir and simeprevir from FDA-approved protease inhibitors were identified as potential inhibitors of SARS-CoV-2 3CLpro. The mechanism and dynamic stability of binding between the identified compounds and SARS-CoV-2 3CLpro were characterized by molecular dynamics (MD) simulations. The identified compounds are potential inhibitors worthy of further development as COVID-19 drugs. Importantly, the identified FDA-approved anti-hepatitis-C virus (HCV) drugs paritaprevir and simeprevir could be ready for clinical trials to treat infected patients and help curb COVID-19. Communicated by Ramaswamy H. Sarma In the last two decades, several pathogens have spilled over and caused outbreaks. Among them, emergence and reemergence of coronavirus-related epidemics have widely spread fatal respiratory illnesses . Coronaviruses are enveloped RNA viruses that are distributed broadly among humans, other mammals, and birds, and cause respiratory, enteric, hepatic, and neurologic diseases (Weiss & Leibowitz, 2011) . Six coronavirus strains are known to cause human diseases. Four strains (229E, OC43, NL63, and HKU1) are prevalent and typically cause common cold symptoms in immunocompetent individuals. The two other strains (severe acute respiratory syndrome coronavirus, SARS-CoV, and Middle East respiratory syndrome coronavirus, MERS-CoV) are zoonotic in origin and have been linked to fatal illnesses (Cui et al., 2019; Su et al., 2016) . Recently, severe respiratory illness outbreak was reported. Follow-up studies found that this illness is linked with a new highly infectious strain of coronavirus SARS-CoV-2, and term it as coronavirus disease 2019 (COVID-19) Zhou et al., 2020; . COVID-19 has quickly turned into a global pandemic, affected millions of people, and claimed tens of thousands of human lives worldwide. Currently, there is no approved drug or vaccine to combat COVID-19 Tahir ul Qamar, Rehman, et al., 2020) . Several antiviral drugs are in clinical trials and only remdesivir appeared to be the most promising (Cao et al., 2020) which is being tested in patients with severe disease (Cao et al., 2020; Hillaker et al., 2020) . Later investigations have revealed that the SARS-CoV-2 belongs to the beta-corona-virus family and it is closely related to SARS-CoV (Tahir ul Qamar, Alqahtani, et al., 2020; Xu et al., 2020; Zhou et al., 2020) . Similar to other beta-corona-viruses, SARS-CoV-2 produces an 800-kDa polypeptide upon transcription of its genome . This polypeptide is proteolytically cleaved to generate various proteins including Spike (S) protein, envelope (E) protein, membrane (M) protein and nucleocapsid (N) protein . S protein is involved in viruscell receptor binding, while the other structural proteins E, M and N are crucial for virion assembly Zumla et al., 2016) . The proteolytic processing is mediated by papain-like protease (PL pro ) and 3-chymotrypsin-like protease (3CL pro ). 3CL pro (also termed as the main protease, M pro ) cleaves the polyprotein at 11 distinct sites to generate many of the non-structural proteins, including RNA-dependent RNA polymerase (RdRp) and helicase (Hel), which are important in viral transcription and replication (Zumla et al., 2016) . Thus, this main protease plays a critical role in transcription and replication of the virus (Shirato et al., 2013; Zumla et al., 2016) . Structure-based activity studies and various highthroughput studies have identified distinct inhibitors of SARS-CoV and MERS-CoV main proteases (Anand et al., 2003; Bacha et al., 2004; Kumar et al., 2017; Ryu et al., 2010) . Therefore, it is crucial to identify novel inhibitors of SARS-CoV-2 main protease to control COVID-19 (Farag et al., 2020; Haider et al., 2020; Wang, 2020) . On the other hand, main protease is highly conserved across coronaviruses. Therefore, it is a potential target for identification of compounds that could have broad spectrum anti-viral activity (McInnes, 2007; Mirza & Froeyen, 2020) . Covalent inhibition of therapeutic protein targets is becoming popular. Nearly 30% of the marketed drugs in the last two decades target enzymes by covalent binding mechanism (De Cesco et al., 2017; Lagoutte et al., 2017) . Covalent inhibitors form a chemical bond between an electrophilic group of the ligand and a nucleophilic residue (e.g. Cys) on the target protein (Mah et al., 2014) . The mechanism of action of covalent ligands is known to have two-steps. The mechanism initiates by the formation of a non-covalent complex essential to orient the electrophilic group in favor of bond formation with the nucleophilic residue of the target protein. The second step involves the formation of the covalent bond (Mah et al., 2014; Mukherjee & Grimster, 2018) . The nature of the electrophile dictates if covalent inhibition is reversible or irreversible. Irreversible inhibition caused by covalent bond formation results in a remarkably different profile as compared to the reversible counterpart (De Cesco et al., 2017) . Cysteine moieties are present in a range of diverse proteins such as proteases, oxidoreductases, and kinases and thus, take part in a variety of functions (Go et al., 2015) . Cysteine thiol is a highly reactive moiety due to its high electron density and polarizability and can therefore, be targeted with less reactive ligands allowing to minimize potential side effects (Flanagan et al., 2014; Poole, 2015; Schultz et al., 2006) . Hence, cysteine-targeted electrophiles can be employed to understand their effect on a wide range of proteins. For example, cysteine proteases including Cathepsin B, K, and S have been targeted by covalent inhibitors (vinyl sulfones, epoxides, isothiazolones, and ketoamides, as well as other chemical groups) (Toledo Warshaviak et al., 2014) . Hepatitis C virus (HCV) protease is also targeted covalently by the ketoamide groups of boceprevir (Rotella, 2013) and telaprevir (Kwong et al., 2011) . In this work, given the importance of cysteine targeting covalent inhibitors, we aimed to identify cysteine thiol targeting warheads (reactive groups in compounds). Specifically, this study is focused on targeting Cys145 of the catalytic dyad of SARS-CoV-2 main protease. Traditional methods for identification of inhibitors are expensive and time-consuming (Hou & Xu, 2004; Kolb & Sharpless, 2003; Mirza et al., 2019; Tahir ul Qamar et al., 2016; Xu, 2006) . Therefore, the use of in silico techniques for identification of potential inhibitors has gained importance in recent years (Ece, 2020; Er et al., 2018; Mirza et al., 2019; Tahir ul Qamar et al., 2017; Tahir ul Qamar et al., 2019; . The available small molecule databases can be utilized for structure and ligand-based virtual screening to identify novel scaffolds that could serve as hits or leads to be optimized for enhanced activity (McInnes, 2007; Mumtaz et al., 2017) . In this contribution, combined virtual screening approaches were used to identify potential covalent inhibitors of SARS-CoV-2 M pro from specific databases consisting of a variety of chemical warheads. In addition, FDA approved protease inhibitors were screened for the purpose of drug repurposing. Molecular dynamics (MD) simulations were utilized to investigate the binding mode of the potential inhibitors at the active site of SARS-CoV-2 main protease. The here discovered virtual hits could serve as a starting point to develop future drug candidates. Furthermore, the FDA approved protease inhibitors identified in this study warrant further in vitro evaluations. A multiple sequence and structure alignment analysis was carried out to find out the evolutionarily conserved functional residues among SARS-CoV-2, SARS-CoV and MERS-CoV that could be further targeted for the discovery of drug hits. Sequences and 3D structures of SARS-CoV-2 (PDB ID 6LU7) , SARS-CoV (PDB ID 2A5I) (Lee et al., 2005) and MERS-CoV (PDB ID 5WKK) (Galasiti Kankanamalage et al., 2018) main proteases were retrieved from the Protein Data Bank (PDB) (Berman et al., 2003) . The 3CL pro sequences were aligned using Clustal Omega (Sievers et al., 2011) . To ensure broad-spectrum relevance of these protein targets, the conserved functional residues within their active sites were analyzed through structural alignment as well. Structural alignment/superposition analysis was carried out using the Pymol Molecular Graphics System v1.3 program (Seeliger & de Groot, 2010) , while the interactive protein/ligand snapshots were generated with the UCSF Chimera v1.14 (Pettersen et al., 2004) . Two chemical libraries were used in this study, the commercially available Asinex Focused Covalent (AFCL) library that consists of 1000 molecules (http://www.asinex.com/) and the FDA-approved protease inhibitor library that includes in total 116 anti-viral compounds. The structures of these 116 inhibitors were downloaded individually from the PubChem database (https://pubchem.ncbi.nlm.nih.gov/). The AFCL database was used to identify potential compounds bearing reactive warheads, while the FDA-approved protease inhibitor database was used to search for existing anti-viral compounds that could be repurposed against SARS-CoV-2 main protease. Since the active site of SARS-CoV-2 main protease contains a catalytic cysteine, it is possible to target it with covalently binding compounds. For example, in the crystal structure of SARS-CoV-2 main protease (PDB ID 6LU7), the N3 peptide (a designed Michael acceptor inhibitor) is covalently attached to the Sc atom of C145. The Michael addition reaction has occurred with the Cb of the N3 vinyl group. Thus, we carried out a covalent docking protocol to screen the AFCL library, using Schr€ odinger's Covalent Docking (CovDock) tool as implemented in the Maestro molecular modelling package (release 2019-4; Schr€ odinger, LLC, New York, NY, 2019). In addition to Michael addition, we also explored the other reaction types implemented in CovDock, such as nucleophilic addition to double bond or triple bond. Protein Preparation Wizard (Sastry et al., 2013) of Maestro was used for preparing the target structure of SARS-CoV-2 main protease (PDB ID 6LU7). The missing hydrogen atoms were added, and the hydrogen bond network was optimized with PROPKA at pH 7.0. The protonation states of histidine residues at the protease active site were optimized interactively. The catalytic His41 and His164 were set as d-protonated (HID) while His163 was set as e protonated (HIE) and His172 as double protonated (HIP). In the active conformation of the substrate-binding site, His172, rather than His163 has been proposed to have a salt bridge interaction with Glu166 (Tan et al., 2005) , which is consistent with the target's active crystal conformation. The covalently bound N3 peptide inhibitor was removed from the protease active site and its structure was remodeled to the unreacted state and used in the validation docking employing the Michael addition reaction. Moreover, all water molecules were removed, and a restrained minimization was carried out using the newly optimized OPLS3e force field (Roos et al., 2019) and the convergence criteria of 0.3 Å root-mean-square deviation (RMSD) for all heavy atoms. The AFCL library was processed with the LigPrep tool of Maestro (release 2019-4). The missing hydrogen atoms were added and different tautomeric states for each ligand were generated at pH of 7.0 ± 2.0 with Epik (Shelley et al., 2007) . Maximum of two alternative stereoisomers per ligand were generated whilst retaining the specified chiralities. Finally, the OPLS3e force field was used to generate optimized low-energy 3D conformers of the ligands. The center of the docking site was defined at the centroid of the catalytic Cys145 residue and the box size was set for ligands with a length of 20 Å. For the N3 peptide re-docking, the box centroid was defined by four residues: Cys145, His163, Pro168 and Gln189 (the grid center xyz coordinates were either [À14.142582, 19.133875, 64.827842] or [À13.805531, 12.077630, 68.251552] , depending on in which order the residues were picked). At first, all the library compounds were docked using the Michael addition reaction and the fast-virtual screening mode of CovDock to investigate the feasibility of forming a covalent interaction with Cys145. All compounds that successfully formed a covalent complex with the protease (as well as the N3 peptide) were then docked using the more thorough pose prediction mode. The cut-off to retain poses for further refinement was set at the docking score of 2.5 kcal/mol and maximum number of 100 poses (200 or 300 for the N3 peptide). The poses were then ranked using the Prime (Jacobson et al., 2004) MM-GBSA score in the OPLS3e force field. The Prime/MM-GBSA (molecular mechanics-generalized Born surface area) calculation employs the variable dielectric generalized Born solvation model (VSGB 2.1) . The MM-GBSA binding free energy calculation does not take into account covalent binding and thus, the CovDock tool treats the compounds as non-covalent (capped) binders during the calculation. The above steps were then repeated for other reaction types implemented in CovDock (nucleophilic addition to a double and a triple bond, nucleophilic substitution, aryl and nitrile activated conjugate addition to alkyne). The top-ranking compounds were selected for further validation by MD simulations. Molecular (non-covalent) docking-based virtual screening was performed to search for potential inhibitors from the FDA-approved protease inhibitors library using AutoDock Vina v1.1.2 (Trott & Olson, 2010) in PyRx 0.8 virtual screening tool (Dallakyan & Olson, 2015) . Initially, the compounds were imported into the Open Babel tool (O'Boyle et al., 2011) implemented in PyRx 0.8 for energy minimization using the MMFF94 force field (Halgren, 1996a (Halgren, , 1996b . The same program was utilized to convert the compounds from the SDF format to the AutoDock Vina PDBQT format. The screening was carried out against the active site of SARS-CoV-2 (PDB ID 6LU7) in which the grid box was defined to cover the catalytic dyad residues His41 and Cys145 and other essential residues within the binding pocket. The protein was kept rigid and multiple docking poses were generated for each compound. After screening, the docked poses of the compounds were ranked on the basis of the docking score (kcal/mol) generated by AutoDock Vina (Trott & Olson, 2010) . The top compounds with docking scores lower than À8.0 kcal/mol were selected as hits for further analyses. The molecular interaction of the candidate compounds was studied by re-docking each compound individually against the SARS-CoV-2 3CL pro structure using AutoDock Vina v1.1.2 program (Trott & Olson, 2010) . The protein structure (PDB ID 6LU7) in monomer form was prepared using Discovery Studio Visualizer v2.5 (Biovia, 2017) by removing the N3 peptide ligand and crystallographic water molecules. The Gasteiger partial atomic charges were assigned and polar hydrogen atoms were added to the protein using AutoDockTools (Huey & Morris, 2008) . The same program was used to generate the PDBQT file of the protein. The docking grid of size 24 Å Â 22 Å Â 26 Å (X, Y, Z) was centered using the following xyz coordinates: [À1.549, 2.454, 7.117] to cover the active site. The molecular interactions and binding modes of the top-ranked docked poses were visually inspected using the Discovery Studio Visualizer v2.5 and Pymol Molecular Graphics System v1.3 (Seeliger & de Groot, 2010) . In furtherance of validation of the docking parameters, the co-crystallized inhibitor N3 was re-docked non-covalently at same active site of SARS-CoV-2 3CL pro to assess the ability of the AutoDock Vina docking parameters to reproduce the bioactive conformation. The inhibitory constant (pK i ) was calculated based on the docking energy score using the following Equations (22)-(24): pK i ¼ 10 ½Binding energy score=1:366 In order to investigate the binding mode of the most promising hit compounds inside the active site of SARS-CoV-2 3CL pro , molecular dynamics (MD) simulations of each complex were performed for a period of 50 ns. Complex stability and interaction profile were elucidated from the simulation trajectories. All simulations were performed with the AMBER 18 simulation package (Case et al. 2018) . The same MD simulation protocol was implemented as described previously (Ikram et al., 2019; Jabbar et al., 2018; , but the length of the production run was increased to 50 ns. Moreover, the parametrization was performed by creating a new molecular topology file for each compound, starting from ligand and ending in covalently bound Cys145. The tleap program of AMBER was used to generate the topology and coordinate files of the complexes. The Antechamber package of AmberTools was utilized and parameters for the ligands (using AM1-BCC charge definitions) were generated from the AMBER force field (GAFF) (Wang et al., 2004) . The charges of each simulation system were neutralized by adding counter ions around the ligand-protease complex that was centered in a dodecahedral TIP3P (Jorgensen et al., 1983) water box with a 10-Å distance between the solute and the box edge. The covalent bonds were constrained using the SHAKE algorithm to maintain constant bond length (Ryckaert et al., 1977) . After a stepwise minimization, the system was heated and equilibrated. Finally, a production run of 50 ns was performed at 300 K and 1 bar pressure. The time step was set to 2 fs and the trajectory snapshots were saved every 2 ps and analyzed using the CPPTRAJ program (Roe & Cheatham, 2013) of AMBER. The binding free energies (DG tol ) of SARS-CoV-2 3CL pro complexed with the most promising hit compounds were calculated using the MM-GBSA method, implemented in AMBER 18. Briefly, 5000 snapshots were generated for each system from the last 20 ns stable trajectories with an interval of 2 ps. The total binding free energy is calculated as a sum of the molecular mechanics binding energy (DE MM ) and solvation free energy (DG sol ) as given below: where, DE MM is further divided into internal energy (DE int ), electrostatic energy (DE ele ), and van der Waals energy (DE vdw ), and the total solvation free energy (DG sol ) is contributed by the sum of polar (DG p ) and non-polar (DG np ) components. The MM-GBSA approach is well illustrated in binding free energy calculations (Hou et al., 2011) for antiviral inhibitors (Srivastava & Sastry, 2012; Tan et al., 2006) . The sequence alignment showed that the SARS-CoV-2 3CL pro is 96.08% and 51.82% identical to SARS-CoV (PDB: 2A5I) and MERS-CoV (PDB: 5WKK) main proteases, respectively (see Figure S1 , Supplementary material). The sequence alignment also revealed that the catalytic dyad residues His41 and Cys145 of SARS-CoV-2 main protease, are conserved among SARS-CoV-2, SARS-CoV and MERS-CoV. Furthermore, the structural alignment/superposition of the main proteases of all 3 coronaviruses revealed conserved catalytic dyad residues His41 and Cys145, inlaid at the same position in the active site with an average RMSD of 0.12 Å (Figure 1 ). These findings were consistent with a recent study reported by Martin Stoermer (Stoermer, 2020) . Therefore, the covalent docking-based virtual screening was performed against the reactive nucleophilic Cys145 to identify compounds having reactive electrophilic moieties, while the molecular (noncovalent) docking-based virtual screening was performed against the catalytic dyad residues to identify potential FDAapproved protease inhibitors for drug repurposing. Superimposition of several co-crystallized SARS-CoV 3CL pro structures bound with various inhibitors (N1, PDB ID 1WOF; I2, PDB ID 2D2D; N3, PDB ID 2AMQ; N9, PDB ID 2AMD) and recently deposited SARS-CoV-2 3CL pro complexed with N3 (PDB ID 6LU7), revealed similar positions of subsites S1, S2, and S4 near the catalytic dyad that are crucial for substrate recognition (Chang et al., 2007; Yang et al., 2005) . In SARS-CoV-2 protease, the side chains of Phe140, His163, Glu166 and His172 constitute the S1 site. The side chains of Pro52 and Tyr54, alongside the alkyl and methylene moieties of the side chains of Asp187 and Glu47, constitute a deep hydrophobic S2 subsite. Met165, Leu167, Phe185 and Gln192 are involved in forming S4 subsite. As an essential validation step before virtual screening, the co-crystalized N3 peptide was re-docked by employing the Michael addition reaction. All the docked poses of the N3 peptide successfully formed a covalent bond between the Cb of the vinyl group and the thiol group of the catalytic cysteine. The best re-docked pose of the N3 peptide (using the first centroid xyz coordinates and maximum number of 300 refined poses, see the Methods) resembles the original crystal pose closely. Especially the deeply buried valine side chain and the other residues in the middle are well positioned, while the side chains at the ends of the peptide are somewhat shifted from the original pose (see Figure 2 (A)). As a reference, the MM-GBSA score for this pose was -79.80 kcal/mol and the covalent docking affinity score was À9.551 kcal/mol. On the other hand, selecting only maximum 200 poses for further refinement in the docking process did not yield as good docking poses in the end. The second centroid coordinate option gave also reasonably highly scored poses (MM-GBSA energy varying between À67 and À90 and the docking score between À7.434 and À9.994), but none of the poses was so closely resembling the crystal pose as the best pose from the docking run with the slightly different centroid coordinates. Interestingly, when we also docked the SRSSS isomer of the N3 peptide (which is SSSSS) (Figure 2(B) ), the MM-GBSA score for the best pose (from maximum number of 200 retained poses and from 5 final output poses) was as low as -98.97 kcal/mol and the covalent docking affinity score was À8.361 kcal/mol. The best pose also resembled the original crystal pose very closely apart from the 5-methylisoxazol ring containing side chain that had more space to move at the outer edge of the active site ( Figure S2, Supplementary material) . Moreover, we also assumed that different stereochemistry at this chiral center (SSSSS to SRSSS) could be utilized in drug design, particularly peptide inhibitor design for improved selectivity. In this regard, N3 already showed quite a good binding affinity. During the MD simulation, the RMSD of the N3 re-docked pose remained comparable to that of the co-crystalized pose. After small fluctuations, where the peptide tried to adopt a more favorable conformation, both poses displayed stability inside the substrate-binding pocket for the period of the last $10 ns (Figure 2(C) ). This clearly validates the covalent docking procedure with regard to the true binding mode reproducibility. Previous studies have demonstrated Cys145 a key residue in the active site of SARS-CoV 3CL pro , which makes it an important target for covalent inhibitors (Pillaiyar et al., 2016; Pillaiyar et al., 2020; Tang et al., 2020) . The virtual screening process includes two steps: (i) the selection of the candidate ligand with an appropriate pose with its reactive group in close proximity to Cys145; (ii) and a virtual chemical reaction between the reactive groups, leading to the formation of a stable covalent bond (Pillaiyar et al., 2020; Tang et al., 2020) . Ligand poses within the distance cut-offs (reacting pair of atoms within 5 Å) are kept, and a covalent bond (S-C) is formed according to the reaction type. For the identification of potential covalent inhibitors from the AFCL library containing a variety of chemical warheads, six possible covalent binding reactions were used: Michael addition, nucleophilic addition to a double and a triple bond, nucleophilic substitution, and aryl and nitrile activated conjugate addition to alkyne. As an integral part of the covalent docking protocol, the free energies of binding were calculated with the Prime/ MM-GBSA method for all the docked poses as described in the Methods. The compounds having the lowest Prime/MM-GBSA and/or CovDock scores were considered for further molecular inspection; thus, 19 compounds from the results of the Michael addition screening, together with 23 and 2 compounds from two other reaction types: nucleophilic addition to double bond and nitrile activated conjugate addition to alkyne respectively (details in SI, Figures S3-S7) . After careful inspection of the structures and poses of these compounds, only the best 3 inhibitor candidates, compounds (cmp) 51, 78, and 223 (one from each reaction type, ranking according to the CovDock score) were selected for a detailed interaction analysis and MD simulations to analyze the binding mode and the stability of the interactions inside the enzyme active site. The docking complexes of the three candidate inhibitors were investigated for the molecular interactions governed by the covalent bond formation. The structures of the candidate inhibitors, their molecular interactions within the substratebinding pocket and the corresponding reaction sites between the chemical warheads of the candidates and the catalytic Cys145 are displayed in Figure 3 . Cmp 223 formed a covalent bond between the electrophilic b-C atom of pyrrolidine moiety with the reactive nucleophilic thiol group of Cys145 (1.83 Å C-S distance), and the oxygen atom of its terminal furan ring established a H-bond with the side chain nitrogen atom (NE) of His163 (2.47 Å) located at S1 subsite. At S1 subsite, the furan moiety also had van der Waals (vdW) interactions with Glu166 and Phe140, while the other hydrophobic contacts include Met165 and His41 at S4 subsite. Cmp 78 was ranked the best compound from the 'nucleophilic addition to double bond' screen according to the CovDock score and established a covalent bond between the b-C to the indoline moiety and the thiol group of Cys145 (1.82 Å, C-S). It also formed a H-bond between the carbonyl carbon of indoline and His163 (2.65 Å) of S1 subsite. The structural analysis revealed that the indoline moiety of 78 formed hydrophobic interactions with the S1 residues and its terminal benzene extended in the direction of the S2 subsite, where it formed p-p stacking interaction with the catalytic His41. Cmp 51 was identified as a potential covalent inhibitor from the third reaction type, nitrile activated conjugate addition to alkyne (also among the best in the Michael addition screen, according to the CovDock score). Cmp 51 established a covalent bond (1.9 Å C-S distance) with its electrophilic b-C atom present between the indole moiety and the nitrile. The indoline moiety of 51 formed hydrophobic contacts and the side chain nitrogen atom established a H-bond with Glu166 (2.51 Å) located at S1 subsite. MD simulations are considered to be a reliable approach in investigating the underlying dynamic consequences of Zoomed-in SARS-CoV-2 3CL pro substrate-binding pocket labelled with subsites S1, S2 and S4. (C-E) Representation of the chemical reaction of the reactive thiol group of Cys145 with the reactive nucleophilic group of the hit compounds, and the corresponding covalently docked poses (green sticks) inside the substrate-binding site of SARS-CoV-2 3CL pro (white ribbon presentation, ligand-interacting amino acids are shown in sticks). Atom color code: carbongreen/white; nitrogenblue; oxygenred; sulfuryellow. Hydrogen atoms are omitted for clarity. and the apo (without ligand) conformation of the main protease was plotted for the 50-ns production simulations. Overall, the main protease remained more stable in the presence of the bound inhibitors when compared to its apo conformation that displayed greater fluctuation in its backbone. As shown in Figure 4 (A), the RMSD of all three complexes increased gradually in the beginning of the MD simulation. After about 20 ns, the RMSD values converged to 1-2 Å. The initial RMSD increase in the complexes was expected due to the flexible loop connected to the helical domain III. In general, the inhibitor interaction with the protein decreased the overall protein flexibility. Among these complexes, the M pro / cmp78 complex displayed the lowest averaged RMSD value (0.5 Å). This is possibly due to the compound's best interaction profile with an additional p-p stacking interaction with the catalytic His41. However, a more detailed analysis of the flexibility of the protein backbone was possible from the per-residue root-mean-square fluctuation (RMSF). Figure 4 (B) demonstrates the RMSF for the backbone atoms of each residue in M pro . Overall, the fluctuations showed a similar trend with highest fluctuations in the loop regions. The highest amplitude of motion occurred in the long flexible loop of 16 residues (residues 184-199), which retained its freedom of movement and triggered dynamic effects in the helical domain III (residues 200-306). On the other hand, a decrease in flexibility was observed in the substrate-binding region, which revealed the influence of inhibitor interactions on the residues located in the substrate-binding pocket. The molecular mechanics energies combined with the generalized Born and surface area continuum solvation (MM-GBSA) method is a well-established approach to estimate the free energy of the binding of small ligands to biological macromolecules. The approach is typically based on MD simulations of protein-ligand complexes. It has been applied to numerous systems with varying success (Chen et al., 2016; Durrani et al., 2019; Parveen et al., 2019; Sirin et al., 2014) and better covered in various reviews (Chen et al., 2016; Gohlke & Case, 2004; Gohlke et al., 2003) . The AMBER MM-GBSA energetic components for each complex are tabulated in Table 1 . Since the RMSD analysis of all complexes showed backbone stability especially in the last 10 ns, a total of 5000 snapshots were generated and the binding free energies were calculated from the time period of 40-50 ns. For all three complexes, the contributions of van der Waals (DE vdw ), electrostatic (DE ele ), and non-polar solvation energy (DG np ) were favorable for protein-ligand interactions. All inhibitors mainly contributed to the binding energy through van der Waals interactions (<À48 kcal/mol), followed by electrostatic interactions (<À10 kcal/mol). Cmp 78 displayed the most favorable total binding free energy (DG tol ¼ À60.05 kcal/mol) as compared to the other two, which was likely due to the additional p-p stacking interaction with the catalytic His41. Apart from the molecular mechanics contributions, the solvation energy also plays a crucial role in small ligand-protein systems and the accuracy of these calculations has been extensively studied (Genheden et al., 2010; Gohlke & Case, 2004; Gohlke et al., 2003) . For all these simulation systems, the non-polar solvation energy contributed least to the total binding free energy, which is evident from the surface area accessible to the solvent as the ligands are barely exposed, while the polar solvation energy was unfavorable in all three complexes. In order to identify possible repurposable SARS-CoV-2 3CL pro inhibitors from the FDA-approved protease inhibitors, a library of 116 FDA-approved protease inhibitors was virtually screened against SARS-CoV-2 main protease. Initially, the docking protocol was validated by non-covalent re-docking of the co-crystallized ligand, inhibitor N3, into the active site of SARS-CoV-2 3CL pro . As shown in Figure S8 (Supplementary material), the top-docked pose adapted a binding mode within the active site similar to that observed for the co-crystalized ligand with an RMSD value of 0.760 Å, indicating the robustness of our docking protocol. Next, molecular docking was carried out as described in the Methods, and the drugs were ranked based on their binding energies (kcal/mol) and potential interaction with active site residues. The screening of the database revealed two potential protease inhibitors namely, paritaprevir and simeprevir with the lowest binding energy scores of À8.8 and À8.78 kcal/mol, respectively (Table 2) . The predicted inhibitory constants (pKi) against SARS-CoV-2 3CL pro based on the AutoDock Vina docking scores for paritaprevir and simeprevir were 0.36 mM and 0.37 mM, respectively. Interestingly, paritaprevir and simeprevir are FDA-approved anti-hepatitis C virus (HCV) drugs that act by targeting the NS3/4A serine protease of HCV (McConachie et al., 2016) . To our knowledge, there are no clinical data available regarding the use of these two compounds to treat SARS-CoV-2, MERS-CoV or SARS-CoV. Both compounds adopted poses near the catalytic dyad, the sulfonamide group posing near Cys145 ( Figure 5 ). Paritaprevir was found to form hydrogen bonds with Gly143 and Cys145, while simeprevir established hydrogen bonds with Gly143 and Gln189. The stability of the docked complexes was investigated with the MD simulations using the same protocol as for the covalent hit compound complexes ( Figure 6 ). Overall, the complexes remained stable. The average RMSD values for the last 40 ns for paritaprevir and simeprevir were 3.2 and 3.5 Å, while the average RMSF values were 1.2 and 1.5 Å, respectively. Moreover, the total binding free energy was calculated from the last 20 ns period and revealed favorable DG tol for paritaprevir (À47.15 kcal/mol) and simeprevir (À51.84 kcal/mol). In the current study, pharmacoinformatics and MD approaches were utilized to identify potential covalent inhibitors of SARS-CoV-2 3CL pro for the treatment of COVID-19. Several covalent inhibitors, together with FDA-approved paritaprevir and simeprevir, were identified as potential inhibitors of SARS-CoV-2 3CL pro enzyme. The binding affinity, mechanism, and stability of binding of these compounds to SARS-CoV-2 3CL pro were investigated by molecular docking and MD simulations. The potential warheads identified in this study could serve as a guideline to design covalent inhibitors targeting the catalytic Cys145. Moreover, the FDA approved anti-HCV drugs, paritaprevir and simeprevir may also play a key role in expediting the drug discovery process and could be tested in clinical trials as a treatment for COVID-19. Medicinal plant phytochemicals and their inhibitory activities against pancreatic lipase: Molecular docking combined with molecular dynamics simulation approach Coronavirus disease 2019 assosiated pneumonia in China: Current status and future prospects. Preprints, 2020020358 Coronavirus main proteinase (3CLpro) structure: Basis for design of anti-SARS drugs Identification of novel inhibitors of the SARS coronavirus main protease 3CLpro The protein data bank Remdesivir for severe acute respiratory syndrome coronavirus 2 causing COVID-19: An evaluation of the evidence Molecular dynamics simulations of SARS-CoV-2 3CL pro . (A) RMSD and (B) RMSF of SARS-CoV-2 3CL pro with bound paritaprevir and simeprevir Reversible unfolding of the severe acute respiratory syndrome coronavirus main protease in guanidinium chloride Assessing the performance of the MM/PBSA and MM/GBSA methods. 6. Capability to predict protein-protein binding free energies and rerank binding poses generated by protein-protein docking Origin and evolution of pathogenic coronaviruses Small-molecule library screening by docking with PyRx Covalent inhibitors design and discovery Investigating the molecular mechanism of staphylococcal DNA gyrase inhibitors: A combined ligand-based and structure-based resources pipeline Mutagenesis of DsbAss is crucial for the signal recognition particle mechanism in Escherichia coli: Insights from molecular dynamics simulations Molecular dynamics simulations and drug discovery Towards more effective acetylcholinesterase inhibitors: A comprehensive modelling study based on human acetylcholinesterase protein-drug complex An integrated approach towards the development of novel antifungal agents containing thiadiazole: Synthesis and a combined similarity search, homology modelling, molecular dynamics and molecular docking study Identification of FDA approved drugs targeting COVID-19 virus by structure-based drug repositioning (Version 4) Chemical and computational methods for the characterization of covalent reactive groups for the prospective design of irreversible inhibitors An MM/3D-RISM approach for ligand binding affinities The cysteine proteome Converging free energy estimates: MM-PB (GB) SA studies on the protein-protein complex Ras-Raf Insights into protein-protein binding by binding free energy calculation and free energy decomposition for the Ras-Raf and Ras-RalGDS complexes In Silico discovery of novel inhibitors against main protease (Mpro) of SARS-CoV-2 using pharmacophore and molecular docking based virtual screening from ZINC database Merck molecular force field. I. Basis, form, scope, parameterization, and performance of MMFF94 Merck molecular force field. II. MMFF94 van der Waals and electrostatic parameters for intermolecular interactions Assessing the performance of the MM/PBSA and MM/GBSA methods. 1. The accuracy of binding free energy calculations based on molecular dynamics simulations Recent development and application of virtual screening in drug discovery: An overview Using AutoDock 4 with AutoDocktools: A tutorial. The Scripps Research Institute Inhibition of oncogenic kinases: An in vitro validated computational approach identified potential multi-target anticancer compounds Antigenic peptide prediction from E6 and E7 oncoproteins of HPV types 16 and 18 for therapeutic vaccine design using immunoinformatics and MD simulation analysis A hierarchical approach to all-atom protein loop prediction Structure-based drug design, virtual screening and high-throughput screening rapidly identify antiviral leads targeting COVID-19 Structure of Mpro from COVID-19 virus and discovery of its inhibitors Comparison of simple potential functions for simulating liquid water Structureguided design of potent and permeable inhibitors of MERS coronavirus 3CL protease that utilize a piperidine moiety as a novel design element The growing impact of click chemistry on drug discovery Identification and evaluation of potent Middle East respiratory syndrome coronavirus Discovery and development of telaprevir: An NS3-4A protease inhibitor for treating genotype 1 chronic hepatitis C virus Covalent inhibitors: An opportunity for rational target selectivity Crystal structures of the main peptidase from the SARS coronavirus inhibited by a substrate-like aza-peptide epoxide The VSGB 2.0 model: A next generation energy model for high resolution protein structure modeling Exploring the stability of ligand binding modes to proteins by molecular dynamics simulations Drug discovery considerations in the development of covalent inhibitors New direct-acting antivirals in hepatitis C therapy: A review of sofosbuvir, ledipasvir, daclatasvir, simeprevir, paritaprevir, ombitasvir and dasabuvir Virtual screening strategies in drug discovery Structural elucidation of SARS-CoV-2 vital proteins: Computational methods reveal potential drug candidates against main protease, Nsp12 polymerase and Nsp13 helicase Integrated computational approach for virtual hit identification against Ebola viral proteins VP35 and VP40 Towards peptide vaccines against Zika virus: Immunoinformatics combined with molecular dynamics simulations to predict antigenic epitopes of Zika viral proteins Perspectives towards antiviral drug discovery against Ebola virus Beyond cysteine: Recent developments in the area of targeted covalent inhibition MPD3: A useful medicinal plants database for drug designing Discovery of selective inhibitors for cyclic AMP response element-binding protein: A combined ligand and structurebased resources pipeline. Anti-Cancer Drugs Open Babel: An open chemical toolbox Deleterious variants in WNT10A, EDAR, and EDA causing isolated and syndromic tooth agenesis: A structural perspective from molecular dynamics simulations UCSF Chimera-A visualization system for exploratory research and analysis An overview of Severe Acute Respiratory Syndrome-Coronavirus (SARS-CoV) 3CL protease inhibitors: peptidomimetics and small molecule chemotherapy Recent discovery and development of inhibitors targeting coronaviruses The basics of thiols and cysteines in redox biology and chemistry PTRAJ and CPPTRAJ: Software for processing and analysis of molecular dynamics trajectory data OPLS3e: Extending force field coverage for drug-like small molecules The discovery and development of boceprevir Numerical integration of the cartesian equations of motion of a system with constraints: Molecular dynamics of n-alkanes Biflavonoids from Torreya nucifera displaying SARS-CoV 3CLpro inhibition Protein and ligand preparation: Parameters, protocols, and influence on virtual screening enrichments A conceptual framework for predicting the toxicity of reactive chemicals: Modeling soft electrophilicity Ligand docking and binding site analysis with PyMOL and Autodock/Vina Epik: A software program for pK a prediction and protonation state generation for drug-like molecules Middle East respiratory syndrome coronavirus infection mediated by the transmembrane serine protease TMPRSS2 Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega A computational approach to enzyme design: Predicting x-aminotransferase catalytic activity using docking and MM-GBSA scoring Molecular dynamics investigation on a series of HIV protease inhibitors: Assessing the performance of MM-PBSA and MM-GBSA approaches Homology models of coronavirus 2019-nCoV 3CLpro protease Epidemiology, genetic recombination, and pathogenesis of coronaviruses Structural basis of SARS-CoV-2 3CLpro and anti-COVID-19 drug discovery from medicinal plants In-silico identification and evaluation of plant flavonoids as dengue NS2B/NS3 protease inhibitors using molecular docking and simulation approach Discovery of novel dengue NS2B/NS3 protease inhibitors using pharmacophore modeling and molecular docking based virtual screening of the zinc database Computational screening of medicinal plant phytochemicals to discover potent pan-serotype inhibitors against dengue virus Designing of a next generation multiepitope based vaccine (MEV) against SARS-COV-2: Immunoinformatics and in silico approaches pH-dependent conformational flexibility of the SARS-CoV main proteinase (Mpro) dimer: Molecular dynamics simulations and multiple X-ray structure analyses Investigating interactions between HIV-1 gp41 and inhibitors by molecular dynamics simulation and MM-PBSA/GBSA calculations AI-aided design of novel targeted covalent inhibitors against SARS-CoV-2 Structure-based virtual screening approach for discovery of covalently bound ligands AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading Fast identification of possible drug treatment of coronavirus disease-19 (COVID-19) through computational drug repurposing study Development and testing of a general amber force field Coronavirus pathogenesis Analysis of therapeutic targets for SARS-CoV-2 and discovery of potential drugs by computational methods A new coronavirus associated with human respiratory disease in China New concepts and approaches for drug discovery based on traditional Chinese medicine Evolution of the novel coronavirus from the ongoing Wuhan outbreak and modeling of its spike protein for risk of human transmission Design of wide-spectrum inhibitors targeting coronavirus main proteases Exploring structural variation and gene family architecture with De Novo assemblies of 15 Medicago genomes A pneumonia outbreak associated with a new coronavirus of probable bat origin Docking covalent inhibitors: A parameter free approach to pose prediction and scoring A novel coronavirus from patients with pneumonia in China Systematic review of the registered clinical trials of coronavirus diseases 2019 (COVID-19). medRxiv Coronaviruses-drug discovery and therapeutic options Authors have no conflict of interest to declare. http://orcid.org/0000-0003-4832-4250 Muhammad Usman Mirza http://orcid.org/0000-0001-9120-2414