title: Repurposing of FDA-Approved Toremifene to Treat COVID-19 by Blocking the Spike Glycoprotein and NSP14 of SARS-CoV-2
authors: Martin, William R.; Cheng, Feixiong
date: 2020-09-10
journal: J Proteome Res
DOI: 10.1021/acs.jproteome.0c00397
The global pandemic of Coronavirus Disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has led to more than 675,000 deaths worldwide and over 150,000 in the United States alone. However, there are currently no approved effective pharmacotherapies for COVID-19. Here, we combine homology modeling, molecular docking, molecular dynamics simulation, and binding affinity calculations to determine potential targets for toremifene, a selective estrogen receptor modulator which we have previously identified as a SARS-CoV-2 inhibitor. Our results indicate that toremifene may inhibit the spike glycoprotein, which aids fusion of the viral membrane with the cell membrane, via a perturbation to the fusion core. An interaction between the dimethylamine end of toremifene and residues Q954 and N955 in heptad repeat 1 (HR1) perturbs the structure, causing a shift from what is normally a long, helical region to short helices connected by unstructured regions. Additionally, we found a strong interaction between toremifene and the methyltransferase nonstructural protein (NSP) 14, which could inhibit viral replication through binding at the NSP14 active site. These results suggest potential structural mechanisms by which toremifene blocks the spike protein and NSP14 of SARS-CoV-2, offering a drug candidate for COVID-19.
As of August 4, 2020, there are over 18 million documented cases of COVID-19 (over 675,000 resulting in death), with nearly one-third of all cases occurring in the United States (over 150,000 deaths). 1
As of this writing, there are no FDA-approved treatments or vaccines for COVID-19, both of which are sorely needed. A search on clinicaltrials.gov on June 1 for COVID-19 as the disease and the additional term "drug" yields 806 results, providing evidence for the need for an intervention. Drug repurposing accelerates the drug discovery pipeline: drugs which have already been FDA approved to treat one disease are repositioned as therapeutics for diseases for which they have not yet been used. 2, 3 In our initial network-based drug repurposing study, 4 we identified toremifene, a selective estrogen receptor modulator (SERM), as a strong candidate for the potential treatment of COVID-19. A drug repurposing study for SARS-CoV-1 5 indicated a low 50% effective concentration (EC50) for toremifene, and noted that estrogen signaling may not be involved in the inhibitory pathway, similar to the inhibition of Ebola. 6 Indeed, a crystal structure of the Ebola virus glycoprotein with bound toremifene indicates that the interaction lies between the attachment (GP1) and fusion (GP2) protein subunits. 7 In a study using human organs-on-chips, 8 toremifene was found to significantly inhibit entry of a pseudotyped SARS-CoV-2 virus. However, the mechanism of action for toremifene in the inhibition of SARS-CoV-2 is not yet known. The actual target for toremifene has not yet been elucidated in coronaviruses as it has in Ebola, against which toremifene has a 50% inhibitory concentration (IC50) of roughly 1 μM. 6 Dyall et al. 5 found the EC50 of toremifene for MERS-CoV and SARS-CoV-1 to be 12.9 μM and 11.97 μM, respectively. While these two values are similar, the story is different for tamoxifen, which differs from toremifene only by having a hydrogen in place of the chlorine on toremifene.
While the EC50 for tamoxifen is slightly lower in MERS-CoV (10.1 μM), its EC50 for SARS-CoV-1 (92.9 μM) is far higher than that of toremifene, which could potentially indicate that the inhibition is related to a viral, and not human, protein. Importantly, a recent study indicated an IC50 of 3.58 μM for toremifene with SARS-CoV-2. 9
There have been numerous recent studies targeting single proteins in SARS-CoV-2 using virtual screening techniques, such as the 3-chymotrypsin-like protease 10−12 and the papain-like protease. 13 Further studies have used large databases, such as ZINC, 14 to test large compound libraries across multiple viral proteins. In an effort to determine potential interactors with toremifene specifically, 15 we used inverse virtual screening to test 13 of the 29 viral proteins encoded by the SARS-CoV-2 genome. Proteins which did not have a crystal structure at the time of this study were modeled using homology modeling. We believe this to be a comprehensive study combining virtual screening with molecular dynamics and molecular mechanics/Poisson−Boltzmann surface area (MM/PBSA) calculations to determine potential protein−ligand interactions in the SARS-CoV-2 proteome. Here, we have discovered two potential targets for toremifene among the SARS-CoV-2 proteins screened.
Proteins for which crystal structures were not available were constructed using homology modeling. Each sequence was accessed from NCBI by its accession number. The protein sequence was submitted to a BLAST 16 search within UCSF Chimera, 17 and the best matching structure was chosen as the template. Each homology model was constructed from a single template using MODELLER 9.21 18 within UCSF Chimera. The best model as determined by Z-score was used for docking. Proteins without a high-homology template were not modeled. Generally, a sequence identity of greater than 40% will yield an acceptable homology model.
19 Prior to docking, all homology models were subjected to a short minimization and equilibration using molecular dynamics simulations. In short, each system was constructed using standard tools in GROMACS 2018.2. 20 Systems were submerged in a water box with edges no less than 10 Å from any part of the protein, then neutralized and brought to an ionic strength of 0.15 M using sodium and chloride ions. Parameterization for the protein, ions, and water was done using the CHARMM36 force field. 21 Each protein was minimized using a steepest descent algorithm for 5000 steps, followed by a short 200 ps equilibration.
The toremifene structure was downloaded from the ZINC database, 14 while all crystal structures used for docking were obtained from the RCSB Protein Data Bank. 22 All docking was done using AutoDock Vina 23 within UCSF Chimera. The search grid for each protein was selected to encompass the entire protein. Each search generated 10 binding modes with the exhaustiveness set to the maximum value. The highest scoring binding pose (kcal/mol) was selected for further analysis where appropriate (Figure 1). Proteins without a clear binding region (for example, accessory protein 7a) were not included.
Each docked protein−ligand system was then built for simulation using the solution builder in CHARMM-GUI. 24−26 Following a processing step involving the addition of hydrogens not added in docking and parametrization of the ligand, a water box with edges at least 10 Å from any part of the protein was added. The system was neutralized and brought to an ionic strength of 0.15 M using sodium and chloride ions. The CHARMM General Force Field (CGenFF) 27 was used to parametrize toremifene, while the CHARMM36m force field was used to parametrize the protein, ions, and TIP3P water molecules.
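The solvation step described above fixes the number of ions once the box dimensions and the net charge of the solute are known. The following is a minimal sketch of that arithmetic, not the CHARMM-GUI or GROMACS implementation: the function names, the example box size, and the example solute charge are hypothetical, and the full box volume is used as an approximation of the solvent volume.

```python
AVOGADRO = 6.02214076e23  # particles per mole

def box_edges_nm(solute_extent_nm, padding_nm=1.0):
    """Box edges that leave `padding_nm` (10 Å = 1.0 nm) on every side of the solute."""
    return tuple(extent + 2.0 * padding_nm for extent in solute_extent_nm)

def salt_pairs(edges_nm, molarity=0.15):
    """Na+/Cl- pairs needed for `molarity` in a rectangular box (approximate)."""
    x, y, z = edges_nm
    volume_litres = x * y * z * 1e-24  # 1 nm^3 = 1e-24 L
    return round(molarity * volume_litres * AVOGADRO)

def ions_to_add(edges_nm, solute_charge, molarity=0.15):
    """Return (n_sodium, n_chloride): neutralizing counter-ions plus added salt."""
    pairs = salt_pairs(edges_nm, molarity)
    n_sodium = pairs + max(0, -solute_charge)   # extra Na+ for a negative solute
    n_chloride = pairs + max(0, solute_charge)  # extra Cl- for a positive solute
    return n_sodium, n_chloride

if __name__ == "__main__":
    # Hypothetical 8 x 8 x 8 nm solute extent with a net charge of -4 e.
    edges = box_edges_nm((8.0, 8.0, 8.0))
    print(edges, ions_to_add(edges, solute_charge=-4))
```

For a 10 nm cubic box, 0.15 M corresponds to roughly 90 ion pairs, so even a large protein system needs only a few hundred ions.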
All systems were simulated using GROMACS 2020.1 on the AiMOS supercomputer at the Rensselaer Polytechnic Institute Center for Computational Innovations in a three-step process. Initial minimization was run until changes in the potential energy of the system reached machine precision. Following minimization, an NVT equilibration step was completed with a 2 fs time step for 500,000 steps (1 ns) using 400 kJ mol⁻¹ nm⁻² and 40 kJ mol⁻¹ nm⁻² positional restraints on the backbone and side chains, respectively. A 500 ns production step was completed using the NPT ensemble with no position restraints and a 2 fs time step. Bonds involving hydrogen atoms were constrained using the LINCS 28 algorithm. Temperature was held at 300 K using a Nosé-Hoover thermostat 29 with a 1 ps coupling constant. For the production simulation, pressure was coupled isotropically using a Parrinello−Rahman barostat 30 with a 5.0 ps coupling constant and a compressibility of 4.5 × 10⁻⁵ bar⁻¹ to maintain a pressure of 1 bar. The pair list was constructed using the Verlet scheme 31 with a cutoff distance of 1.2 nm. Particle mesh Ewald electrostatics 32 were used to describe Coulombic interactions with a 1.2 nm cutoff, while van der Waals forces were smoothly switched off between 1.0 and 1.2 nm using a force-switch modifier to the cutoff scheme. Linear center of mass translation was removed every 100 steps for the entire system.
SARS-CoV-2 has a roughly 30 kb genome which encodes 29 proteins. 34 These 29 proteins include 16 nonstructural proteins (NSP), 4 structural proteins, and 9 accessory proteins. The nonstructural proteins include proteinases (NSP3, NSP5), RNA polymerases (NSP12), helicases (NSP13), ribonucleases (NSP14, NSP15), and methyltransferases (NSP14, NSP16), while structural proteins are involved in viral assembly (envelope, membrane) and binding with the host protein (spike glycoprotein).
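Returning briefly to the simulation protocol given in the methods, the production settings can be collected into a GROMACS parameter (.mdp) file. The sketch below is an illustration under stated assumptions rather than the authors' actual input: the option names are standard GROMACS .mdp keys, the values are those quoted in the text, and the single `System` temperature-coupling group, the dictionary name, and the helper functions are hypothetical choices made for this example.

```python
def steps_for(duration_ns, timestep_fs=2.0):
    """Number of integration steps needed to cover `duration_ns`."""
    return round(duration_ns * 1_000_000 / timestep_fs)  # 1 ns = 1e6 fs

def duration_ns(nsteps, timestep_fs=2.0):
    """Simulated time in ns covered by `nsteps` steps."""
    return nsteps * timestep_fs / 1_000_000

# Production (NPT) settings as described in the text.
PRODUCTION_MDP = {
    "integrator": "md",
    "dt": 0.002,                       # ps, i.e. a 2 fs time step
    "nsteps": steps_for(500),          # 500 ns production run
    "constraints": "h-bonds",          # bonds to hydrogen constrained
    "constraint-algorithm": "lincs",
    "cutoff-scheme": "Verlet",
    "rlist": 1.2,                      # nm
    "coulombtype": "PME",
    "rcoulomb": 1.2,                   # nm
    "vdwtype": "cut-off",
    "vdw-modifier": "force-switch",
    "rvdw-switch": 1.0,                # nm
    "rvdw": 1.2,                       # nm
    "tcoupl": "nose-hoover",
    "tc-grps": "System",               # assumption: one coupling group
    "tau_t": 1.0,                      # ps
    "ref_t": 300,                      # K
    "pcoupl": "Parrinello-Rahman",
    "pcoupltype": "isotropic",
    "tau_p": 5.0,                      # ps
    "compressibility": 4.5e-5,         # bar^-1
    "ref_p": 1.0,                      # bar
    "comm-mode": "linear",
    "nstcomm": 100,
}

def render_mdp(params):
    """Render a parameter dictionary as .mdp text, one 'key = value' per line."""
    width = max(len(key) for key in params)
    return "".join(f"{key:<{width}} = {value}\n" for key, value in params.items())

if __name__ == "__main__":
    print(render_mdp(PRODUCTION_MDP))
```

With a 2 fs time step, the 500,000-step NVT equilibration covers 1 ns and the 500 ns production run requires 250 million steps.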
Molecular docking over the entire protein was carried out on 13 of the 29 viral proteins in SARS-CoV-2. These 13 were selected based on a combination of criteria: (1) whether a crystal structure for the protein exists; (2) if a crystal structure does not exist, whether a template protein (generally from SARS-CoV-1) is available for homology modeling; and (3) whether, based on either the crystal structure or the homology model, there is a potential binding pocket. Proteins which did not meet these criteria (for example, the membrane protein has only partial homologous coverage at ∼20% homology; protein 3a does not appear to have a potential binding pocket) were not chosen for the docking study.
The best scoring poses are tabulated in Table 1, with a more negative score indicating a better predicted binding affinity. Unsurprisingly, the affinity was not high for a few of the smaller systems (NSP1 and the nucleocapsid, for example), which were included as a form of negative control; good binding was not expected with these systems. Interestingly, the spike glycoprotein and NSP14 had the highest binding affinities based on molecular docking.
To determine which systems to carry forward to simulation, where we could better assess whether a particular protein−ligand binding pose maintains its integrity, we visually inspected the binding of each system. While a hard cutoff was not selected, certain systems were rejected for further analysis. Systems which did not have strong binding within a clear pocket were not simulated, including NSP1, NSP7, NSP9, the helicase, and the nucleocapsid. As an example, toremifene docked to NSP9 does not occupy a binding pocket but sits on the surface of the protein. We have no expectation that such an interaction would be maintained throughout a simulation, and therefore did not include it in the next step.
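Docking scores of the kind tabulated in Table 1 are reported by AutoDock Vina in kcal/mol. One rough way to read such a number, assuming the score is treated as a binding free energy (an approximation, since docking scores are empirical estimates), is through the relation ΔG = RT ln Kd. The sketch below applies that relation; the function name and the scores used are hypothetical illustrations, not values from Table 1.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal mol^-1 K^-1

def estimated_kd(score_kcal_per_mol, temperature_k=298.15):
    """Dissociation constant (mol/L) implied by dG = RT ln(Kd).

    Treating a docking score as dG is an approximation; use the result
    only for order-of-magnitude comparison between poses.
    """
    return math.exp(score_kcal_per_mol / (GAS_CONSTANT * temperature_k))

if __name__ == "__main__":
    # Hypothetical scores: a more negative score implies a smaller Kd.
    for score in (-6.0, -8.0, -10.0):
        print(f"{score:6.1f} kcal/mol -> Kd ~ {estimated_kd(score):.2e} M")
```

Each additional 1.36 kcal/mol of favorable score corresponds to roughly a tenfold lower estimated Kd at room temperature, which is why differences of a few kcal/mol between targets matter when ranking poses.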
Each of the above systems with a strong predicted binding was simulated according to the protocol described in the methods. Here, we monitored the protein−ligand interaction over the entire 500 ns trajectory. Unsurprisingly, given the low binding affinities, the interaction between the ligand and the protein was not maintained throughout the trajectory in most systems. Table 2 indicates the length of simulation before each protein−ligand system lost its interaction. The interactions of toremifene with NSP14 and with the spike glycoprotein were both maintained, within the original binding pocket, throughout the entire 500 ns trajectory. These two systems were analyzed further to determine the nature of the protein−ligand interactions.
To ensure that the results were not simply a consequence of poor initial docking poses, all systems for which toremifene did not maintain contact with the initial docked region were redocked using SwissDock, 42 selecting the "accurate" setting under "extra parameters", with flexible residues within 5 Å of the docked ligand. The implementation of AutoDock Vina 23 that we used sacrifices flexible residues in the binding pocket for significantly improved speed, whereas SwissDock applies a different algorithm and includes flexible residues (at significantly higher computational cost). A specific region of interest was not defined. These newly docked systems were simulated in the same fashion as those docked with AutoDock Vina and yielded similar results; all simulations resulted in a residence time of less than 200 ns.
The initial docking position (Figure 2a) lies at the interface between two separate domains in the spike glycoprotein, with one domain having its receptor binding domain (RBD) in the "up" position, while the other has its RBD in the "down" position.
In the B domain of PDB ID 6VSB, toremifene interacts with the helical region between the loop separating the S1 and S2 subunits and the fusion peptide, which was shown in SARS-CoV-1 to mediate membrane fusion in a calcium-dependent manner, 43 while the interaction in the A domain involves the N-terminal region of the RBD, as well as heptad repeat 1 (HR1), a key component of the fusion core (Figures 2b and 4a). While the nonhelical linker between T941 and L945 did extend to K947 in the B and C domains, the remainder of this helical region remained unperturbed when compared to the crystal structure, with a long helix between L948 and S967. However, the interaction with toremifene resulted in a helical region between S943 and Q954, a short helical region from A958 through T961, and an otherwise unstructured remainder. An interaction between the dimethylamine region of toremifene and Q954 and N955 appears to be key in perturbing the secondary structure of HR1. In an effort to determine the strength of the interaction, we carried out an MM/PBSA binding affinity calculation between the entire protein and toremifene. Our calculation resulted in a final binding energy of −91.036 (±0.933) kJ/mol. The main contributors (Figure 2c) to this binding energy were V772 and L861 in the B domain due to large nonpolar interactions (−9.7 and −4.0 kJ/mol, respectively), and Y313 in the A domain due to a chlorine−π interaction (−3.7 kJ/mol). The chlorine−π interaction could indicate a potential mechanism by which toremifene has a stronger inhibitory action than tamoxifen, as seen in SARS-CoV-1.
NSP14 has both exoribonuclease and methyltransferase activity; here, we have found a strong interaction with the N7-methyltransferase domain (Figure 3a). Throughout the molecular dynamics simulation, very little movement of the ligand was observed.
The docked position appears as though it would potentially be inhibitory to interaction with the functional ligand S-adenosyl methionine, while clearly being inhibitory to interaction with its substrate, Gppp-RNA (Figures 3c and 4b). As with the spike protein, we assessed the binding affinity using MM/PBSA, finding a significant π−π interaction with F426 (−8.0 kJ/mol), a chlorine−π interaction with F506 (5.0 kJ/mol), a strong hydrophobic interaction with C309 (−4.2 kJ/mol), and a total binding energy of −119.805 (±1.013) kJ/mol. Many of the residues identified here as interacting with toremifene (Figure 3b) were identified as interacting with Gppp-RNA in the crystal structure used to generate our homology model.
We have demonstrated two plausible targets for toremifene in SARS-CoV-2. Previous work has indicated that, as noted earlier, toremifene is likely to be inhibitory to viral entry. 6,7 A mechanism proposed for Ebola posits that fusion between the viral and endosomal membranes is disrupted. 7 The interaction with the spike glycoprotein proposed here could prevent such fusion through its disruption of the HR1 helix. The interaction with NSP14 described here would indicate an inhibition of viral replication by interfering with substrate binding.
Toremifene, a first generation nonsteroidal SERM, shows striking activity in blocking multiple viral infections, including Ebola 6,7 (IC50 ≈ 1 μM), MERS-CoV 5 (EC50 = 12.9 μM), SARS-CoV-1 5 (EC50 = 11.97 μM), and SARS-CoV-2 9 (IC50 = 3.58 μM), in established, virus-infected human cell lines. The inhibition of Ebola by SERMs has been found to act not via the estrogen receptor pathway but instead by disrupting endolysosomal calcium. 44 Toremifene has been approved for the treatment of breast cancer for over 25 years, 45 and has been investigated as a treatment for prostate cancer. 46 In this study, we have determined two potential targets for toremifene, the spike glycoprotein and NSP14.
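For these two targets, the MM/PBSA binding energies reported above are given in kJ/mol, whereas docking scores are given in kcal/mol. The short sketch below converts the two reported totals so that the units match; the dictionary and function names are hypothetical, and the conversion is offered only as a unit aid, since MM/PBSA energies and docking scores come from different models and are not directly comparable in magnitude.

```python
KJ_PER_KCAL = 4.184  # thermochemical calorie

# Total MM/PBSA binding energies (mean, uncertainty) in kJ/mol, as quoted in the text.
MMPBSA_KJ_PER_MOL = {
    "spike glycoprotein": (-91.036, 0.933),
    "NSP14": (-119.805, 1.013),
}

def to_kcal(value_kj):
    """Convert an energy from kJ/mol to kcal/mol."""
    return value_kj / KJ_PER_KCAL

def summarize(energies):
    """Return (target, mean, uncertainty) rows in kcal/mol, most favorable first."""
    ordered = sorted(energies.items(), key=lambda item: item[1][0])
    return [
        (target, round(to_kcal(mean)), round(to_kcal(err), 2))
        if False else (target, round(to_kcal(mean), 2), round(to_kcal(err), 2))
        for target, (mean, err) in ordered
    ]

if __name__ == "__main__":
    for target, mean, err in summarize(MMPBSA_KJ_PER_MOL):
        print(f"{target:20s} {mean:8.2f} ± {err:.2f} kcal/mol")
```

The ordering is unchanged by the conversion: NSP14 remains the more favorable of the two computed interactions.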
Toremifene has been used in postmenopausal women with breast cancer, premenopausal women, and men with prostate cancer in clinical trials involving thousands of participants since its introduction in 1997. 47 Liver toxicity has occurred only in long-term studies and has not been shown in short-term studies of <8 weeks. 48 Therefore, the safety risk is expected to be low for the use of toremifene 60 mg daily for short-term treatment of COVID-19, such as 2 weeks. Based on human studies, the mean plasma concentration of toremifene during administration of 60 mg/day was 0.88 mg/L (2.17 μM) in postmenopausal breast cancer patients. 49 The peak plasma concentration of toremifene (60 mg/day) was over 10 μM at 4 h after administration in patients, 50 which is ∼3-fold higher than the concentration required for the antiviral effect on SARS-CoV-2 (IC50 = 3.58 μM). An animal model study showed that a peak plasma toremifene concentration of 2.98 μM was needed to protect 50% of mice from death caused by Ebola virus infection, a more lethal virus than SARS-CoV-2. The combination of consistent in vitro data supporting the antiviral effects of toremifene against coronaviruses, along with reasonable tolerability, provides the basis to pursue toremifene as a viable candidate therapeutic against SARS-CoV-2.
Caution, however, must be exercised when interpreting these results. There are several limitations to molecular docking approaches. For example, molecular docking scores could yield false positives when proteins undergo significant conformational changes. Although we employed molecular dynamics simulation to verify our findings from molecular docking, further experimental validation is warranted in the near future. It should be noted that, to date, there are no crystal structures published for the spike glycoprotein with an inhibitory ligand.
Similarly, no inhibitors have been crystallized with NSP14, nor has a crystal structure of SARS-CoV-2 NSP14 been solved, though the homology with SARS-CoV-1 is indeed high. However, work on the flavivirus methyltransferase 38 has indicated there is a potential for inhibition. While the proposed mechanisms for toremifene in SARS-CoV-1 are posited to not involve human proteins, further study would be required to confirm the mechanism. While in vitro studies on toremifene do exist for SARS-CoV-2, there are no studies to date that allow a comparison with other SERMs, as has been done with SARS-CoV-1 and MERS-CoV. Future work will be needed to confirm these results; optimally, a cocrystal structure of NSP14 and/or the spike glycoprotein from SARS-CoV-2 with toremifene would be solved. Alternatively, other functional validation analyses (in vitro assays, for example) could be carried out to determine the correct protein target. The final poses could also be used to inform further work to refine a potential inhibitor using medicinal chemistry techniques.
The present study has demonstrated two potential targets for toremifene in SARS-CoV-2. We have demonstrated a potential mechanism for inhibition via the spike glycoprotein, through which an interaction with Q954 and N955 in heptad repeat 1 appears to disturb the secondary structure, causing what is normally a long α-helix to lose secondary structure. Additionally, the interaction we found with NSP14 appears as though it could be inhibitory in two ways: first, there would be steric hindrance between toremifene and any functional ligands (such as S-adenosyl methionine); second, toremifene would interfere with substrate interaction in the catalytic pocket.
Further experimental and functional studies, such as solving a cocrystal structure of the spike glycoprotein or NSP14 with toremifene, would be needed to validate these findings.
■ ASSOCIATED CONTENT
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jproteome.0c00397. Initial docked conformations and conformations at the time listed in Table 2 for NSP3 (PL-PRO), NSP4, NSP5, NSP12, NSP15, and NSP16 are provided in Figures S1-S6. An RMSD plot for all proteins simulated is provided in Figure S7 (PDF).
■ REFERENCES
An interactive web-based dashboard to track COVID-19 in real time
Drug repurposing: New treatments for Zika virus infection?
Systems biology-based investigation of cellular antiviral drug targets identified by gene-trap insertional mutagenesis
Network-based drug repurposing for novel coronavirus 2019-nCoV/SARS-CoV-2
Repurposing of clinically developed drugs for treatment of Middle East respiratory syndrome coronavirus infection
FDA-approved selective estrogen receptor modulators inhibit Ebola virus infection
Toremifene interacts with and destabilizes the Ebola virus glycoprotein
Human organs-on-chips as tools for repurposing approved drugs as potential influenza and COVID19 therapeutics in viral pandemics
Identification of antiviral drug candidates against SARS-CoV-2 from FDA-approved drugs
Structural basis of SARS-CoV-2 3CLpro and anti-COVID-19 drug discovery from medicinal plants
Prediction of the SARS-CoV-2 (2019-nCoV) 3C-like protease (3CLpro) structure: virtual screening reveals velpatasvir, ledipasvir, and other drug repurposing candidates
Structure-based design of antiviral drug candidates targeting the SARS-CoV-2 main protease
Potential inhibitors against papain-like protease of novel coronavirus (SARS-CoV-2) from FDA approved drugs
ZINC: A free tool to discover chemistry for biology
How to discover antiviral drugs quickly
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
Chimera: A visualization system for exploratory research and analysis
Comparative protein structure modeling using modeller
All are not equal: A benchmark of different homology modeling programs
High performance molecular simulations through multi-level parallelism from laptops to supercomputers
Optimization of the Additive CHARMM All-Atom Protein Force Field Targeting Improved Sampling of the Backbone ϕ, ψ and Side-Chain χ1 and χ2 Dihedral Angles
The Protein Data Bank
AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading
CHARMM-GUI ligand reader and modeler for CHARMM force field generation of small molecules
Simulations Using the CHARMM36 Additive Force Field
A web-based graphical user interface for CHARMM
CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields
LINCS: A linear constraint solver for molecular simulations
A configurational temperature Nose-Hoover thermostat
Polymorphic transitions in single crystals: A new molecular dynamics method
Particle mesh Ewald: An N · log(N) method for Ewald sums in large systems
Tool for High-Throughput MM-PBSA Calculations
Novel β-barrel fold in the nuclear magnetic resonance structure of the replicase nonstructural protein 1 from the severe acute respiratory syndrome coronavirus
Crystal structure of the C-terminal cytoplasmic domain of non-structural protein 4 from mouse hepatitis virus A59
Structure of the RNA-dependent RNA polymerase from COVID-19 virus
Delicate structural coordination of the Severe Acute Respiratory Syndrome coronavirus Nsp13 upon ATP hydrolysis
Structural basis and functional analysis of the SARS coronavirus nsp14-nsp10 complex
Crystal structure of Nsp15 endoribonuclease NendoU from SARS-CoV-2
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
SwissDock, a protein-small molecule docking web service based on EADock DSS
The SARS-CoV fusion peptide forms an extended bipartite fusion platform that perturbs membrane order in a calcium-dependent manner
Selective inhibition of Ebola entry with selective estrogen receptor modulators by disrupting the endolysosomal calcium
Toremifene in the treatment of breast cancer
Toremifene, a selective estrogen receptor modulator, significantly improved biochemical recurrence in bone metastatic prostate cancer: a randomized controlled phase IIa trial
Association between serum insulin-like growth factor-I levels and thyroid disorders in a population-based study
Pharmacokinetics of toremifene and its metabolites in patients with advanced breast cancer
A review of its pharmacological properties and clinical efficacy in the management of advanced breast cancer

The authors declare no competing financial interest.
We acknowledge support from the Ohio Supercomputer Center, the Oak Ridge Leadership Computing Facility, the IBM Research AI Hardware Center, and the Center for Computational Innovations at Rensselaer Polytechnic Institute for computational resources.