key: cord-274252-h4occy7h authors: de Lima Menezes, Gabriela; da Silva, Roosevelt Alves title: Identification of potential drugs against SARS-CoV-2 non-structural protein 1 (nsp1) date: 2020-07-13 journal: Journal of biomolecular structure & dynamics DOI: 10.1080/07391102.2020.1792992 sha: doc_id: 274252 cord_uid: h4occy7h Non-structural protein 1 (nsp1) is found in all Betacoronavirus genus, an important viral group that causes severe respiratory human diseases. This protein has significant role in pathogenesis and it is considered a probably major virulence factor. As it is absent in humans, it becomes an interesting target of study, especially when it comes to the rational search for drugs, since it increases the specificity of the target and reduces possible adverse effects that may be caused to the patient. Using approaches in silico we seek to study the behavior of nsp1 in solution to obtain its most stable conformation and find possible drugs with affinity to all of them. For this purpose, complete model of nsp1 of SARS-CoV-2 were predicted and its stability analyzed by molecular dynamics simulations in five different replicas. After main pocket validation using two control drugs and the main conformations of nsp1, molecular docking based on virtual screening were performed to identify novel potential inhibitors from DrugBank database. It has been found 16 molecules in common to all five nsp1 replica conformations. Three of them was ranked as the best compounds among them and showed better energy score than control molecules that have in vitro activity against nsp1 from SARS-CoV-2. The results pointed out here suggest new potential drugs for therapy to aid the rational drug search against COVID-19. Communicated by Ramaswamy H. Sarma The coronavirus (CoV), a diversified group, are virus from Coronaviridae family (Saif et al., 2019) . CoVs are enveloped, spherical or pleomorphic shape, 100 nm dimeter and it has about 30 kb of single-strand plus-sense RNA genome-the longest among RNA viruses (Al Hajjar et al., 2013; Rota et al., 2003) . The CoVs are subdivided in four genres: Alphacoronavirus, Betacoronavirus, Gammacoronavirus and Deltacoronavirus and only Alpha-and Betacoronavirus are able to express nonstructural protein 1 (nsp1) (Chan et al., 2012; Lau et al., 2015) . The nsp1 size is different depending on the genus. The nsp1 expressed by Alphacoronavirus has $9 kDa and Betacoronavirus encodes $20 kDa nsp1 protein (Shen et al., 2019) . The nsp1, present in cytoplasm of infected cells, has been described for having unique and conserved biological functions such as host mRNA degradation, suppression of interferon (IFN) expression and host antiviral signaling pathways. For these reasons, nsp1 is considered one of possible major virulence factor (Kamitani et al., 2009; Narayanan et al., 2008; Prentice et al., 2004; Shen et al., 2019; Wathelet et al., 2007) . Considering that nsp1 degrades host mRNA, the analysis of nsp1 structure of SARS-CoV demonstrates that the positively charged area of protein surface involving residues K48, R125 and K126 are probably related to interaction with mRNA (Almeida et al., 2007) . Moreover, the K164A and H165A mutations in SARS-CoV nsp1 caused loss of RNA cleavage and translation inhibition functions and, for these reasons, it is suggested that nsp1 access host protein and factors through its C-terminal region (Nakagawa et al., 2018; Narayanan et al., 2008) . Over the past 20 years, SARS and MERS-CoV have been responsible for two major pandemics. Recently, at the end of 2019, another pneumonia outbreak was reported in Wuhan province, China and on 7 January 2020 it was confirmed that it was caused by a novel CoV, SARS-CoV-2 (Lu et al., 2020) . Several therapeutic options have been reported to have in vitro activity against CoVs, however no drug or vaccine against human CoV has been approved for use, except Remdesivir, which has been approved by Food and Drug Administration (FDA) for emergency use against COVID-19 in United States of America (Li & Clercq, 2020) . Thus, in the last months many in silico studies has been reported potential molecules for COVID-19 therapy. Most of these studies target SARS-CoV-2 main protease since it cleaves the viral polyprotein to produce functional proteins and its inhibition could lead to virus elimination (Choudhury, 2020; Enmozhi et al., 2020; Islam et al., 2020; Muralidharan et al., 2020; Pant et al., 2020) . For those studies using nsp1 as target, alisporivir and cyclosporine have been related for having inhibition activity against CoVs (Carbajo-Lozoya et al., 2014; Pfefferle et al., 2011) . However, many treatments with potential activity reported for SARS-and MERS-CoV have one or more limitation that prevent trial from advancing beyond the in vitro stage, including EC 50 /C max ratio and immunosuppression effects seen in cyclosporine (Zumla et al., 2016) Thus, with the emergence of new CoVs that cause serious diseases in humans, the need to study and develop drugs that are effective for both circulation and already known CoVs, as well as new CoVs that may arise, is urgent. For this purpose, using computational tools we presented three potential drugs for use in therapy against COVID-19. Nsp1 SARS-CoV-2 structure modeling Nsp1 SARS-CoV-2 protein was first modeled using Rosetta (Kim et al., 2004) modeling server. The genome strain used in crystal resolution was deposited on GenBank (NCBI Reference Sequence: YP_009725297.1), and from it nsp1 amino acid sequence was obtained for submission to Robetta server. The model validation was done through MolProbity server (Chen et al., 2010) . Specific protonation of histidine residues at 7.4 pH was predicted by H þþ server (Anandakrishnan et al., 2012) . At the end of this step, a SARS-CoV-2 nsp1 protein structure was generated. Molecular dynamics (MD) simulations have been widely used to understand the behavior of proteins in solution, as well as extract relevant information associated with their functions (Childers & Daggett, 2017) . In our recent work it has been possible to understand important mechanisms involved in the stability of non-structural proteins of others viral species (Gonçalves et al., 2019; Menezes et al., 2019) . Thus, in order to study dynamics of nsp1 from SARS-CoV-2, its structure was taken to MD simulation using GROMACS 5.1.2 software (Abraham et al., 2015) in AMBER ff99SB-ILDN force field. System was neutralized with sodium ions (Na þ ) and a rectangular box was mounted 10 Å away from any protein atom and filled with water molecules TIP3P type (Jorgensen et al., 1983) . Initial system energy minimizations with complete protein restriction and without any restriction until the maximum tolerance of 250 kJ/mol was not exceeded, or until reaching the limit of 5000 steps were performed. When protein had its position restrained, a force constant of 1000 kJ/mol nm 2 was applied for positional restraints on all heavy atoms. Then the system was subjected to a 100 ps simulation of NVT (canonical ensemble) and NPT (isothermal-isobaric ensemble) to balance the thermodynamic variables, with the protein restricted in its positions. In the phase of pressure variation (NVT), the temperature was adjusted by the thermostat at 310 K and the velocities were calculated from Maxwell equations. In the simulation where the volume is allowed to vary (NPT), the pressure was kept constant by the Parrinello-Rahman barostat (Hutter, 2012; Iannuzzi et al., 2003) . After these initial dynamic steps, protein model was subjected to a 150 ns simulation, at 310 K temperature, 1 bar pressure and 2 fs time interval, without conformation restriction. System consisted of 2768 protein atoms, 50,070 water molecules and 8 Na þ atoms. Furthermore, this MD protocol was performed in five replicas in order to ensure the fluctuation trajectory profiles. Root Mean Square Deviation (RMSD) was done to perform trajectory analysis based on initial protein structure performed by gromos algorithm as described by Daura et al. (1999) . To determine conformations that were most present along trajectory, it was used g_cluster program of GROMACS package. To distinguish conformational sets (clusters) based on RMSD profile a 0.3 nm cut-off was defined. The protein profile along the MD simulation was observed based on cluster analysis and RMSD. In addition, the Root Mean Square Fluctuation (RMSF), which determines residues average fluctuation along trajectory, was also calculated from GROMACS Tools package (Abraham et al., 2015) . The UCSF Chimera and Visual Molecular Dynamics software were used to visualize protein behavior along trajectory (Humphrey et al., 1996; Pettersen et al., 2004) . Due to lack of active site information a blind docking using two cyclophilin inhibitors described as potential CoV suppressors (Carbajo-Lozoya et al., 2014; Kamitani et al., 2009; Pfefferle et al., 2011) were performed by AutoDock Vina (Morris et al., 2009 ) with nsp1 conformations in order to define the best pocket for virtual screening. These compounds were therefore considered as controls compounds. For each compound were performed 1000 independent docking simulations to evaluate the best pockets and how the compounds are arranged in them. After that, based on AutoDock Vina energy scores, it was possible to identify the main residues/regions involved in the interactions and thus define the target pockets to be used in the virtual screening simulations with DrugBank database. Combining MD simulation results and virtual screening in our recent studies has allowed us to select molecules with biological in vitro activity (Costa et al., 2015; da Silva et al., 2019) . For the same purpose, we have applied similar protocols to those previously carried out. The grid definition was based on the docking results described above and a literature review, which indicates that C-terminal region is an important for the biological functions of SARS-CoV and MERS-CoV nsp1, suggesting that nsp1 access target host protein/factors through it (Nakagawa et al., 2018; Narayanan et al., 2008) . All 3D structures from DrugBank database (8694) were screened by docking simulations following two steps. In the first step all 8694 compounds were screened for each replica conformation based on the AutoDock Vina energy scores the 150 best compounds were selected. In the second step, 1000 independent simulations were performed for each of the showing binding pockets from replicas 1 to 5. All these pockets are in the C-terminal region as it can be observed by sticks residues. (a), (c) and (e) are representing best binding pocket with cyclosporine (green licorice) molecule. In (b) is demonstrated the two main pockets for alisporivir (yellow licorice) and cyclosporine (green licorice). Only the C-terminal pocket, where cyclosporine is bonded, was chosen for virtual screening. The other pocket of replica 4 (d) was not possible to show once it is in the opposite side of C-terminal pocket, thus only alisporivir (yellow licorice) binding to C-terminal region, the region chosen for virtual screening, is represented. compounds selected above. Finally, the compounds with lowest energy that appeared in all five nsp1 conformations were selected for further analysis. Also, alisporivir and cyclosporine were docked in the same pockets for more reliable comparison. Comparison among molecule structures was performed by ChemmineR package for R software (Cao et al., 2008) . In order to access the specificity of the replicas pockets, these were compared to human proteins deposited in Protein Data Bank (PDB). All structures with 70% sequence identity cutoff was downloaded, resulting in 7535 protein structures. For this propose, it was used TM-align algorithm (Zhang, 2005) and TM-Score was accessed to define similarity between pocket and protein. We have defined a TM-Score threshold greater than 0.5 so that the structures are considered to have some similar content. If so, the energy scores of molecular docking simulations with such structures and the selected compounds should be compared. Otherwise, the pockets will be considered nsp1 SARS-CoV-2 specific. Affinity energies provided by AutoDock Vina software were compared with Kruskal-Wallis and Wilcoxon test. Analyses were conducted with R software version 3.6.1 (http://www.rproject.org). A flowchart of methodology can be seen in Supplementary Figure 1 . Nsp1 SARS-CoV-2 model and molecular dynamics Robetta server modelled nsp1 SARS-CoV-2 protein containing 180 amino acid residues using comparative modeling. The main templated used for comparative modeling was nsp1 from SARS-CoV (PDB:2GDT). The RMSD, TM-Score and identity between them are 0.91, 0.95281 and 86.09%, respectively, suggesting they are homologous structure. In model validation, the Clashscore and MolProbity score were 3.31 (97th percentile) and 1.57 (93rd percentile), respectively. The Ramachandran plot of this model shows that 93.6% and 98.8% of all residues were in favored and allowed regions, respectively (Supplementary Figure 2) . These results together suggest that Robetta server predicted a high-quality model suitable to be submitted to MD simulation. All five independent 150 ns MD simulation (a total of five replicas) can be seen in Figure 1a . Replicas 2 and 4 are more flexible, since their equilibrium RMSDs deviate almost twice as much in relation to the others. This means that replicas 2 and 4 had a larger structure adjustment until reach their stability. However, all replicas became stable after 110 ns of simulation. The RMSF of the residues (Figure 1b) shows that for all replicas had a similar flexibility for segments 136 to 143 and N-terminal e C-terminal segments. However, replicas 2 and 4 show more intense flexibility in these segments and are more flexible in other adjacent segments compared to the other replicas. As already described for SARS-CoV nsp1 (Almeida et al., 2007) the residues 1 to 12 and 129 to 180 are flexibly disordered. Detailed RMSD by residue along trajectory can be seen in Supplementary Figure 3 . The cluster analysis for each replica can be seen in Supplementary Figure 3 . It is noticed that they have different profile with different numbers of conformation groups for each replica. However, it is interesting to note that there is a cluster for each replica that appears at least from half of Pairwise structure comparison between top three compounds. In red is highlighted the common segment between molecules. These analyses were performed with ChemmineR package (c) Energy score boxplot of top 3 DrugBank compounds and the control molecules. ÃÃÃÃ p < 0.001. Ã Compound number is same from Table 1. simulation and stay in the end. As the RMSD was more stable in the end of simulation, it is conceivable that the representative structure of this last cluster is the most stable structure. The major structural differences among these structures are located in the most flexible regions (residues 1-12 and 129-180) (see Supplementary Figure 3F ). The RMSD among all representative clusters is 9.676 Å. When only less flexible segment is used for RMSD calculation, this value drops to 1.782 Å and the opposite (RMSD of flexible regions) is 13.648 Å. This means that the core region is very conserved when protein is in solution. For these reasons mentioned above, these representative structures were chosen for docking simulations. In blue licorice is represented Tirilazad compound, and green licorice is ns1 residues (replica 5) from interaction interface. Inferior chart: 2D interaction plot representing the same interaction compound-pocket. (b) Interaction between Phthalocyanine compound and receptor. Superior chart: 3D representation: In blue licorice is represented compound 6, and green licorice is ns1 residues (replica 2) from interaction interface. Inferior chart: 2D interaction plot representing the same interaction compound-pocket. (c) Interaction between Zk-806450 compound and receptor. Superior chart: 3D representation: In blue licorice is represented compound 12, and green licorice is ns1 residues (replica 2) from interaction interface. Inferior chart: 2D interaction plot representing the same interaction compound-pocket. All those interaction representations were done at Discovery Studio software. When the blind docking using the compound controls (alisporivir and cyclosporine) was performed, this allowed not only to identify where the compound bonded but also to measure the magnitude of these bonds to compare with the other compounds. The best pocket was defined as the one where compounds bonded with low energy score. By this criteria, one or two main pockets (for replicas 2 and 4) were observed depending on the replica. However, in all replicas the compounds bonded to a pocket near the C-terminal region (Figure 2) , where it is probably related to the local where nsp1 access target host protein/factors. Thus, these near C-terminal pockets were chosen to perform molecular docking using Drug Bank data base. Thus, after pocket validation, virtual screening was performed. The virtual screening after the two steps described previously selected 150 compounds for each replica based on energy score. Among these compounds, 16 were present in all replicas, which means that these molecules have a good energy score independent of nsp1 protein conformation. Among them, Tirilazad (DB13050), Phthalocyanine (DB12983) and Zk-806450 (DB2112) were ranked as the best Figure 5 . Interaction diagram of 16 compounds in common to all five replicas sorted by energy score (from highest to lowest). Residues with background colored in orange, highlights the main three regions where it is observed the more compound interacts with amino acids from these regions, the more energy score decreases. compounds based on energy score (lower energy) ( Table 2) . It is interesting to note that they are very different molecules as it can be observed in tanimoto heatmap ( Figure 3a ) and in the Maximum Common Substructure (MCS) searching performed by ChemmineR ( Figure 3b) . As compared to controls, it is observed that they have a lower energy score ( Figure 3c ). Compounds Tirilazad and Phthalocyanine are grouped as "investigational" at DrugBank. Both compounds have been used in trials study for other diseases. Compound Zk-806450 is grouped as "experimental" (discovery-phase). The binding energies for Tirilazad, Phthalocyanine and Zk-806450 compounds were similar or even better than those potential inhibitors observed for RNA dependent RNA polymerase (RdRp) protein, suggesting that a combined therapy targeting both nsp1 and RdRp proteins may be promising (Elfiky, 2020) . When profile interaction for each compound is analyzed, it is noticed that for Tirilazad alkyl and pi-alkyl interactions in a four rings region (Figure 4a ) increases ligand energy score. When these interactions are lost in this region, it is observed a decreasing energy score (Supplementary Figure 4 inferior chart). For Phthalocyanine, a similar pattern is observed. The difference is that pi-alkyl interactions occur in two regions, comprising three aromatic rings (Figure 4b ). It is noticed that when there is less pi-alkyl interaction in those regions concomitantly, energy score increases (Supplementary Figure 5 inferior chart). Zk-806450 compound also shows an important fragment for interaction. It is five rings (including 4 aromatics) region (Figure 4c ) that when interacts with neighbor residues forming pi-alkyl and hydrogen bond interactions, its energy score increases (Supplementary Figure 6 superior chart). All these important fragments for each molecule can be seen in Supplementary Figure 7 . It interesting to note that depending on the replica, interaction pattern changes, since N-and C-terminal regions are flexible (Supplementary Figure 8) . However, LYS120 residue is present virtually in all interaction no matter which replica is. This residue is not conserved among SARS-CoV and SARS-CoV-2 (see Supplementary Figure 9 ). Future studies should be performed to analyze the effect of this N120K mutation. Moreover, when we sort this interaction by energy score, we notice three important regions: LEU4, VAL5, GLY7, PHE8, THR12, HIS13, VAL14; PHE157, GLU159, ASN160, TRP161, ASN162, THR163; and MET174, ARG175, GLU176, LEU177, ASN18, GLY179, GLY180. In general, the more compound interacts with amino acids from these regions, the more energy score decreases ( Figure 5) . Also, nsp1 sequence alignment from SARS-CoV and SARS-CoV-2 (Supplementary Figure 9) shows these residues (except PHE8, PHE157, GLU159 and MET174) are conserved among these viral species which suggests these drugs may be effective against nsp1 of different severe acute respiratory syndrome-related CoVs. When the selected pockets of all five replicas were compared to all human protein structure at PDB showed TM-Score < 0.5 (Figure 6 ), which means that these nsp1 pocket structures are not present in any human protein of PDB. Thus, it is reasonable to suggest that these compounds will not bind with the same energy score to human proteins when compared to nsp1 protein. In the present work, we were able to find out three molecules from DrugBank database that showed lower energy scores when compared to alisporivir and cyclosporine, two compounds with in vitro activity against nsp1 from SARS-CoV. These results are suggestive that they may have higher inhibition effectiveness. Also, this work compared the binding pocket of all five replicas to human protein from PDB. All TM-Score from this analysis were < 0.5, which means that the nsp1 pocket folding is not present in any human protein from PDB. This may be interesting for target specificity prediction. Figure 6 . TM-Score: nsp1 pocket vs human proteins. Boxplot describing distribution of TM-Score comparing the pocket of each nsp1 replica and human protein structures. All TM-Scores were lower than 0.5, suggesting these pockets folding are not present in any human protein structure. Since there is not in vitro experiment against nsp1 from SARS-CoV-2, this in silico study has potential limitation. Although computational methodology tools have been improving in the last years, they may not accurately represent biological system since this is a very complex system. However, these results can be used to support hypothesis to perform in vitro assays. Although here it was focused in only one protein, which cannot inhibit viral replication completely, a drug against nsp1 in association with other drugs targets may be a powerful treatment for COVID-19, and probably diseases caused by other CoV species. In order to do so, in vitro studies and drug screening aiming new targets should be performed. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers Middle East respiratory syndrome coronavirus (MERS-CoV): A perpetual challenge Novel beta-barrel fold in the nuclear magnetic resonance structure of the replicase nonstructural protein 1 from the severe acute respiratory syndrome coronavirus Hþþ 3.0: Automating pK prediction and the preparation of biomolecular structures for atomistic molecular modeling and simulations ChemmineR: A compound mining framework for R Human coronavirus NL63 replication is cyclophilin A-dependent and inhibited by non-immunosuppressive cyclosporine A-derivatives including Alisporivir The emerging novel Middle East respiratory syndrome coronavirus: The "knowns" and "unknowns" Is the discovery of the novel human betacoronavirus 2c EMC/2012 (HCoV-EMC) the beginning of another SARS-like pandemic? MolProbity: All-atom structure validation for macromolecular crystallography Insights from molecular dynamics simulations for computational protein design Fragment tailoring strategy to design novel chemical entities as potential binders of novel corona virus main protease Alkaloids as inhibitors of malate synthase from Paracoccidioides spp.: Receptor-ligand interaction-based virtual screening and molecular docking studies, antifungal activity, and the adhesion process Peptide folding: When simulation meets experiment Andrographolide as a potential inhibitor of SARS-CoV-2 main protease: An in silico approach Identification of a new antifungal compound against isocitrate lyase of Paracoccidioides brasiliensis Ribavirin, Remdesivir, Sofosbuvir, Galidesivir, and Tenofovir against SARS-CoV-2 RNA dependent RNA polymerase (RdRp): A molecular docking study Dynamic behavior of Dengue and Zika viruses NS1 protein reveals monomer--monomer interaction mechanisms and insights to rational drug design VMD: Visual molecular dynamics Car-Parrinello molecular dynamics Efficient exploration of reactive potential energy surfaces using car-parrinello molecular dynamics A molecular modeling approach to identify effective antiviral phytochemicals against the main protease of SARS-CoV-2 Comparison of simple potential functions for simulating liquid water A two-pronged strategy to suppress host protein synthesis by SARS coronavirus Nsp1 protein Protein structure prediction and analysis using the Robetta server Discovery of a Novel Coronavirus, China Rattus Coronavirus HKU24, from Norway Rats supports the murine origin of Betacoronavirus 1 and has implications for the ancestor of Betacoronavirus Lineage A Therapeutic options for the 2019 novel coronavirus (2019-nCoV) Outbreak of pneumonia of unknown etiology in Wuhan, China: The mystery and the miracle Mutation of critical residues reveals insights of yellow fever virus nonstructural protein 1 (NS1) stability and its formation AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility Computational studies of drug repurposing and synergism of lopinavir, oseltamivir and ritonavir binding with SARS-CoV-2 protease against COVID-19 The endonucleolytic RNA cleavage function of nsp1 of Middle East respiratory syndrome coronavirus promotes the production of infectious virus particles in specific human cell lines Severe acute respiratory syndrome coronavirus nsp1 suppresses host gene expression, including that of Type I interferon, in infected cells Peptide-like and small-molecule inhibitors against Covid-19 UCSF Chimera-a visualization system for exploratory research and analysis The SARS-coronavirus-host interactome: Identification of cyclophilins as target for pan-coronavirus inhibitors Identification and characterization of severe acute respiratory syndrome coronavirus replicase proteins Characterization of a novel coronavirus associated with severe acute respiratory syndrome Diseases of swine A conserved region of nonstructural protein 1 from alphacoronaviruses inhibits host gene expression and is critical for viral virulence Severe acute respiratory syndrome coronavirus evades antiviral signaling: Role of nsp1 and rational design of an attenuated strain TM-align: A protein structure alignment algorithm based on the TM-score Coronaviruses -drug discovery and therapeutic options The authors declare no competing interests. http://orcid.org/0000-0002-8944-8148 Roosevelt Alves da Silva http://orcid.org/0000-0001-5624-4790