key: cord-337032-s4g4g80w authors: Gupta, Manoj Kumar; Vemula, Sarojamma; Donde, Ravindra; Gouda, Gayatri; Behera, Lambodar; Vadde, Ramakrishna title: In-silico approaches to detect inhibitors of the human severe acute respiratory syndrome coronavirus envelope protein ion channel date: 2020-04-15 journal: J Biomol Struct Dyn DOI: 10.1080/07391102.2020.1751300 sha: doc_id: 337032 cord_uid: s4g4g80w
The recent outbreak of the coronavirus disease (COVID-19) pandemic around the world is associated with the ‘severe acute respiratory syndrome’ coronavirus 2 (SARS-CoV2) in humans. SARS-CoV2 is an enveloped virus, and its E proteins are reported to form ion channels, which are mainly associated with pathogenesis. Thus, there is an ongoing quest to inhibit these ion channels, which in turn may help in controlling diseases caused by SARS-CoV2 in humans. Considering this, in the present study, the authors employed computational approaches to study the structure and function of the human ‘SARS-CoV2 E’ protein as well as its interaction with various phytochemicals. The results revealed that the α-helices and loops present in this protein experience random movement under optimal conditions, which in turn modulates ion channel activity, thereby aiding the pathogenesis caused by SARS-CoV2 in humans and other vertebrates. However, after binding with Belachinal, Macaflavanone E, and Vibsanol B, the random motion of the human ‘SARS-CoV2 E’ protein is reduced, which in turn inhibits the function of the protein. It is pertinent to note that two amino acids, namely VAL25 and PHE26, play a key role in the interaction with these three phytochemicals. As these three phytochemicals, namely Belachinal, Macaflavanone E and Vibsanol B, satisfy the ADMET (Absorption, Distribution, Metabolism, Excretion and Toxicity) criteria as well as ‘Lipinski’s Rule of 5’, they may be utilized as drugs in controlling disease caused by SARS-CoV2, after further investigation. Communicated by Ramaswamy H. Sarma
Coronaviruses (CoVs) are responsible for causing numerous diseases in a broad range of vertebrates, including humans. Although CoVs were earlier associated only with the common cold, in 2002 a new CoV related to 'severe acute respiratory syndrome' (SARS-CoV) was discovered for the first time in the human population of China and caused death in 10% of the total cases worldwide (Perlman & Netland, 2009; Rota et al., 2003). Recently, in December 2019, numerous patients from Wuhan, China, reported pneumonia-like symptoms. The causative agent was initially identified as a novel coronavirus, namely 2019-nCoV. Later, the World Health Organization (WHO) renamed the virus 'severe acute respiratory syndrome coronavirus 2' (SARS-CoV2), and the disease caused by it is known as 'coronavirus disease 2019'. On March 11, 2020, the WHO officially [...] strand RNA genome of size 29.7 kb and encodes a viral replicase that is associated with novel genome synthesis and the generation of a 'nested set of sub-genomic messenger RNAs, encoding both structural proteins present in all CoVs: Spike (S), Envelope (E), Membrane (M) and Nucleoprotein (N), and a group of proteins specific for SARS-CoV: 3a, 3b, 6, 7a, 7b, 8a, 8b, and 9b' (Nieto-Torres et al., 2014). While the M and S proteins constitute the major portion of the viral envelope, E proteins are reported to oligomerize and form ion channels (Venkatagopalan et al., 2015). The E protein is located along with the S (spike) glycoprotein (Li et al., 2020), where it plays a significant role in the assembly of the viral genome (Westerbeck & Machamer, 2019). The 'SARS-CoV E' protein is a short, integral membrane protein comprising 76-109 amino acids, with a size ranging from 8.4 to 12 kDa. It starts with a short hydrophilic terminus, followed by a large hydrophobic transmembrane domain, and terminates with a long hydrophilic carboxyl end. The hydrophobic region oligomerizes and forms an ion-conductive pore in membranes.
This protein plays a key role in various phases of the virus life cycle, such as envelope formation, pathogenesis, budding and assembly (Ashour et al., 2020; Schoeman & Fielding, 2019). A few studies have also suggested that the ion channel activity of the 'SARS-CoV E' protein is modulated via a pentameric ion channel (Pervushin et al., 2009). The ion channel activity of the 'SARS-CoV E' protein is detected in the transmembrane region of the protein (Verdiá-Báguena et al., 2012; Wilson et al., 2004). The selectivity and ion conductance of the E protein ion channel are mostly modulated by the charge of the lipid membranes within which the pores assemble. This, in turn, supports the view that 'lipid headgroups' are a main component of the channel structure facing the lumen of the pore (Verdiá-Báguena et al., 2012, 2013). Though the involvement of ion channels in CoV pathogenesis remains a topic of debate, several recent studies have suggested that the absence of the 'SARS-CoV E' protein results in an 'attenuated virus', thereby supporting that the 'SARS-CoV E' protein is mainly responsible for pathogenesis (Pervushin et al., 2009; Regla-Nava et al., 2015). The involvement of other SARS-CoV proteins in pathogenesis remains unclear to date (Venkatagopalan et al., 2015). A few studies have also reported that mutations within the extra-membrane domain of the 'SARS-CoV E' protein disrupt normal viral assembly as well as maturation (Fischer et al., 1998; Torres et al., 2007; Verdiá-Báguena et al., 2012). In transmissible gastroenteritis virus, deletion of the E protein blocked virus trafficking within the secretory pathway as well as virus maturation (Curtis et al., 2002). Thus, the 'SARS-CoV E' protein serves as a key biomarker for preventing pathogenesis associated with SARS-CoV (Pervushin et al., 2009).
To gain detailed insight into its structure and function, numerous 'three-dimensional' structures of the 'SARS-CoV E' protein have been deposited to date in the 'Protein Data Bank (PDB)'. Furthermore, as identifying novel drugs through laboratory approaches demands large capital investment, there is a continuous demand for screening drug molecules via high-throughput computational methods that save both money and time (Gupta & Vadde, 2020a, 2020b). Moreover, natural compounds that satisfy ADMET ('Absorption, Distribution, Metabolism, Excretion and Toxicity') criteria as well as 'Lipinski's Rule of 5' may have minimal or no adverse effects (Gupta & Vadde, 2020a, 2020b). Recently, the genome of the first SARS-CoV2 isolate (Wuhan-Hu-1) has also been successfully sequenced and submitted to GenBank (Accession no. MN908947.3) (Shang et al., 2020). Considering the above information, in the present study the authors employed a computational approach to identify the best possible structure of the 'SARS-CoV2 E' protein present in the PDB database, in order to understand its structure and function as well as its behaviour towards various phytochemicals. This, in turn, may help in identifying a few phytochemicals that may inhibit the function of the 'SARS-CoV2 E' protein, thereby preventing the pathogenesis associated with SARS-CoV2. In the near future, these phytochemicals may serve as good candidates for treating diseases caused by SARS-CoV2 after further laboratory investigation.
UniProt was used to download the human 'SARS-CoV E' protein sequence (ID: P59637). The NCBI protein database was used to download the human 'SARS-CoV2 E' protein sequence (ID: YP_009724392.1). Multalin, a web-based tool, was used to detect differences between the 'SARS-CoV E' and 'SARS-CoV2 E' proteins. Subsequently, the sequence of the 'SARS-CoV2 E' protein was subjected to NCBI's 'BLASTp' utility (Altschul et al., 1990).
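As a rough illustration of the pairwise comparison described above, the following minimal sketch computes percent identity from two already aligned sequences. It is not how Multalin or BLASTp work internally; the function name, the choice to ignore gapped columns, and the toy fragments are all assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: the inputs are already aligned (equal length, '-' for gaps).
    Other definitions divide by the full alignment length instead.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy fragments, not the real E protein sequences.
print(percent_identity("MYSFVSEETG", "MYSFVSEEAG"))
```

Because the denominator convention differs between tools, identity values from different programs are not always directly comparable.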
Based on maximum sequence identity (81%), the structure with PDB (Protein Data Bank) ID 5X29 was detected as the best homologous structure of the human 'SARS-CoV2 E' protein. As the 'SARS-CoV E' protein functions as a homopentamer (Pervushin et al., 2009), the complete structure of 5X29, comprising five chains, was employed for downstream analysis. To gain detailed insight into the structural characteristics of the downloaded protein and to remove any conflicts between its main-chain and side-chain atoms (Gouda et al., 2020; Gupta & Vadde, 2020b), molecular dynamics simulations (MDS) were performed for 200 ns using the 'Gromos96-43a1 force field' of 'GROMACS 5.1' (Abraham et al., 2015). As the 'SARS-CoV2 E' protein resides in a transmembrane region, the downloaded structure of 5X29 was first embedded in an 'equilibrated bilayer of DPPC (dipalmitoylphosphatidylcholine)' using the 'g_membed' tool of 'GROMACS 5.1' (Wolf et al., 2010), with 'Berger lipids' parameters derived from 'Berger, Edholm, and Jahnig' (Berger et al., 1997). The subsequent steps, from solvation of the entire system through energy minimization and equilibration of the complete system under NVT ('Constant Number of Particles, Volume, and Temperature') and NPT ('Constant Number of Particles, Pressure, and Temperature') conditions, were carried out as described in detail at http://www.mdtutorials.com/gmx/membrane_protein/index.html. The final molecular dynamics (MD) trajectories as well as the quality of the simulations were analyzed via 'GROMACS 5.1' (Gupta & Vadde, 2020b). 'Principal Component Analysis' (PCA) was also carried out to capture the most flexible regions and the motion of the α-helices and β-strands present within the protein during the 200 ns. Equilibrated conformers from the MD were used to produce the mean structure of the 'SARS-CoV2 E' protein (Gupta & Vadde, 2020b). The free energy of the backbone atoms of the 'SARS-CoV2 E' protein was calculated using 'GROMACS 5.1' from 20 ns to 200 ns at 20 ns intervals (Gupta & Vadde, 2020b).
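The trajectory quality measures used in this kind of analysis (RMSD, RMSF and radius of gyration) can be sketched in plain Python as follows. This is an illustrative sketch only, not the GROMACS implementation: it assumes coordinates are given as lists of (x, y, z) tuples and, unlike typical MD analysis tools, it does not superimpose each frame on the reference before computing RMSD. All function names are hypothetical.

```python
import math


def rmsd(coords, reference):
    """RMSD between two equally sized coordinate lists (no superposition, an assumption)."""
    if len(coords) != len(reference) or not coords:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    sq = sum((a - b) ** 2 for p, q in zip(coords, reference) for a, b in zip(p, q))
    return math.sqrt(sq / len(coords))


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration about the centre of mass."""
    total = sum(masses)
    com = [sum(m * p[k] for p, m in zip(coords, masses)) / total for k in range(3)]
    sq = sum(m * sum((p[k] - com[k]) ** 2 for k in range(3))
             for p, m in zip(coords, masses))
    return math.sqrt(sq / total)


def rmsf(frames):
    """Per-atom RMSF: fluctuation of each atom about its own mean position."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        msd = sum(sum((f[i][k] - mean[k]) ** 2 for k in range(3))
                  for f in frames) / n_frames
        result.append(math.sqrt(msd))
    return result


# Hypothetical two-atom toy system, used only to exercise the functions.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(shifted, reference))
```

The units of the results are simply the units of the input coordinates, so mixing nm and Å inputs would give inconsistent values.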
'Three-dimensional' structures of 4153 phytochemicals having 'drug-like' features, taken from our earlier published literature (Gupta & Vadde, 2019a, 2020a), were employed in the present study. The DrugMint server (Dhanda et al., 2013) was employed for detecting the ADMET or 'drug-like' properties of each phytochemical. The DrugMint server predicts the ADMET properties or drug-likeness of any drug or phytochemical using various classification models. All models were trained, tested and evaluated on a dataset comprising 3206 experimental drugs and 1347 approved drugs from DrugBank 2.5. All QSAR models were developed using open-source software packages such as PaDEL, WEKA and SVM_Light (Dhanda et al., 2013). The 'three-dimensional' structures of the phytochemicals were obtained from the TIPdb database, and the compounds have either anti-tuberculosis, anti-cancer, anti-platelet, or no therapeutic properties (Lin et al., 2013). The active site pocket present in the 'SARS-CoV2 E' protein was calculated using the CASTp server (Binkowski et al., 2003). The active pocket with the highest volume and area was considered for molecular docking studies with the phytochemicals, using 250 conformations (Gupta & Vadde, 2020b), via the AutoDock tool (Morris et al., 2009). The three phytochemicals with the lowest binding energy were considered for further study (Gupta & Vadde, 2020b). Complex formation was done using the 'Discovery Studio' software (Biovia, 2017). Amongst the 250 conformations, the docking results of the 'SARS-CoV2 E' protein with the three phytochemicals having the lowest binding energy were each considered for a separate 200 ns MDS. The PRODRG server (Schüttelkopf & van Aalten, 2004) was employed for generating the topology parameters of the ligands. Subsequently, the complete MDS was performed as described above in 'MDS: Phase I'.
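Drug-likeness screening of this kind is often complemented by Lipinski's Rule of 5, which the study also refers to. The sketch below is a minimal, hedged illustration of that rule and is not the DrugMint classifier. The thresholds are the commonly stated ones (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors); the class name, the tolerance of one violation, and the descriptor values are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mol_weight: float        # molecular weight in Da
    logp: float              # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four Rule of 5 criteria are violated."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Assumption: at most one violation is tolerated, a common convention."""
    return lipinski_violations(d) <= max_violations


# Hypothetical descriptor values, not taken from the study.
candidate = Descriptors(mol_weight=350.4, logp=3.2, h_bond_donors=2, h_bond_acceptors=5)
print(passes_rule_of_five(candidate))
```

In practice the descriptors would come from a cheminformatics toolkit; here they are entered by hand to keep the example self-contained.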
Using the GMXAPBS utility with 'GROMACS 5.1', 2000 snapshots were retrieved from the MD trajectories of each of the three complexes to calculate the binding free energy from 20 to 200 ns at 20 ns intervals (Gupta & Vadde, 2020b). The inter-molecular interactions present within the three complexes after the 200 ns MDS were analyzed via Discovery Studio (Gupta & Vadde, 2020b).
Comparative sequence analysis via Multalin reveals that the 'SARS-CoV E' and 'SARS-CoV2 E' protein sequences share 94.74% identity. The 'three-dimensional' structure of one unit of 'SARS-CoV2 E' comprises only seven α-helices and eight loops (Figure 1(a)). As there are five homo-units, the complete structure of 'SARS-CoV2 E' consists of 35 α-helices and 40 loops (Figure 1(b)). As the ion channel activity of the 'SARS-CoV E' protein is modulated via a pentameric ion channel (Pervushin et al., 2009), the complete structure comprising five subunits was employed for the downstream analysis. To understand the structural characteristics of the 'SARS- [...] Gupta et al., 2018; Mehrnejad & Chaparzadeh, 2008; Rani & Lakshmi, 2019; Tandon et al., 2015) were employed for estimating the protein stability during the MD analysis. The average RMSD value of the protein backbone atoms is estimated to be 2.74 Å (Figure 2(a)). The RMSD is found to be stable after 170 ns. The RMSF fluctuates between 4 Å and 6.3 Å, with an average value of 5.96 Å. The amino acids that undergo maximum fluctuation during the 200 ns MDS were VAL17, ALA22, LEU19, LEU27, PHE23, PHE26, VAL24, VAL25, VAL29, ILE33, ALA36 and TYR42 (i.e. RMSF > 6 nm). As these residues play an important role during protein-ligand interaction, they may serve as biomarkers during the drug discovery process (Gouda et al., 2020; Gupta & Vadde, 2019a, 2020a, 2020b) (Figure 2(b)).
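The snapshot-based binding free energy estimate described at the start of this paragraph can be illustrated with a small averaging sketch. It assumes that a per-snapshot energy value is already available (for example from an MM-PBSA style calculation) and only shows how a mean, a standard error, and per-window means might be derived. The interpretation of the 20 ns interval as consecutive time windows, the function names, and the numbers are assumptions.

```python
import math
import statistics


def mean_and_sem(values):
    """Mean and standard error of the mean of per-snapshot energies."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def window_means(times_ns, energies, start_ns, stop_ns, step_ns):
    """Mean energy in each half-open window [t, t + step_ns) from start_ns to stop_ns."""
    results = []
    n_windows = int(round((stop_ns - start_ns) / step_ns))
    for w in range(n_windows):
        lo = start_ns + w * step_ns
        hi = lo + step_ns
        block = [e for t, e in zip(times_ns, energies) if lo <= t < hi]
        if block:  # skip windows that contain no snapshots
            results.append((lo, hi, statistics.fmean(block)))
    return results


# Hypothetical snapshot times (ns) and energies (kJ/mol), not data from the study.
times = [20.0, 30.0, 40.0, 50.0]
energies = [-250.0, -252.0, -249.0, -251.0]
print(window_means(times, energies, start_ns=20.0, stop_ns=60.0, step_ns=20.0))
```

Consecutive MD snapshots are usually correlated, so a standard error computed this way tends to understate the true uncertainty.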
The Rg values of the protein (Figure 2(c)) fluctuate between 5.81 Å and 5.83 Å, with an average value of 5.82 Å, thereby supporting its condensed architecture as well as size (Gupta & Vadde, 2019b). The results of the PCA suggest random movement of the 'SARS-CoV2 E' protein (Figure 2(d)) throughout the 200 ns MDS. The mean numbers of 'intra-protein' hydrogen bonds and of hydrogen bonds formed between the 'SARS-CoV2 E' protein and water are 1.98 and 1.98, respectively. The 'cross-correlation matrix' of the Cα displacements revealed that all residues present within the 'SARS-CoV2 E' protein experience both negatively correlated motions (depicted in blue shades) and positively correlated motions (depicted in red shades) (Figure 3(a)), which in turn supports random movement of the 'SARS-CoV2 E' protein. This finding is also [...] The random movement of the 'SARS-CoV2 E' protein indicates its involvement in the ion channels. This is in accordance with earlier studies in which the authors reported that the ion channel activity of 'SARS-CoV2 E' proteins is modulated via a pentameric ion channel (Pervushin et al., 2009). It is pertinent to note that the structure of the 'SARS-CoV2 E' protein becomes stable after 170 ns. Hence, the average structure of the 'SARS-CoV2 E' protein was obtained from the stable plateau of the RMSD after 170 ns for downstream analysis. Further, the CASTp server was employed for detecting the active site present within the 'SARS-CoV2 E' protein. The results revealed that the highest volume and area of the binding cavity within the 'SARS-CoV2 E' protein are 7625.969 Å³ and 7163.067 Å², respectively.
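The cross-correlation matrix of Cα displacements mentioned above can be sketched as a normalized covariance of atomic displacements about their mean positions. The following is a minimal pure-Python illustration, not the analysis code used in the study; it assumes frames that are already aligned, and the function name and toy trajectory are hypothetical.

```python
import math


def cross_correlation(frames):
    """Normalized displacement cross-correlation matrix.

    frames: list of frames, each a list of (x, y, z) tuples (e.g. C-alpha atoms).
    Entry [i][j] is <d_i . d_j> / sqrt(<d_i . d_i> <d_j . d_j>), in [-1, 1];
    positive values mean correlated motion, negative values anti-correlated motion.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    mean = [[sum(f[i][k] for f in frames) / n_frames for k in range(3)]
            for i in range(n_atoms)]
    disp = [[[f[i][k] - mean[i][k] for k in range(3)] for i in range(n_atoms)]
            for f in frames]

    def avg_dot(i, j):
        return sum(sum(d[i][k] * d[j][k] for k in range(3)) for d in disp) / n_frames

    var = [avg_dot(i, i) for i in range(n_atoms)]
    matrix = []
    for i in range(n_atoms):
        row = []
        for j in range(n_atoms):
            denom = math.sqrt(var[i] * var[j])
            # Atoms that never move have zero variance; report 0.0 for them.
            row.append(avg_dot(i, j) / denom if denom > 0 else 0.0)
        matrix.append(row)
    return matrix


# Hypothetical toy trajectory: atoms 0 and 1 move together, atom 2 moves oppositely.
toy = [
    [(1.0, 0.0, 0.0), (6.0, 0.0, 0.0), (-1.0, 5.0, 0.0)],
    [(-1.0, 0.0, 0.0), (4.0, 0.0, 0.0), (1.0, 5.0, 0.0)],
]
matrix = cross_correlation(toy)
print(matrix[0][1], matrix[0][2])
```

For a real pentameric structure the matrix would have one row and column per Cα atom and is usually displayed as a heat map.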
The forty-four amino acids involved in the formation of the active site are GLU8, THR11, LEU12, VAL14, ASN15, VAL17, LEU18, LEU19, PHE20, LEU21, ALA22, PHE23, VAL24, VAL25, PHE26, LEU27, LEU28, VAL29, THR30, LEU31, ALA32, ILE33, LEU34, THR35, ALA36, LEU37, ARG38, LEU39, ALA40, TYR42, ALA43, ALA44, ILE46, VAL47, VAL49, LEU51, PRO54, VAL56, TYR57, SER60, ARG61, LYS63, ASN64 and LEU65. Subsequently, molecular docking between the protein and the ligands, with 250 conformations, was performed using the AutoDock tool. A grid box of 126 × 126 × 126 was assigned, with a grid center of −0.496, 0.0 and 0.0. The grid was selected carefully so as to cover the active site along with the major area of the surrounding surface (Gouda et al., 2020). Molecular docking of the 'SARS-CoV2 E' protein with the ligands revealed that the best ten phytochemicals with minimal binding energy are TIP006452 (Belachinal), TIP005365 (Macaflavanone E), TIP003272 (Vibsanol B), TIP003258 (14R*,15-Epoxyvibsanin C), TIP005363 (Macaflavanone C), TIP000749 (Luzonoid D), TIP008605 (Grossamide K), TIP009461 ((-)-Blestriarene C), TIP005366 (Macaflavanone F) and TIP005783 (Dolichosterone). The binding energy of the 'SARS-CoV2 E' protein with TIP006452, TIP005365, TIP003272, TIP003258, TIP005363, TIP000749, TIP008605, TIP009461, TIP005367, TIP005366 and TIP005783 is −11.46 kcal/mol, −11.07 kcal/mol, −11.07 kcal/mol, −10.56 kcal/mol, −10.49 kcal/mol, −10.47 kcal/mol, −10.50 kcal/mol, −10.40 kcal/mol, −10.40 kcal/mol, −10.36 kcal/mol and −10.31 kcal/mol, respectively (Supplementary File I and Table 1). The binding energy of all 4153 phytochemicals is given in Supplementary File I. One earlier study reported that Macaflavanone C may be utilized for treating Alzheimer's disease (Gupta & Vadde, 2019a). In 2016, Teponno and colleagues reported the anti-melanogenic property of Grossamide K (Bertrand Teponno et al., 2016).
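Selecting the strongest binders from a docking run amounts to sorting ligands by binding energy, most negative first. The sketch below illustrates this with the identifier and energy pairs listed in the paragraph above; the function name is hypothetical, and the snippet is only an illustration of the ranking step, not part of the AutoDock workflow.

```python
# Binding energies (kcal/mol) as listed in the text, keyed by TIPdb identifier.
binding_energy_kcal_mol = {
    "TIP006452": -11.46,
    "TIP005365": -11.07,
    "TIP003272": -11.07,
    "TIP003258": -10.56,
    "TIP005363": -10.49,
    "TIP000749": -10.47,
    "TIP008605": -10.50,
    "TIP009461": -10.40,
    "TIP005367": -10.40,
    "TIP005366": -10.36,
    "TIP005783": -10.31,
}


def top_binders(energies, n):
    """Return the n (ligand, energy) pairs with the lowest binding energy.

    sorted() is stable, so ligands with equal energy keep their input order.
    """
    ranked = sorted(energies.items(), key=lambda item: item[1])
    return ranked[:n]


for ligand_id, energy in top_binders(binding_energy_kcal_mol, 3):
    print(f"{ligand_id}: {energy:.2f} kcal/mol")
```

With these values the three lowest-energy ligands are TIP006452, TIP005365 and TIP003272, which correspond to Belachinal, Macaflavanone E and Vibsanol B in the text.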
Earlier, Seo and colleagues also reported that compounds present in the ethyl acetate fraction of Sanguisorba officinalis, namely euphormisin M3, arjunglucoside II, pomolic acid-3-β-O-α-L-arabinopyranoside, 3,3′-di-O-methylellagic acid, 3,8-dihydroxy-4,10-dimethoxy-7-oxo-[2]benzopyrano[4,3-b][1]benzopyran-7-(5H)-one, chikusetsusaponin II, deglucose chikusetsusaponin IVa, belachinal, irilone and ellagic acid, may have an inhibitory effect on inflammasome pathways and a protective role in endotoxin-induced septic shock. However, to the best of our knowledge, no medicinal properties have been assigned to the remaining seven phytochemicals, namely Macaflavanone E, Vibsanol B, '14R*,15-Epoxyvibsanin C', Luzonoid D, (-)-Blestriarene C, Macaflavanone F and Dolichosterone. As three compounds, namely Belachinal, Macaflavanone E and Vibsanol B, have the lowest binding energy with the 'SARS-CoV2 E' protein (Table 1), three separate complexes between the 'SARS-CoV2 E' protein and each ligand were built using the Discovery Studio software. The complexes of the 'SARS-CoV2 E' protein with Belachinal, Macaflavanone E and Vibsanol B will henceforth be referred to as Complex A, B and C, respectively. [...] (Figure 2(a)). The RMSD in all three complexes is found to be stable after 180 ns. The Rg values of all three complexes (Figure 2(c)) support their condensed architecture as well as size [...] Figure 2(d)). This might be due to decreased potential energy and increased pressure in all three complexes, which in turn supports that the random movement of the 'SARS-CoV2 E' protein decreases after binding with these phytochemicals (Figure 2(d)), thereby supporting that these three phytochemicals may function as drugs for treating or controlling diseases caused by SARS-CoV2 in humans. The results revealed that the final binding energy of Complex A (−250.979 ± 0.272 kJ/mol)