key: cord-0826720-tv2ptom9
authors: Maroli, Nikhil; Bhasuran, Balu; Natarajan, Jeyakumar; Kolandaivel, Ponmalai
title: The potential role of procyanidin as a therapeutic agent against SARS-CoV-2: a text mining, molecular docking and molecular dynamics simulation approach
date: 2020-09-22
journal: Journal of biomolecular structure & dynamics
DOI: 10.1080/07391102.2020.1823887
sha: a7b51e883187e0781a52c1161eda4306d67f90f6
doc_id: 826720
cord_uid: tv2ptom9

A novel coronavirus (SARS-CoV-2) has caused a major outbreak in humans all over the world. Several proteins interact during the entry and replication of this virus in humans. Here, we used text mining and a named entity recognition method to identify the co-occurrence of important COVID-19 genes/proteins in an interaction network based on the frequency of interaction. Network analysis revealed a set of genes/proteins, highly dense gene/protein clusters, and sub-networks of angiotensin-converting enzyme 2 (ACE2), helicase, spike (S) protein (trimeric), membrane (M) protein, envelope (E) protein, and nucleocapsid (N) protein. The isolated proteins were screened against procyanidin, a flavonoid from plants, using molecular docking. Further, molecular dynamics simulations of critical proteins such as ACE2, Mpro and spike were performed to elucidate the inhibition mechanism. A strong network of hydrogen bonds and hydrophobic interactions, along with van der Waals interactions, inhibits receptors that are essential to the entry and replication of SARS-CoV-2. The binding energy, which arises largely from van der Waals interactions and was calculated through the molecular mechanics Poisson-Boltzmann surface area method (ACE2=-50.21 ± 6.3, Mpro=-89.50 ± 6.32 and spike=-23.06 ± 4.39), also confirms the affinity of procyanidin towards the critical receptors.

Communicated by Ramaswamy H. Sarma

Introduction
The S protein mediates receptor binding through its S1 subunit, while its S2 subunit drives fusion and attachment to the host membrane. Angiotensin-converting enzyme 2 (ACE2) has been identified as one of the major binding receptors of the SARS-CoV-2 virus in the lungs, whereas the receptor is dipeptidyl peptidase-4 in the case of MERS-CoV (Guo et al., 2008; F. Li, 2016; W. Li et al., 2003; McBride et al., 2014; Mubarak et al., 2019). The ACE2 receptor is mainly expressed in lung alveolar epithelial cells, and its main function is to reduce blood pressure by catalyzing the hydrolysis of angiotensin II. Once the S protein is bound to the receptor, the nucleocapsid, which contains the ORF1a and ORF1b genes, is released into the cytoplasm of the host cell. ORF1a and ORF1b produce the polyproteins (PPs) pp1a and pp1ab, using host ribosomes for translation. Processing of the PPs yields 16 non-structural proteins (NSPs), which have distinct functions in the host cell, such as suppression of host gene expression and formation of multi-domain complexes, and some of them act as primase (Stobart et al., 2013; Te Velthuis et al., 2012). The M, E, and S proteins enter the endoplasmic reticulum (ER)-Golgi intermediate compartment (ERGIC) and produce the viral envelope. At the same time, ribonucleoprotein formation occurs once the replicated genomes bind to the N protein. Once the virus particles leave the ERGIC, the vesicle fuses with the plasma membrane and the virus particles are released into the extracellular region. The transition from a metastable prefusion state to a stable post-fusion state is triggered by the binding of the spike protein to ACE2.
Several researchers have studied and reviewed the binding mechanism of the spike protein with the ACE2 receptor and screened small molecules and other inhibitors against different targets (Boopathi et al., 2020; Brielle et al., 2020; Freitas et al., 2020; Gupta et al., 2020; Jin et al., 2020; Muralidharan et al., 2020; Xue et al., 2020; Zhou et al., 2020). The dynamics of the receptor-binding domain (RBD) and the development of inhibitors targeting it have also been studied (Tai et al., 2020), suggesting a potential role for the RBD in therapeutic development against SARS-CoV-2. Procyanidin is a member of the proanthocyanidins, which belong to the flavonoid class of polyphenolic secondary metabolites of plants and fungi. Proanthocyanidins are the most abundant polyphenolic compounds after lignin (Souquet et al., 1996). This oligomeric compound is formed from catechin and epicatechin molecules and is mainly found in apples, maritime pine bark, aronia fruits, cinnamon, and grapes (Pacheco-Palencia et al., 2008; Vivas et al., 2006; Yang & Xiao, 2013). However, the dietary content of procyanidin has not yet been studied well. Some preliminary studies show the potential role of procyanidins as cardioprotective and anticancer agents (Bae et al., 2020) and in the inhibition of ACE (Actis-Goretta et al., 2003).

The rapidly growing scientific literature on COVID-19 requires automated approaches for a better understanding of the disease. Text mining is one of the most widely adopted approaches in both academia and industry for extracting information from a collection of literature (Bhasuran et al., 2016). In this big data era, text mining is increasingly applied in various streams of science and technology for hypothesis generation and knowledge discovery. Here, the text mining approach is implemented to identify the important COVID-19 genes/proteins and the biological connections between them.
In particular, the text mining methodology was adapted to find the role of highly studied COVID-19 genes/proteins such as ACE2, NSP (1-10), Helicase, ORF, etc. To identify these genes/proteins and the biological links between them, a curated dictionary-based gene/protein entity mapping and a co-occurrence-based text mining approach were applied to the COVID-19 scientific literature set. In this study, we explored the potential role of procyanidin as a therapeutic agent against SARS-CoV-2 using molecular docking and molecular dynamics simulation methods. Procyanidin was selected by screening plant-based compounds collected from the literature, together with the output of the literature mining. The proteins associated with SARS-CoV-2 were identified through the named entity recognition method, and 24 proteins whose frequency exceeded the threshold value were selected from the result. The protein structures were collected from the PDB or modeled using homology modeling. Molecular docking was performed to assess the binding affinity of procyanidin with the selected proteins. Further, crucial proteins such as ACE2, Mpro, and spike, which are directly involved in the entry of SARS-CoV-2, were selected for MD simulation to reveal the inhibition mechanism of procyanidin. The MM-PBSA method was adopted to determine the binding energy of procyanidin.

The text mining pipeline employed in the current study consists of the following steps: COVID-19 literature data collection, named entity recognition of genes/proteins, gene-gene co-occurrence extraction, and construction and analysis of a gene-gene interaction co-occurrence network. In the present study, we used the COVID-19 Open Research Dataset (CORD-19) (https://pages.semanticscholar.org/coronavirus-research), downloaded on April 10, 2020, as the literature data source. CORD-19 is a growing data source that compiles COVID-19 literature together with past coronavirus research literature.
The data set was created by a query-based search using "COVID-19" OR "Coronavirus" OR "Coronavirus" OR "2019-nCoV" OR "SARS-CoV" OR "MERS-CoV" OR "Severe Acute Respiratory Syndrome" OR "Middle East Respiratory Syndrome". Scientific literature matching this keyword query was collected from a total of 3200 journals. According to the developers, the CORD-19 dataset consists of research literature spanning various biological domains: Virology (42.3%), Immunology (20.7%), Molecular biology (12.7%), Genetics (8.0%), Intensive care medicine (6.7%) and Others (9.6%). The current study applied text mining to a literature set of nearly 49,000 articles collected from the CORD-19 corpus.

Identifying from the scientific literature which genes play an important role in a disease is a crucial task. In the current study, a dictionary-based named entity recognition (NER) approach was used to find the gene and protein entities mentioned in the COVID-19 scientific literature set. The dictionary was created by manual screening of proteins mentioned as drug targets/biomarkers of COVID-19 in various scientific publications. The major proteins in the list are angiotensin-converting enzyme 2 (ACE2), envelope (E) protein, E protein, Helicase, ExoN, M protein, main protease, membrane (M) protein, N protein, NendoU, non-structural protein (1-16), ORF (1a, 1b, 3a, 6, 7a, 8, 10), RNA-directed RNA polymerase (RdRp), S protein, spike (S) protein, nucleocapsid (N) protein, polyproteins, PP1a, PP1ab, papain-like proteinase, and surface glycoprotein. The sentence-level co-occurrence approach is based on the premise that two entities mentioned together in a sentence possess some form of relatedness.
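The dictionary matching and sentence-level co-occurrence counting described above can be illustrated with a minimal Python sketch. It is not the authors' in-house Java implementation: the four-entry dictionary, the naive sentence splitter, and the function names `compile_patterns`, `split_sentences` and `cooccurrence_counts` are all assumptions made for illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical mini-dictionary (canonical name -> surface forms).
# The curated dictionary used in the study is much larger.
DICTIONARY = {
    "ACE2": ["ACE2", "angiotensin-converting enzyme 2"],
    "S protein": ["S protein", "spike protein", "spike (S) protein"],
    "N protein": ["N protein", "nucleocapsid protein"],
    "RdRp": ["RdRp", "RNA-directed RNA polymerase"],
}


def compile_patterns(dictionary):
    """Build one case-insensitive regex per canonical entity name."""
    patterns = {}
    for name, forms in dictionary.items():
        # Try longer surface forms first so they win over shorter ones.
        ordered = sorted(forms, key=len, reverse=True)
        alternation = "|".join(re.escape(form) for form in ordered)
        # The lookarounds stop matches that start or end inside a longer word.
        patterns[name] = re.compile(
            r"(?<!\w)(?:" + alternation + r")(?!\w)", re.IGNORECASE
        )
    return patterns


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def cooccurrence_counts(text, patterns):
    """Count how often each pair of entities shares a sentence."""
    pair_counts = Counter()
    for sentence in split_sentences(text):
        found = sorted(n for n, p in patterns.items() if p.search(sentence))
        for first, second in combinations(found, 2):
            pair_counts[(first, second)] += 1
    return pair_counts
```

Calling `cooccurrence_counts(abstract_text, compile_patterns(DICTIONARY))` returns a `Counter` keyed by alphabetically ordered entity pairs. Those frequencies can then serve as edge weights when the co-occurrence network is built.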
The current study employs a frequency-based co-occurrence approach, with aggregation, to find the biological links between important COVID-19 genes/proteins by text mining the CORD-19 corpus. To identify and extract the genes/proteins and their interactions, we adapted our previous methodology. The identified gene/protein entities and their relations were then used to create a gene-gene interaction network with the visualization tool Gephi 0.9.2 (Bastian et al., 2009). Finally, various network analysis algorithms and parameters, such as node degree, centrality measures, PageRank, and sub-network analysis, were applied to the COVID-19 gene-gene interaction network to identify the hub genes/proteins and highly dense gene/protein clusters. Centrality measures are used to identify the most important, highly connected nodes within a graph. In the current study, three centrality measures (closeness centrality, betweenness centrality, and eigenvector centrality) were calculated from the generated gene-gene interaction network. In general, closeness centrality is based on the shortest paths between a node and all other nodes, betweenness centrality measures how often a node lies on the shortest path between any two other nodes, and eigenvector centrality measures the influence of a node based on its connectivity in the network. Two other important analyses were performed on the network, namely PageRank and clustering. PageRank is a popular numerical weighting-based link analysis algorithm used for measuring the relative importance of a particular node in a given network based on its connectivity. The schematic architecture of the text mining pipeline is depicted in Figure 1.

The molecular docking of procyanidin with the selected proteins was performed using AutoDock Vina as implemented in the SAMSON software package (NANO-D & I, 2016; Trott & Olson, 2010).
The procyanidin molecule was obtained from the PubChem database, and geometry optimization was performed using the Gaussian 09 software package (Frisch et al., 2009). The structures of the ACE2, Mpro, and spike proteins were obtained from the RCSB PDB. Water molecules and other ligands in the protein structures were removed before docking. The docking protocols were adapted from our previous studies. Further, the receptor-procyanidin docked complexes were used to perform molecular dynamics simulations using the GROMACS 2020.1 software package (Abraham et al., 2015). The TIP3P water model with the CHARMM36 force field (J. Huang & MacKerell, 2013; Mark & Nilsson, 2001) was used for the molecular dynamics simulations with a 2-fs time step. The particle mesh Ewald method (Essmann et al., 1995) was used for the calculation of electrostatic interactions. The Verlet cut-off scheme with a cut-off distance of 1.4 nm was used for the short-range repulsive and attractive interactions. The temperature and pressure were maintained at 310 K and 1 bar using Nose-Hoover temperature coupling and the Parrinello-Rahman algorithm, respectively. The LINCS algorithm (Hess et al., 1997) was used to constrain all bond lengths, and simulations were performed in the NPT ensemble for 100 ns. Equilibration for 10 ns in the NVT and NPT ensembles was performed after an energy minimization of 5000 steps. The trajectories were saved every 100 ps, and visualization was performed using the UCSF Chimera and LigPlot software packages (Pettersen et al., 2004; Wallace et al., 1995). The binding energies of procyanidin with the ACE2, Mpro, and spike proteins were calculated using the MM-PBSA method implemented in the g_mmpbsa tool (Kumari et al., 2014).
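As a rough illustration of the bookkeeping behind an MM-PBSA estimate, the Python sketch below sums the four per-frame energy terms (van der Waals, electrostatic, polar solvation, and SASA-based nonpolar solvation) and averages them over trajectory frames, with the entropy term neglected. It does not reproduce g_mmpbsa. The `EnergyTerms` class, the function names, and any numbers passed to them are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class EnergyTerms:
    """Per-frame energy differences (complex minus protein minus ligand), in kJ/mol."""
    vdw: float              # van der Waals interaction energy
    electrostatic: float    # electrostatic interaction energy
    polar_solvation: float  # polar solvation energy (Poisson-Boltzmann model)
    sasa: float             # nonpolar solvation energy (SASA model)


def binding_energy(terms):
    """Sum the gas-phase and solvation terms; the entropy term is neglected."""
    gas_phase = terms.vdw + terms.electrostatic
    solvation = terms.polar_solvation + terms.sasa
    return gas_phase + solvation


def average_binding_energy(frames):
    """Return (mean, population standard deviation) over trajectory frames."""
    energies = [binding_energy(frame) for frame in frames]
    return mean(energies), pstdev(energies)
```

In this convention a negative total indicates favorable binding, and a positive polar solvation term partly offsets the favorable van der Waals and electrostatic contributions.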
The binding free energy of the protein-ligand complex in a solvent is defined as

$$\Delta E_{\mathrm{binding}} = E_{\mathrm{complex}} - \left(E_{\mathrm{protein}} + E_{\mathrm{procyanidin}}\right)$$

$$\Delta G = \langle \Delta E_{\mathrm{MM}} \rangle + \langle \Delta G_{\mathrm{solv}} \rangle - T\Delta S$$

where $E_{\mathrm{complex}}$ denotes the total energy of the protein-procyanidin complex, and $E_{\mathrm{protein}}$ and $E_{\mathrm{procyanidin}}$ represent the total energies of the separated protein and procyanidin in the solvent, respectively. $\langle \Delta E_{\mathrm{MM}} \rangle$ is the total molecular mechanics energy in the gas phase, taken as the sum of the electrostatic and van der Waals interaction energies, $\langle \Delta G_{\mathrm{solv}} \rangle$ is the solvation free energy, and $T\Delta S$ is the entropy term. Polar contributions were calculated using the PB model, and the nonpolar energy was estimated from the solvent accessible surface area (SASA); $T\Delta S$ was considered negligible.

Here we adopted molecular docking followed by molecular dynamics simulation to reveal the binding mechanism of procyanidin with the proteins/enzymes associated with COVID-19. These proteins or enzymes were identified through a dictionary-based named entity recognition (NER) approach. We used in-house Java regular expression-based matching to identify and extract the gene and protein entities mentioned, and the co-occurring gene/protein pairs, in the COVID-19 scientific literature set. The NER phase identified a total of 45,984 mentions, with 38 major proteins highly referenced. A detailed representation of the highly referenced genes is given in Figure 2a, and the linear frequency of the important proteins related to COVID-19 and the top 10 related proteins are depicted in Figure 2c and d, respectively. In the co-occurrence extraction phase, we identified genes that co-occur to high degrees, with up to eleven in a single sentence (Figure 2b). Highly interacting benchmark gene pairs were also identified, and the top 10 results are given in Table 1.
A total of 2,327 PubMed entries are connected with procyanidin, and these abstracts were collected in text format. In this collection, we searched for various biological entities and activities of the molecule. The results revealed that procyanidin regulates oxidative stress and is connected with the abnormal inflammatory response. The major diseases connected with procyanidin are inflammation, cancer, diabetes, obesity, atherosclerosis, cardiovascular disease, and hypertension. The major genes/proteins extracted from the procyanidin literature set are TNF, GSPB2, Nrf2, NF-kappaB, IL-6, CASP3, INS, ACE, COX-2, and p53.

Figure 1. The schematic architecture of the text mining approach for the identification of critical proteins and their biological co-occurrence in COVID-19. The text mining approach employed the CORD-19 database as the literature source. A list of highly studied COVID-19 proteins was manually constructed, and a dictionary matching procedure was applied to find protein co-occurrence with frequency. Finally, a protein co-occurrence network was created and analyzed to find the functional association among critical proteins.

The co-occurrence-based score was used to generate a gene-gene interaction network. The network (Figure 2c) consists of 38 nodes (genes) and 209 edges (interactions). Analysis of the network revealed 13 major nodes with a higher number of links to other nodes. The network degree of each node is given in Figure 1 of the supporting information. From the network, it is evident that a connection exists between ACE2 and the S protein, and likewise between the other protein pairs shown in Table 1. Various network analysis properties of the COVID-19 gene-gene interaction network are also given in Figure 3. The analysis further revealed a strong association between the ORF1a and ORF1b genes/proteins and the proteins associated with SARS-CoV-2. We also found that RNA-directed RNA polymerase (RdRp) is strongly connected with all the NSPs.
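The hub-ranking measures applied to this network can be illustrated on a toy graph. The sketch below is written in plain Python rather than Gephi, and computes degree centrality, closeness centrality, and PageRank for a small undirected graph. It assumes a connected, unweighted graph. The function names and the edge list are hypothetical and are not taken from the study's 38-node network.

```python
from collections import deque


def build_graph(edges):
    """Turn an undirected edge list into an adjacency-set dictionary."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def closeness_centrality(graph):
    """Inverse of the mean shortest-path length to all reachable nodes (BFS)."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nbr in graph[current]:
                if nbr not in dist:
                    dist[nbr] = dist[current] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def pagerank(graph, damping=0.85, iterations=100):
    """Power-iteration PageRank; each undirected edge acts as two directed links."""
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        updated = {}
        for node in graph:
            incoming = sum(rank[nbr] / len(graph[nbr]) for nbr in graph[node])
            updated[node] = (1.0 - damping) / n + damping * incoming
        rank = updated
    return rank


# Hypothetical toy edges for illustration only.
TOY_EDGES = [
    ("RdRp", "NSP1"), ("RdRp", "NSP2"), ("RdRp", "NSP3"),
    ("RdRp", "S protein"), ("S protein", "ACE2"),
]
```

Sorting the nodes of `build_graph(TOY_EDGES)` by any of the three scores puts the most connected node first. This is the sense in which hub genes/proteins are ranked.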
The top ten proteins from the various network analyses are RdRp, NSP2, NSP1, NSP12, N protein, ORF1a, NSP3, NSP5, Helicase, and NSP8. To evaluate these highly dense nodes, sub-networks were extracted from the main network. Specifically, sub-networks for genes such as ACE2, RdRp, S-protein, M-protein, N-protein, and E-protein were extracted. The extracted sub-networks are given as ACE2, Helicase, S-protein, E-protein, M-protein and N-protein (Figure 2 (a-f) of the supporting information), and the corresponding connected nodes in these sub-networks are given in Table 2 with network analysis measurements.

The procyanidin structure was obtained from PubChem and optimized using the B3LYP functional and the 6-31G(d,p) basis set within the DFT formalism (Becke, 1988; Kumari et al., 2014; Lee et al., 1988; Xu & Goddard, 2004). The optimized geometry of procyanidin was used to perform molecular docking on the selected proteins as receptors. Available protein structures were obtained from the PDB, and the remaining proteins were modeled by homology modeling using the I-TASSER web server. Each modeled structure was solvated using the TIP3P water model and subjected to a 10 ns MD simulation to remove steric hindrances and unfavorable interactions. The 3-D structures of the modeled proteins are given in Figure 3 (i)-(iv) of the supporting information. Molecular docking revealed a strong affinity of procyanidin towards these proteins (Table 3). Previous experimental studies indicate that inhibition of NSPs helps to reduce host translation and further reduces virus replication. The strong interaction energy of the procyanidin molecule helps to inhibit these proteins by changing their structural and functional properties. In all the docked complexes, strong hydrogen bond and hydrophobic interaction networks were identified and are provided in Table 3. The interaction networks of the proteins are provided in Figures 3 and 4 of the supporting information.
The hydrogen bonds between the residues and procyanidin disrupt the normal structure of the proteins, which prevents the protein or enzyme from performing its native function. The binding sites of procyanidin in all proteins are rich in hydrogen bonds and hydrophobic interactions, in addition to conventional electrostatic and van der Waals interactions. Among the 24 screened proteins, we considered ACE2, Mpro, and the spike protein separately, as they are the crucial proteins directly linked to the entry of SARS-CoV-2. Molecular docking revealed that procyanidin forms hydrogen bonds with residues Ser44, Ser47, Asp350, Asp382, Tyr385, Arg393, Asn394, and His401 and hydrophobic interactions with Phe40 and Phe390 in ACE2 (Figure 4). The high number of these interactions reflects the strong affinity of procyanidin for ACE2. The residues Asn394, Gly395, Ser43, Leu351, Met62, Ser47, Asn51, His378, Ala348, Trp69, and Leu391 show van der Waals interactions with the oxygen atom and the benzene ring of procyanidin at the binding site. Further, our docked pockets agree with the binding sites of ACE2 reported in experimental studies (Akif et al., 2010, 2011; Natesh et al., 2003; Towler et al., 2004). The binding site we report also overlaps with several sites reported in the PDB database. The active site of ACE2 with carboxypeptidase includes the residues Tyr510, Cys344, His345, Pro346, Met360, Lys363, Arg273, Asp368, Thr371, Tyr515 and Arg514, which lie in the same site where procyanidin binds. Moreover, here we emphasize only the active sites that show the highest binding energy for procyanidin. Mpro with procyanidin shows eight hydrogen bonds with residues Ser44, Ser47, Asp350, Asp382, Tyr385, Arg393, Asn394, and His401, and hydrophobic interactions with Phe40 and Phe390 at the binding site (Figure 5). In addition, Met49 forms a pi-sulfur bond with the oxygen atom of procyanidin, and there is a pi-alkyl interaction between the benzene ring and Cys145.
Further, the interaction of procyanidin with the spike protein shows hydrogen bonds with Ser375, Thr376, Gly404, Asp405, Arg408, and Ile410, and hydrophobic interactions with Thr376, Val407, and Arg408 (Figure 6). The Lys378 and Asp405 residues at the binding site show pi-cation and pi-anion interactions with the ring structure of procyanidin. The docked positions and interaction energies of these proteins indicate the potential of procyanidin as a potent drug candidate against the novel SARS-CoV-2.

To understand the structural and dynamical features of the proteins with procyanidin, 100 ns molecular dynamics simulations of the apo and procyanidin-bound forms of the ACE2, Mpro, and spike proteins were performed in water. The best-docked poses based on the docking score were used as the initial conformations. The structural stability of the proteins was assessed using the root mean square deviation (RMSD) and root mean square fluctuation (RMSF) of backbone and Cα atoms, together with principal component analysis. The secondary and tertiary structural changes of the proteins due to the interaction with procyanidin were also analyzed from the trajectories. The fluctuations of backbone and Cα atoms in both the complexed and apo structures were evaluated from the 100 ns trajectories and are depicted in Figure 4 of the supporting information. Procyanidin-bound ACE2 shows an average backbone fluctuation of 0.4 nm, whereas the apo structure shows 0.2 nm. Apo Mpro and spike show average backbone RMSDs of 0.3 and 0.19 nm, whereas their procyanidin-bound forms show an average of 0.4 nm, with the highest fluctuation at 0.9 nm for Mpro and 0.6 nm for the spike protein. The higher fluctuation of the protein while interacting with procyanidin indicates lower stability, which is expected to lead to a non-functional state of the protein.
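The two stability measures used here can be written out in a few lines. The plain-Python sketch below computes the RMSD of one frame against a reference and the per-atom RMSF over a set of frames. It assumes the frames are already superimposed on the reference (no least-squares fitting is done) and that coordinates are given in nm as lists of (x, y, z) tuples. The function names are illustrative and are not those of the GROMACS analysis tools.

```python
import math


def rmsd(frame, reference):
    """Root mean square deviation between two pre-aligned coordinate sets."""
    if len(frame) != len(reference):
        raise ValueError("frame and reference must contain the same atoms")
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(reference))


def rmsf(frames):
    """Per-atom root mean square fluctuation about the time-averaged position."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    fluctuations = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        avg = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from that average.
        msd = sum(
            sum((f[i][k] - avg[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        fluctuations.append(math.sqrt(msd))
    return fluctuations
```

RMSD gives one value per frame and tracks overall drift from the starting structure. RMSF gives one value per atom and shows which residues move most, which is why the two are reported together.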
The native structure of the ACE2 receptor must be maintained for strong attachment of the spike protein and further progression of virus replication. To understand the essential dynamics of these proteins, we performed principal component analysis (PCA) of the Cα atoms from the 100 ns simulation trajectories. The PCA of the Cα atoms yields the eigenvectors of the covariance matrix, each weighted by its corresponding eigenvalue, which together describe the total concerted motion of the protein. As a result, the PCA-based RMSF calculation helps to reveal the dynamics of the critical residues that take part in the inhibition of the protein. For all three proteins, PC1 and PC2 show lower fluctuations in the native form than in the procyanidin-bound state (Figure 7). The higher RMSF of residues at the binding site indicates conformational changes that destabilize the protein from its native structure. The binding site is identified as the target location for the spike protein and other virus proteins.

The number of hydrogen bonds formed between procyanidin and the surrounding residues of each protein was calculated and is provided in Figure 8b. An average of 5-12 hydrogen bonds were present in all systems throughout the simulation time, which indicates strong binding of procyanidin to the proteins. The hydrogen bonds are calculated based on the cut-off criteria: (a) donor-acceptor distance ... and last frame shows 54.98 kJ/mol with four hydrogen bonds. Here, both the starting frame and the last frame of the simulation show the presence of hydrogen bonds, which stabilize procyanidin at the binding site. Further, the electrostatic and van der Waals interaction energies at the binding site were calculated from the simulation trajectories (Figure 8c and d). The van der Waals interaction energy dominates over the electrostatic interaction energy at the binding site of all proteins.
ACE2 shows average van der Waals and electrostatic interaction energies of -145.06 kJ/mol and -78.51 kJ/mol, respectively. The binding site of ACE2 consists of residues such as Ser44, Ser47, Asp350, Asp382, Tyr285, Arg293, Asn394, His401, Phe40, Phe390, Ala36, Glu37, Phe40, Tyr41, Ser43, Ser44, Trp48, Gly352, and Phe400, which favor van der Waals interactions. The Mpro and spike proteins show average van der Waals interaction energies of -163.05 kJ/mol and -106.7 kJ/mol and electrostatic interaction energies of -108.9 kJ/mol and -84.50 kJ/mol, respectively. The binding site of Mpro consists of Leu67, Val68, Gln69, Leu57, Tyr54, Pro39, Val20, Gln19, Ser144, Ser147, Leu141, Cys117, His163, and His41. The presence of polar and nonpolar residues at the binding site provides strong affinity towards procyanidin. The spike protein binding site consists of residues such as Ile410, Ala31, Arg408, Val407, Gly404, Thr376, Lys378, Tyr380, Gln414, and Thr376. Altogether, the strong interaction energy and the positioning of procyanidin deep inside the proteins make it a strong inhibitor.

Secondary and tertiary structure of the receptors

... 14.1% turn, 32.7% coil in its native state, and at the end of the simulation with procyanidin it shows a transformation to 26.8% helix, 30.1% sheet, 8.8% turn, and 34.3% coil. The spike protein is rich in β-sheets, with 2.9% helix, 27.3% sheet, 16.7% turn, 50.7% coil, and 2.4% 3-10 helix, and the final structure shows 6.9% helix, 30.7% sheet, 2.1% turn, 58.2% coil, and 2.1% 3-10 helix. The time-dependent transformation of the secondary structure of the proteins is shown in Figure 5 of the supporting information. The transformation of residues in the ranges 40-50, 280-350, and 400-410 indicates the binding site, where these residues interact directly with procyanidin, which is consistent with the principal component analysis. The secondary structure of a protein is a key factor that controls the function of the protein or enzyme.
The transformations of the secondary structure are directly correlated with functional changes of the proteins. Further, the radius of gyration of the proteins was calculated for the apo and procyanidin-bound states and is depicted in Figure 8a. The radius of gyration reflects the compactness of the protein; higher fluctuations in the gyration values were observed for the procyanidin-bound state, in agreement with the time-dependent secondary structure analysis. Further, the tertiary structure of the proteins was assessed by calculating the mean square distance between the backbone atoms (Figures 9, 10, 11 e and f). The backbone RMSD matrices of all the proteins show a significant deviation from the apo state to the procyanidin-bound state. Red indicates shorter distances between backbone atoms, and blue indicates larger distances, up to 0.54 nm. The blue regions that appear on the matrix map when procyanidin interacts with the protein indicate larger pairwise distances between the backbone atoms. Further, we superimposed the tertiary structures of the proteins in both the apo and bound states and observed the transformation from the initial frame (Figures 9, 10, 11 c and d). The first frame to the last frame of apo ACE2 shows an RMSD of 0.3 nm over all atoms, whereas the apo to bound state and the first frame to the last frame of the bound state show RMSDs of 0.5 nm and 0.45 nm, respectively. The first to last frame of the apo form, the apo to procyanidin-bound (last frame) form, and the procyanidin-bound first to last frame show RMSDs of 0.37 nm, 0.47 nm, and 0.39 nm, respectively. The spike protein shows the highest RMSD value among the superposed structures (0.74 nm), between the apo and the procyanidin-bound frame. The native spike protein compared with its last frame shows 0.34 nm, and the procyanidin-bound first frame to the last frame shows 0.4 nm.
The binding affinities of procyanidin with the three proteins were calculated using the MM-PBSA method implemented in the g_mmpbsa tool. The contributions from the electrostatic, van der Waals, polar solvation, and SASA energies indicate the strong affinity of the molecule for the proteins. The binding energy calculated through the MM-PBSA method also agrees with the calculated non-bonded interaction energy. All the protein systems showed a higher contribution from the van der Waals energy, which is favorable to the binding energy. The energy components that contribute to the total binding energy show favorable interactions through the van der Waals, electrostatic, and SASA energies, whereas the polar solvation energy is repulsive, or unfavorable, towards the binding energy. The favorable interaction energies arise from the hydrophobic and polar residues at the binding site. The total binding energies for the ACE2, Mpro, and spike proteins are found to be strong and in agreement with the other analyses. The contribution from residues at the binding site of each protein is depicted in Figure 12a, b and c. Residues such as Leu73, Leu120, Asn121, and Ser124 of ACE2 show a higher contribution (approximately 7 kJ/mol) towards the binding energy. However, we observed many residues contributing between 1 and 5 kJ/mol (e.g., Cys44, Met49, and Asp187), which stabilize procyanidin in the binding pocket of ACE2, Mpro, and the spike protein. The residues Val503 and Tyr505 were identified for the Mpro protein and Val503 and Tyr505 for the spike protein (Table 4).

In all the cases high van der Waals interaction energies are observed. E pol represents the polar solvation energy, E apo represents the apolar solvation energy, and E ele and E vdw represent the electrostatic and van der Waals interaction energies. Delta-E represents the total binding energy.

In this study, we constructed the network of critical proteins in the COVID-19 pathways through the named entity recognition method.
The selected proteins were used to target againstprocyanidin, and molecular docking revealed the high affinity towards the selected major receptors. Molecular dynamics simulation revealed changes in the dynamics of receptors like ACE2, Mpro and spike in the presence of procyanidin. The structural changes induced by the procyanidin are sufficient to inhibit the proteins to prevent the progression of the disease. The strong van der Waals interaction at the binding site are the major reason for the high affinity of the procyanidin in all the receptors. Also, strong hydrogen bonds and hydrophobic interactions stabilize the procyanidin in the binding cavity. Further, site specific targeted study of these proteins will help to understand the activity of the proteins. Our findings help to design an in vitro and in vivo studies of procyanidin as a potent therapeutic agent in near future. No potential conflict of interest was reported by the author(s). Nikhil Maroli (NM) designed work performed molecular modelling and simulation and prepared the manuscript. BaluBhasuran (BB) performed text mining and PonmalaiKolandaivel (PK) and Jeyakumar Natarajan (JN) supervised the work. All the authors reviewed the manuscript. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers Inhibition of angiotensin converting enzyme (ACE) activity by flavan-3-ols and procyanidins High-resolution crystal structures of Drosophila melanogaster angiotensin-converting enzyme in complex with novel inhibitors and antihypertensive drugs Novel mechanism of inhibition of human angiotensin-I-converting enzyme (ACE) by a highly specific phosphinic tripeptide Structure of coronavirus main proteinase reveals combination of a chymotrypsin fold with an extra alpha-helical domain Procyanidin C1 Inhibits Melanoma Cell Growth by Activating 67-kDa Laminin Receptor Signaling. 