key: cord-0722764-0pi6i2es
authors: Saha, Sovan; Chatterjee, Piyali; Nasipuri, Mita; Basu, Subhadip
title: Detection of spreader nodes in human-SARS-CoV protein-protein interaction network
date: 2021-09-06
journal: PeerJ
DOI: 10.7717/peerj.12117
sha: c796d2649981609bb231c12d85e5864ba0225380
doc_id: 722764
cord_uid: 0pi6i2es

The entire world is witnessing the coronavirus pandemic (COVID-19), caused by a novel coronavirus (n-CoV) generally distinguished as Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). SARS-CoV-2 promotes fatal chronic respiratory disease followed by multiple organ failure, ultimately putting an end to human life. The International Committee on Taxonomy of Viruses (ICTV) has reached a consensus that SARS-CoV-2 is highly genetically similar (up to 89%) to the Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), which had an outbreak in 2003. With this hypothesis, the current work focuses on identifying the spreader nodes in the SARS-CoV-human protein–protein interaction network (PPIN) to find a possible lineage with the disease propagation pattern of the current pandemic. Various PPIN characteristics like edge ratio, neighborhood density, and node weight have been explored for defining a new feature, the spreadability index, by which spreader proteins and protein–protein interactions (in the form of network edges) are identified. Top spreader nodes with a high spreadability index have been validated by the Susceptible-Infected-Susceptible (SIS) disease model, first using a synthetic PPIN and then a SARS-CoV-human PPIN. The ranked edges highlight the path of entire disease propagation from SARS-CoV to human PPIN (up to level-2 neighborhood). The developed network attribute, the spreadability index, and the generated SIS model perform better than the other network centrality-based methodologies that form the existing state of the art.

The COVID-19 pandemic registered its first case on 31 December 2019 (World Health Organization, 2020b). It first took hold in the Chinese city of Wuhan (Hubei province). Soon, community spread carried it to several countries worldwide (Centers for Disease Control and Prevention (CDC), 2021), which ultimately compelled the World Health Organization (World Health Organization (WHO), 2019) to declare a global health emergency on 30 January 2020 (World Health Organization (WHO), 2005b) for the massive outbreak of COVID-19. Owing to its expected fatality rate of about 4%, as projected by WHO (World Health Organization (WHO), 2005a), researchers from nations all over the world have joined hands to understand the spreading mechanisms of the virus SARS-CoV-2 (Heymann, 2020; Huang et al., 2020; Zhou et al., 2020) and to find all possible ways to save human lives from it.

Coronavirus belongs to the family Coronaviridae. This single-stranded RNA virus affects not only humans but also other mammals and birds. In humans, coronavirus infection produces common fever/flu symptoms, followed by acute respiratory infections. Nevertheless, coronaviruses like Middle East Respiratory Syndrome (MERS) and Severe Acute Respiratory Syndrome (SARS) can create a global pandemic due to their infectious nature. Both of these coronaviruses are members of the genus Betacoronavirus under Coronaviridae. SARS started a significant outbreak in 2003, originating from Southern China. Seven hundred seventy-four deaths were reported among 8098 globally registered cases, with an estimated fatality rate of 14%-15% (World Health Organization (WHO), 2003). MERS, in contrast, commenced in Saudi Arabia, creating an endemic in 2012. The world witnessed 858 deaths among 2494 registered positive cases, a fatality rate of 34.4%, which is high in comparison to SARS.

SARS-CoV-2 is under the same Betacoronavirus genus as the MERS and SARS coronaviruses (Lu et al., 2020). It comprises several structural and non-structural proteins. The structural proteins include the envelope (E) protein, membrane (M) protein, nucleocapsid (N) protein, and the spike (S) protein. Because SARS-CoV-2 has been identified only recently, there is an intense scarcity of the data and information needed to gain immunity against it. Studies have revealed that SARS-CoV-2 is highly genetically similar to SARS-CoV based on several experimental genomic analyses (Hoffmann et al., 2020; Letko, Marzi & Munster, 2020; Lu et al., 2020; Zhou et al., 2020). This is also the reason behind the naming of SARS-CoV-2 by the International Committee on Taxonomy of Viruses (ICTV) (World Health Organization (WHO), 2020a). Due to this genetic similarity, the immunological study of SARS-CoV may support the development of potential drugs against SARS-CoV-2.

A protein-protein interaction network (PPIN) has been used as the central component in identifying spreader nodes in SARS-CoV in the proposed methodology. PPIN is a very effective module for protein function determination (Cai, Wang & Deng, 2020; Hakala et al., 2020; Saha et al., 2019a; Saha et al., 2018; Saha et al., 2019b) as well as in the identification of central/essential spreader nodes in the PPIN (Anthonisse, 1971; He et al., 2021; Jeong et al., 2001; Joy et al., 2005; Li et al., 2011; Liu, Ma & Chen, 2019; Wen et al., 2020; Wuchty & Stadler, 2003; Zhong et al., 2021). The compactness of the PPIN and its transmission capability are estimated using centrality analysis. Anthonisse (1971) proposed a centrality measure named betweenness centrality (BC). Another centrality measure, called closeness centrality (CC), was defined by Sabidussi (1966). Two other essential centrality measures, degree centrality (DC) (Jeong et al., 2001) and local average centrality (LAC) (Li et al., 2011), are also found to be very effective in this area of research.

Due to the high morbidity and mortality of SARS-CoV2, it has been felt that there is a pressing need to properly understand the way of viral infection transmission from SARS-CoV-2 PPIN to human PPIN. This paper considers SARS-CoV PPIN for this research study due to its high genetic similarity with SARS-CoV-2. Another primary motivation is to study the spreadability pattern of the ancestral strain of nCoV. In the proposed methodology, at first, SARS-CoV-Human PPIN (up to level-2) is formed from the collected datasets (Agrawal, Zitnik & Leskovec, 2017; Pfefferle et al., 2011) . Once created, the spreader nodes are first identified in the SARS-CoV PPIN. Then its level-1 and level-2 interactors in the human PPIN are extracted using a new network attribute, i.e., spreadability index, which is a combination of three different network features: (1) edge ratio (Samadi & Bouyer, 2019) (2) neighborhood density (Samadi & Bouyer, 2019) and (3) node weight (Wang & Wu, 2013) . The detected spreader nodes in the human PPIN are validated by the Susceptible, Infected, and Susceptible (SIS) epidemic disease model (Bailey, 1975) . Then the edges connecting two spreader nodes are ranked based on the average spreadability index. Thus, the ranked edges highlight the path through which viral infection gets mediated from SARS-CoV to human PPIN (up to level-2). The entire methodology can be categorized into 3-steps for (1) identifying the spreader nodes in the SARS-CoV and human PPIN using spreadability index, (2) validation of spreader nodes by SIS model, and (3) ranking of the spreader edges.

Developing the spreadability index for ranking edges in a host-pathogen PPIN, in order to analyse the propagation path of viral infection in the host, is the primary contribution of this work. Furthermore, considering the current investigation on SARS-CoV and its notable similarity with its successor virus, we also attempt to shed light on the propagation pattern of viral infection of SARS-CoV2 in the human PPIN.

In the following, we first describe the theory and methods for different network properties used to extract the PPIN characteristics. Then we describe the 3-step methodology. First, the methodology is described using a synthetic PPIN (generated by Cytoscape; Shannon et al., 2003). Then, in the experimental results section, we employ the developed method on the human-SARS-CoV PPIN to identify the SARS-CoV viral infection propagation path in the human PPIN. Finally, in the discussion section, we attempt to relate our findings on the ancestral virus, i.e., SARS-CoV, to its successor, i.e., SARS-CoV2, to study whether SARS-CoV2 disease propagation may follow the pattern of SARS-CoV.

The viral infection gets mediated from one part of the PPIN to another through spreader nodes and edges (Brito & Pinney, 2017) . Generally, in disease-specific PPIN models, at least two entities are involved: pathogen/Bait and host/Prey (Saha et al., 2017) . In this research work, SARS-CoV takes the role of the former while human the latter one. Viral proteins of SARS-CoV tend to target their corresponding interaction with human proteins, which target its next level of proteins. So, the establishment of interactions between SARS-CoV and human occurs through connected nodes and edges of PPIN. But mostly, these viral proteins try to interact more with the central/hub proteins rather than the other proteins (Brito & Pinney, 2017) . Thus, proper identification of central nodes (i.e., spreader nodes) is required. It is also confirmed that the interaction is not possible without the edges connecting two spreader nodes. Thus, these connecting edges are called spreader edges. The proposed methodology involves a proper study and assessment of various existing established PPIN features followed by identifying spreader nodes, which the SIS model has also verified. Before going into the detailed study about the proposed work, various network-based terminologies which are used in this work are discussed below:

When one protein interacts with another protein, it forms a network-like structure known as PPIN. Generally, it is portrayed as a graph where proteins are represented as nodes, and their corresponding connecting edges represent their interactions. Mathematically, a PPIN can be expressed as a graph G_nv, which consists of a set of vertices v (nodes) connected by edges e (links). Thus, G_nv = (v, e) (Saha et al., 2014; Saha et al., 2019a).

In a PPIN, level-1 proteins of a node are those proteins that are in direct connection with that node, i.e., its immediate neighbors, whereas level-2 proteins are those proteins that are indirectly connected with level-1 proteins of that node, i.e., its indirect neighbors (Saha et al., 2014; Saha et al., 2019a) .
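The level-1 and level-2 definitions above can be illustrated with a minimal sketch. The adjacency dictionary `toy_ppin` and the function name `level_neighbors` are hypothetical, and the choice to exclude the node itself and its level-1 proteins from the level-2 set is an assumption made here for clarity.

```python
def level_neighbors(adj, node):
    """Return the (level-1, level-2) neighbor sets of `node` in an undirected graph."""
    level1 = set(adj[node])
    level2 = set()
    for partner in level1:
        level2.update(adj[partner])
    # Assumption: level-2 proteins exclude the node itself and its direct partners.
    level2 -= level1 | {node}
    return level1, level2

# Hypothetical toy PPIN as an adjacency dict (not data from the paper).
toy_ppin = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
}

l1, l2 = level_neighbors(toy_ppin, "A")
print(sorted(l1), sorted(l2))
```

Here protein D is reached from A only through B or C, so it is a level-2 protein of A, while E lies one level further out.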

Graph centrality is one of the essential aspects for the identification of significant nodes in a PPIN. The centrality of a node defines how relevant the node is in a PPIN or how much a node is centrally located in a PPIN.

BC (Anthonisse, 1971) measures a node's impact on the transmission of information between every pair of nodes in a graph, considering that this transmission is always executed over the shortest path between them. Mathematically, it is defined as:

$$BC(u) = \sum_{s \neq u \neq t} \frac{\rho(s,u,t)}{\rho(s,t)}$$

where ρ(s,t) is the total number of shortest paths from node s to node t, and ρ(s,u,t) is the number of those paths that pass through u.

CC (Sabidussi, 1966) is a procedure for detecting nodes that transmit information within a network efficiently. Nodes with high closeness centrality values are considered to have the shortest distance to all available nodes in the network. It can be mathematically expressed as:

$$CC(u) = \frac{|N_u| - 1}{\sum_{v \in N_u} dist(u,v)}$$

where |N_u| denotes the number of neighbors of node u and dist(u,v) is the distance of the shortest path from node u to node v.

DC (Jeong et al., 2001) is considered the simplest among the available centrality measures, as it only counts the degree of a node, i.e., the number of directly connected neighbors. Nodes having a high degree are said to be the highly connected module of the network. It is defined as:

$$DC(u) = |N_u|$$

where |N_u| denotes the number of neighbors of node u.

LAC (Li et al., 2011) of a node represents how close its neighborhood proteins are. It is a local metric that computes the essentiality of the node for transmission ability by considering its modular nature. Mathematically, it is expressed as:

$$LAC(u) = \frac{\sum_{w \in N_u} deg^{C_u}(w)}{|N_u|}$$

where C_u is the subgraph induced by N_u (|N_u| being the number of neighbors of node u) and deg^{C_u}(w) is the total number of nodes that are directly connected to w in C_u.
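The four centrality measures can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: BC is summed over unordered node pairs, CC uses the common (n − 1)/Σ dist form over reachable nodes, and the graph `toy_ppin` and all function names are hypothetical.

```python
from collections import deque
from itertools import combinations

def bfs(adj, src):
    """Shortest-path distances and shortest-path counts from src (unweighted graph)."""
    dist, sigma = {src: 0}, {src: 1}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:            # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:   # u lies on a shortest path to w
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness(adj, u):
    """Sum over unordered pairs (s, t) of rho(s,u,t) / rho(s,t)."""
    info = {n: bfs(adj, n) for n in adj}
    dist_u, sigma_u = info[u]
    total = 0.0
    for s, t in combinations([n for n in adj if n != u], 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s or u not in dist_s or t not in dist_u:
            continue
        if dist_s[u] + dist_u[t] == dist_s[t]:
            total += sigma_s[u] * sigma_u[t] / sigma_s[t]
    return total

def closeness(adj, u):
    """(number of reachable nodes) / (sum of shortest-path distances from u)."""
    dist, _ = bfs(adj, u)
    total = sum(d for n, d in dist.items() if n != u)
    return (len(dist) - 1) / total if total else 0.0

def degree(adj, u):
    """Number of directly connected neighbors of u."""
    return len(adj[u])

def lac(adj, u):
    """Average degree of u's neighbors inside the subgraph induced by those neighbors."""
    nbrs = adj[u]
    if not nbrs:
        return 0.0
    return sum(len(adj[w] & nbrs) for w in nbrs) / len(nbrs)

# Hypothetical toy PPIN (not data from the paper).
toy_ppin = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

for node in sorted(toy_ppin):
    print(node, betweenness(toy_ppin, node), round(closeness(toy_ppin, node), 3),
          degree(toy_ppin, node), round(lac(toy_ppin, node), 3))
```

In this toy graph, node C bridges the triangle A-B-C and the tail D-E, so it obtains the highest BC, CC, and DC, while its LAC is lower than that of A or B because its neighbor D has no links to C's other neighbors.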

The ego network of node i, S_i (Samadi & Bouyer, 2019), is defined as the grouping of node i itself along with its corresponding level-1 neighbors and interconnections. N(S_i) (Samadi & Bouyer, 2019) consists of the set of nodes which belong to the ego network S_i, i.e., {i} ∪ Γ(i).

The edge ratio of node i (Samadi & Bouyer, 2019) is defined by the following equation:

where E^{out}_{S_i} is the total number of interactions between the ego network S_i and the proteins outside it, and E^{in}_{S_i} is the total number of interactions among node i's neighbors. Γ(i) denotes the level-1 neighbors of node i, S_i is its ego network, and Γ_{S_i}(j) denotes those neighbors of node j which belong to S_i. In the edge ratio, E^{out}_{S_i} is positively related to the non-peripheral location of node i. A large number of interactions resulting from the ego network denotes that the node has a high level of interconnectivity between its neighbors. On the other hand, E^{in}_{S_i} is negatively related to the inter-module location of node i. It represents the fact that the interconnectivity between neighbors is usually connected to the number of structural holes available around the node. Thus, when the neighbors' interconnectivity is low, the root or central node i gains more control of the transmission flow among the neighbors.
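The displayed equation for the edge ratio is not reproduced in this text. The following is a minimal sketch that assumes the ratio takes the form E_out / (E_in + 1); the "+ 1" in the denominator is an assumption made here to avoid division by zero, and the graph `toy_ppin` and function name `edge_ratio` are hypothetical.

```python
def edge_ratio(adj, i):
    """Assumed form: E_out / (E_in + 1) for the ego network of node i."""
    ego = {i} | adj[i]
    # E_out: edges with one endpoint inside the ego network and one outside it.
    e_out = sum(1 for j in ego for k in adj[j] if k not in ego)
    # E_in: edges among the level-1 neighbors of i; each edge is seen twice, hence // 2.
    e_in = sum(1 for j in adj[i] for k in adj[j] if k in adj[i]) // 2
    return e_out / (e_in + 1)

# Hypothetical toy PPIN (not data from the paper).
toy_ppin = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

print(edge_ratio(toy_ppin, "C"), edge_ratio(toy_ppin, "D"))
```

Node D scores higher than node C here because its ego network has two outgoing interactions and no interaction among its neighbors, which matches the structural-hole reading given above.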

The similarity between two nodes is determined by Jaccard dissimilarity (Jaccard, 1912) based on their common neighbors. The Jaccard dissimilarity of nodes i and j, dissimilarity(i,j), is defined as:

$$dissimilarity(i,j) = 1 - \frac{|\Gamma(i) \cap \Gamma(j)|}{|\Gamma(i) \cup \Gamma(j)|}$$

where |Γ(i) ∩ Γ(j)| refers to the number of common neighbors of i and j, and |Γ(i) ∪ Γ(j)| is the total number of neighbors of i and j. The degree of similarity between i and j is higher when they have more common neighbors. In contrast, when the dissimilarity between the neighbors of a node is high, the only common node among the neighbors is the central node, which is termed a structural hole situation (Samadi & Bouyer, 2019).
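As a small worked example, the Jaccard dissimilarity can be computed directly from neighbor sets. The graph `toy_ppin` and the function name are hypothetical, and returning 1.0 when both nodes have no neighbors is an assumption made here.

```python
def jaccard_dissimilarity(adj, i, j):
    """1 - |common neighbors| / |all neighbors| of nodes i and j."""
    union = adj[i] | adj[j]
    if not union:
        return 1.0  # assumption: two isolated nodes are treated as fully dissimilar
    return 1.0 - len(adj[i] & adj[j]) / len(union)

# Hypothetical toy PPIN (not data from the paper).
toy_ppin = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

print(round(jaccard_dissimilarity(toy_ppin, "A", "B"), 3))
print(jaccard_dissimilarity(toy_ppin, "A", "E"))
```

Nodes A and B share one of three neighbors in total, giving a dissimilarity of 2/3, whereas A and E share none and are maximally dissimilar.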

The neighborhood diversity (Samadi & Bouyer, 2019 ) is a significant parameter of a graph that is based on Jaccard dissimilarity. When the dissimilarity of the neighbors of a node is high, it assures that the central node is the only neighbor common among the neighbors of that node, i.e., it represents the structural hole situation. On the other hand, when a node's neighborhood diversity reaches its greatest value, it reveals that the neighbors have no other closer path. Hence, the neighbors should transmit or communicate through this node. Mathematically, it is defined as:

Node weight (Wang & Wu, 2013) is a graph parameter used to assign a weight to a node in a graph. The node weight w_v of node v ∈ V in a PPIN is interpreted as the average degree of all nodes in a sub-graph of the graph G_V, i.e., the sum of the degrees of the nodes in that sub-graph divided by the number of nodes in it. It is considered another measure of the strength of connectivity of a node in a network.

Synthetic PPINs are the randomly generated sample PPINs (nodes with edges) used for the detailed analysis and testing of the proposed methodology (for example, please see Fig. 1). The algorithm for generating them is discussed in the supplementary document. Biological PPINs are the complete PPINs generated from the above datasets, on which the proposed methodology is executed after testing (for example, please see the complete PPIN view of SARS-CoV and human PPIN added at the end of the Experimental Results and Discussion section).

The proposed work can be mainly categorized into three sub-sections: (1) Identification of spreader nodes by spreadability index, (2) Validation of spreader nodes by SIS model, and (3) Ranking of spreader edges.

The spreadability index of node i is defined as the ability of node i to mediate a viral infection in a PPIN. Mathematically it can be defined as:

Nodes having a high spreadability index are termed spreader nodes, i.e., if the viral proteins establish interactions with these nodes, then the viral infection can be mediated to a larger number of nodes in a much shorter amount of time compared to the other nodes in the PPIN. Figure 1 represents a sample PPIN where each protein is denoted as a node while edges mark its interactions with other proteins. The PPIN consists of 33 nodes and 53 edges. The PPIN data, i.e., the protein names and interactions, are given as input to Cytoscape, which generates the network view highlighted in Fig. 1. Cytoscape is open-source software that is used for PPIN generation and visualization (Shannon et al., 2003). The spreadability index is computed on the synthetic PPIN shown in Fig. 1 using the essential PPIN characteristics stated earlier. It is compared to DC, BC, CC, and LAC in Tables 1 to 5.
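The displayed equation for the spreadability index is not reproduced in this text. The sketch below assumes that the three features are combined as (edge ratio × neighborhood density) + node weight; this combination, the feature values, and the protein names P1 to P3 are all hypothetical and serve only to show how nodes would be ranked once an index is available.

```python
def spreadability_index(edge_ratio, neighborhood_density, node_weight):
    """Assumed combination of the three network features described in the text."""
    return edge_ratio * neighborhood_density + node_weight

# Hypothetical feature values: (edge ratio, neighborhood density, node weight).
features = {
    "P1": (2.0, 3.0, 1.5),
    "P2": (0.5, 2.0, 2.0),
    "P3": (1.0, 1.0, 0.5),
}

scores = {name: spreadability_index(*vals) for name, vals in features.items()}
# Spreader candidates are the nodes with the highest index.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

Whatever the exact functional form, the ranking step is the same: compute one score per node and keep the top-scoring nodes as spreaders.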

In Fig. 1, it can be observed that nodes 1 and 24 are the essential spreaders. Node 1 connects the four densely connected modules of the PPIN, which gives this node the highest spreadability index. This node has been correctly ranked by all the methods except LAC and DC. Node 24 has only a moderate edge ratio and node weight, but it forms one of the most densely connected modules itself despite being isolated from the main PPIN module of node 1. Moreover, node 24 has the highest neighborhood density. This establishes that the only path of transmission of information for nodes 25, 26, 27, 28, 29, 30, 31, 32, and 33 is node 24. Thus, if viral proteins of SARS-CoV establish an interaction with node 24, then all the connected nodes will indirectly come under the interaction of viral proteins, as the connected nodes have no interactions with other central nodes except node 24. So, node 24 holds the second position for the spreadability index in our proposed methodology. Node 24 is not correctly identified as the second most influential spreader node by the other methods. Further assessment of the remaining nodes shows that the spreadability index performs relatively better than the other methods.

To design the mathematical model for this infectious disease, the SIS epidemic model (Bailey, 1975) is used in the proposed methodology by classifying the proteins in the SARS-CoV-human PPIN based on their interactivity status (for more details, please see the ''Studied Models in epidemiology'' section of the supplementary document). SIS refers to the Susceptible, Infected, and Susceptible states, which are generally considered the three probable protein states in a PPIN. (1) S: the susceptible state is the state of those human proteins with which viral proteins have not yet interacted but which are at risk of such interaction. In general, every protein in the PPIN is initially in a susceptible state. (2) I: the infected state is the state of those human proteins with which viral proteins have interacted. (3) S: the susceptible state is again the state of those human proteins that have lost their interaction with the viral proteins (due to antiviral therapies or a change in interface residues (Brito & Pinney, 2017)) and have become susceptible once more. The SIS model takes into account the interaction rate of the viral proteins with human proteins and the loss rate of interactivity of the human proteins with the viral proteins (the general assumption is that any protein, after coming out of the infected state, returns to a susceptible state in one day). The spreadability rates obtained by the SIS model for the nodes selected by the spreadability index (see Table 1) are compared with those of the other methods for their corresponding top 10 spreader nodes in the synthetic PPIN shown in Fig. 1.

Table 1: Computation of the spreadability index of the synthetic PPIN in Fig. 1 and computation of the spreadability rate of the selected top 10 spreader nodes by the SIS model.
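The S → I → S cycle can be illustrated with a minimal discrete-time stochastic simulation on a graph. This is a sketch under stated assumptions rather than the model used in the paper: the per-step infection and recovery probabilities, the function name `simulate_sis`, and the graph `toy_ppin` are hypothetical, and the infected count per step is reported instead of the paper's spreadability rate, whose exact definition is given in the supplementary document.

```python
import random

def simulate_sis(adj, seeds, infection_rate, recovery_rate, steps, rng_seed=0):
    """Return the number of infected nodes at step 0, 1, ..., steps."""
    rng = random.Random(rng_seed)
    infected = set(seeds)
    history = [len(infected)]
    for _ in range(steps):
        newly_infected = set()
        # Each infected node tries to infect each susceptible neighbor.
        for node in sorted(infected):
            for nb in sorted(adj[node]):
                if nb not in infected and rng.random() < infection_rate:
                    newly_infected.add(nb)
        # Infected nodes lose the interaction and become susceptible again.
        recovered = {n for n in sorted(infected) if rng.random() < recovery_rate}
        infected = (infected - recovered) | newly_infected
        history.append(len(infected))
    return history

# Hypothetical toy PPIN (not data from the paper).
toy_ppin = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

# Seed the infection at a central node and at a peripheral node for comparison.
print(simulate_sis(toy_ppin, ["C"], infection_rate=0.5, recovery_rate=0.3, steps=10))
print(simulate_sis(toy_ppin, ["E"], infection_rate=0.5, recovery_rate=0.3, steps=10))
```

Setting `recovery_rate=1.0` corresponds to the one-day return to the susceptible state mentioned above, if one simulation step is taken to represent one day.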

To show the ranking of interacting spreader edges, two synthetic PPINs, PPIN-1 and PPIN-2, have been considered in Fig. 2. Nodes D, E, and F are the top spreader nodes selected in PPIN-1 by the spreadability index, in the same way as explained for the synthetic PPIN in Fig. 1.

To avoid complexity in the diagram, the top 5 nodes in PPIN-2 (see Table 1) are selected as spreader nodes. Red-colored edges represent the interconnectivity within PPIN-1. Spreader edges are ranked based on the average spreadability index of their connected spreader nodes. The ranked spreader edges in Fig. 2 are highlighted in Table 6.

Table 5: Computation of DC of the synthetic PPIN in Fig. 1 and computation of the spreadability rate of the selected top 10 spreader nodes by the SIS model.
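Ranking edges by the average spreadability index of their two endpoints can be sketched as follows. The index values, the edge list, and the function name `rank_spreader_edges` are hypothetical; only edges whose two endpoints are both selected spreader nodes are kept.

```python
def rank_spreader_edges(edges, index):
    """Rank edges joining two spreader nodes by the mean index of their endpoints."""
    scored = [((u, v), (index[u] + index[v]) / 2)
              for u, v in edges
              if u in index and v in index]  # both endpoints must be spreaders
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical spreadability index values of selected spreader nodes.
spreader_index = {"D": 4.0, "E": 3.0, "F": 2.0}
# Hypothetical edges; X is not a spreader, so the edge D-X is dropped.
edges = [("E", "F"), ("D", "E"), ("D", "F"), ("D", "X")]

for rank, (edge, score) in enumerate(rank_spreader_edges(edges, spreader_index), start=1):
    print(rank, edge, score)
```

The highest-ranked edge joins the two strongest spreaders, which is the edge along which the infection is expected to be mediated first under this scheme.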

The proposed methodology leads to the identification of spreader nodes and edges through a network characteristic called the spreadability index, which has also been checked and validated by the SIS model. The procedure is illustrated in Fig. 3. In Fig. 3A, the SARS-CoV PPIN is displayed first, with each protein marked in red. After that, spreader nodes in the SARS-CoV PPIN are identified by the spreadability index. They are denoted as blue nodes among the red. Once the spreader nodes are active (Fig. 3B), the viral infection gets mediated through their corresponding direct partners, i.e., human level-1 proteins (marked in deep green). Then, in Fig. 3C, spreader nodes are identified among the SARS-CoV level-1 human proteins (marked in yellow). The same continues to the SARS-CoV level-2 human proteins (light green nodes are the spreaders).

In Fig. 4, the SARS-CoV PPIN is highlighted. There are mainly nine proteins: E, M, ORF3A, ORF7A, S, N, ORF8A, ORF8AB, and ORF8B. The computed spreadability index of these proteins and the corresponding validation by the SIS model are highlighted in Table 7. It is also compared with other central/influential spreader node detection methodologies like DC, CC, LAC, and BC, shown in Tables 8-11. Similarly, spreader nodes are also identified in SARS-CoV's level-1 neighbors and level-2 neighbors (see Figs. 5 and 6).

The spreadability index plays a vital role in the proposed methodology. Spreader nodes are successfully identified by this scoring technique, which covers all the aspects through which viral infection gets mediated from one node to another in a PPIN (Brito & Pinney, 2017). It should be mentioned here that, while identifying spreader nodes in SARS-CoV level-2 human proteins, the number of nodes was found to increase significantly with each successive level. So, high, medium, and low thresholds have been applied (see Table 12). It can be observed that thresholds are applied only to SARS-CoV level-2 human proteins, not to the others. This is because a smaller number of nodes and edges is available at the other levels. Therefore, only nodes and edges having a very low spreadability index have been discarded at the first level.

Besides the identification of spreader nodes, spreader edges are also identified. The ranked edges between SARS-CoV spreaders and their level-1 human spreaders are highlighted in Table 13. The ranked edges between SARS-CoV's level-1 and level-2 human spreaders at high, medium, and low thresholds are highlighted in Tables S1-S3, respectively, of the supplementary document.

Figure 6 (level-1 and level-2): The PPIN consists of the interaction between SARS-CoV and human proteins. The blue node represents SARS-CoV spreaders, while the yellow and green nodes represent SARS-CoV's level-1 and level-2 human spreaders. The thickness of the edges varies with the order of ranking. Full-size DOI: 10.7717/peerj.12117/fig-6.

Low-threshold view: https://yu2qkp7gwoinjwsebyw0xw-on.drv.tw/www.low_threshold.com/graph_low_threshold.html.

In the above-generated PPIN views, the blue, yellow, and green colors represent SARS-CoV spreaders, their level-1 human spreaders, and their level-2 human spreaders, respectively. The remaining nodes are in indigo.

The spreadability index, cross-validated by the SIS model, thus proves effective in detecting spreader nodes and edges in the SARS-CoV-human PPIN. Spreader nodes are the central nodes in the PPIN through which viral infection gets mediated to their successors. At the same time, this mediation would not be possible if the spreader nodes were not connected by spreader edges. In a nutshell, the proposed work explores how viral infection gets mediated from the SARS-CoV PPIN to the human PPIN. It should be borne in mind that SARS-CoV2 is ∼89% genetically similar to its predecessor SARS-CoV (Chan et al., 2020; CIDRAP, 2020). Therefore, it strongly suggests that the human proteins chosen as spreaders of SARS-CoV might be potential targets of SARS-CoV2.

Large-scale analysis of disease pathways in the human interactome

The rush in a directed graph

The mathematical theory of infectious diseases and its applications

BioSNAP: network datasets: human protein-protein interaction network

Protein-protein interactions in virus-host systems

SDN2GO: an integrated deep learning model for protein function prediction

Centers for Disease Control and Prevention (CDC). 2021. Available at

Genomic characterization of the 2019 novel human-pathogenic coronavirus isolated from a patient with atypical pneumonia after visiting Wuhan

China releases genetic data on new coronavirus, now deadly

Neural network and random forest models in protein function prediction

Method for identifying essential proteins by key features of proteins in a novel protein-domain network

Data sharing and outbreaks: best practice exemplified

The novel coronavirus 2019 (2019-nCoV) uses the SARS-coronavirus receptor ACE2 and the cellular protease TMPRSS2 for entry into target cells

Clinical features of patients infected with 2019 novel coronavirus in Wuhan

The distribution of the flora in the Alpine zone

Lethality and centrality in protein networks

High-betweenness proteins in the yeast protein interaction network

Functional assessment of cell entry and receptor usage for SARS-CoV-2 and other lineage B betacoronaviruses

A local average connectivity-based method for identifying essential proteins from the network level

Identification of essential proteins by using complexes and biological information on dynamic PPI Network

Identifying influential spreaders based on edge ratio and neighborhood diversity measures in complex networks

Cytoscape: a software environment for integrated models of biomolecular interaction networks

A novel coronavirus outbreak of global health concern

Detecting overlapping protein complexes in PPI networks based on robustness

Essential proteins identification based on integrated network

Emergency Committee regarding the outbreak of novel coronavirus

detail/30-01-2020-statement-on-the-second-meeting-of-the-internationalhealth-regulations

Middle East respiratory syndrome coronavirus (MERS-CoV)

World Health Organization (WHO). 2020a. Naming the coronavirus disease (COVID-19) and the virus that causes it

World Health Organization. 2020b. World-Health-Organization Coronavirus disease (COVID-19) outbreak

A novel essential protein identification method based on PPI networks and gene expression data

A pneumonia outbreak associated with a new coronavirus of probable bat origin

The authors received support (infrastructure facilities) from the ''Center for Microprocessor Applications for Training Education and Research'' research laboratory of the Computer Science and Engineering Department, Jadavpur University, India. In addition, this project is also supported by the Department of Biotechnology project (No. BT/PR16356/BID/7/596/2016), Ministry of Science and Technology, Government of India. There was no additional external funding received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Figure 5: The PPIN consists of the interaction between SARS-CoV and human proteins. The blue node represents SARS-CoV spreaders, while the yellow node represents SARS-CoV's level-1 human spreaders. The thickness of the edges varies with the order of ranking. Full-size DOI: 10.7717/peerj.12117/fig-5.

The complete PPIN view of SARS-CoV and human PPIN has been generated online (by using the pyvis module available in Python) under four circumstances:
(1) All the nodes and edges are considered spreader nodes and edges, respectively, and ranked accordingly. https://yu2qkp7gwoinjwsebyw0xw-on.drv.tw/www.graph_all.html/graph_all.html.
(2) Selected spreader nodes and edges are highlighted for the high threshold. https://yu2qkp7gwoinjwsebyw0xw-on.drv.tw/www.high_threshold.com/graph_high_threshold.html.
(3) Selected spreader nodes and edges are highlighted for the medium threshold. https://yu2qkp7gwoinjwsebyw0xw-on.drv.tw/www.medium_threshold.com/graph_medium_threshold.html.
(4) Selected spreader nodes and edges are highlighted for the low threshold.

The same concept of the spreadability index is applied along with a unique fuzzy protein-protein interaction model to form the SARS-CoV2-human PPIN in our other research work (Saha et al., 2020a). The formed PPIN is also compared (Saha et al., 2020b) with the SARS-CoV2-human PPIN generated in the work of Gordon et al. (Gordon et al., 2020). Study and analysis of drug repurposing for COVID-19 are also carried out in the subsequent research work (Saha et al., 2020b). Thus, it explores a new direction in identifying essential drugs/vaccines for SARS-CoV2. Currently, the work is limited to SARS-CoV/SARS-CoV2 only; it can be extended to other viral infectious diseases in our future work.

The following grant information was disclosed by the authors: Center for Microprocessor Applications for Training Education and Research. Department of Biotechnology Project: BT/PR16356/BID/7/596/2016. Ministry of Science and Technology, Government of India.

The authors declare there are no competing interests.

• Sovan Saha and Subhadip Basu conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft.
• Piyali Chatterjee and Mita Nasipuri conceived and designed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft.

The following information was supplied regarding data availability: The data and code are available at GitHub: https://github.com/SovanSaha/Detectionof-spreader-nodes-in-Human-SARS-CoV-protein-protein-interaction-network.

Supplemental information for this article can be found online at http://dx.doi.org/10.7717/peerj.12117#supplemental-information.