key: cord-0005107-waubg27l authors: Xia, Lingling; Jiang, Guoping; Song, Yurong; Song, Bo title: An improved local immunization strategy for scale-free networks with a high degree of clustering date: 2017-01-11 journal: Eur Phys J B DOI: 10.1140/epjb/e2016-70334-9 sha: fe1c366703be7f9a6680ce72539645dc9fcb546a doc_id: 5107 cord_uid: waubg27l

The design of immunization strategies is an extremely important issue for the control and prevention of diseases and computer viruses. In this paper, we propose an improved local immunization strategy based on nodal clustering, which has seldom been considered in existing immunization strategies. The main aim of the proposed strategy is to iteratively immunize the node that has a high connectivity and a low clustering coefficient. To validate the effectiveness of our strategy, we compare it with two typical local immunization strategies on both real and artificial networks with a high degree of clustering. Simulations on these networks demonstrate that our strategy outperforms the two typical strategies. The proposed strategy can be regarded as a compromise between computational complexity and immune effect, and it can be widely applied in scale-free networks of high clustering, such as social networks and technological networks. In addition, this study provides useful hints for designing optimal immunization strategies for specific networks.

A series of studies indicates that many real-world complex networks in nature and society share two generic properties: they are scale-free and exhibit a high degree of clustering [1]. These real-world networks include metabolic networks, the World-Wide-Web (WWW), protein interaction networks, and even some social networks [2]. Pastor-Satorras and Vespignani [3] found that scale-free networks lack an epidemic threshold, so a virus can spread even when its spreading rate is very small.
This implies that methods based on decreasing the spreading rate cannot succeed in eradicating an epidemic in heterogeneous populations [4, 5], whereas immunization strategies that vaccinate nodes can be effective in suppressing the epidemic and have received much more attention in recent years. The best known strategy for heterogeneous networks is believed to be targeted immunization [6–10], such as the high degree targeted (HD), high degree adaptive (HDA), and high betweenness (HB) targeted strategies. The basic idea of targeted strategies is first to rank the importance of nodes and then to remove the nodes from highest importance to lowest until the network becomes disconnected.

(Author e-mail: jianggp@njupt.edu.cn)

From the viewpoint of spreading, Schneider et al. [11] introduced a novel inverse targeting strategy in which the importance of nodes is represented by their contribution to the disease spreading. This strategy is much more efficient than the HDA strategy and is as efficient as the HB targeted strategy. Using the explosive percolation (EP) paradigm, Clusella et al. [12] proposed explosive immunization (EI) for the immunization and targeted destruction of networks. Since in many cases the number of immunization doses is limited or the doses are very expensive, the equal-graph-partitioning (EGP) strategy [13] was proposed to fragment a given network with a minimum number of node removals. Indeed, finding the optimal set of influencers is important both in the immunization of networks and in the destruction of networks by targeted attacks. Morone and Makse [14] investigated influence maximization in complex networks through optimal percolation and presented the collective influence (CI) algorithm to identify a new class of strategic influencers which outrank the hubs in the network.
They compared the results obtained by the CI and belief propagation-guided decimation (BPD) [15] algorithms on a single random scale-free network and found "evidence of the best performance of CI". Recently, a systematic comparative study of the CI and BPD algorithms was performed by Mugisha and Zhou (the author of BPD), and a slightly adjusted version of the BPD algorithm was applied to the network optimal attack problem [16]. The comparison demonstrates that the improved BPD performs much better than CI on different types of random networks and real-world networks. Building on the statistical mechanics perspective, Braunstein et al. [17] proposed a very efficient three-stage min-sum (MS) algorithm for solving the dismantling problem.

Most of the above strategies require global information about the network topology and show a clear advantage over existing heuristic strategies (such as HDA and HD). However, global information is always difficult to gather and process for large-scale networks. That is one reason why local immunization strategies are widely used in practice. For instance, acquaintance immunization [18] requires no knowledge of the node degrees or any other global information and is effective for any broad-scale distributed network. Under acquaintance immunization, it is the more highly connected nodes that are preferentially immunized in scale-free networks. Gallos et al. [19] introduced a purely local immunization strategy which is practically as efficient as the targeted immunization of the highest degree nodes. Their strategy consists of selecting a random node and then asking for a neighbor who has more links than the node itself or more than a given threshold $k_{cut}$; this neighbor is immunized. However, the performance of this strategy depends mainly on the choice of the threshold $k_{cut}$, which requires extensive calculations.
Generally, many optimal immunization strategies [18–21] on heterogeneous networks retain the advantage of being purely local, and their basic idea is to immunize the nodes with a higher influence. Thus, analyzing the topological characteristics of networks and identifying the most influential nodes in spreading dynamics is of utmost importance for preventing and controlling epidemics.

Centola [22] investigated the spread of health behavior through artificially structured online communities and found that the behavior spreads farther and faster across clustered-lattice networks than across corresponding random networks. His experimental results clearly reveal the importance of clustering. From an analysis of the growth of Facebook, Ugander et al. [23] found that the probability of contagion is tightly controlled by the number of connected components in an individual's contact neighborhood, rather than by the actual size of the neighborhood. On the basis of empirical observation, Chen et al. [24] proposed a local ranking method, named ClusterRank, to identify influential nodes in directed networks by taking into account not only the number of neighbors and the neighbors' influences but also the clustering coefficient. In order to identify important people who are linked by strong social ties within an individual's network neighborhood, Backstrom and Kleinberg [25] developed a new measure of tie strength that they term "dispersion", which characterizes the extent to which two people's mutual friends are not themselves well connected. This measure involves not only the number of mutual friends of two people, but also the network structure among these mutual friends [25]. Moreover, considering both the number and the sizes of the communities that are directly linked by a node, Zhao and others [26] introduced a new centrality index to identify influential spreaders in a network with community structure.
The above studies [22–26] suggest that a node with a high connectivity and a low clustering coefficient (that is, a node with a widespread and dispersive social circle) plays an important role in the network. In the case of disease spreading, immunizing or isolating the influential individuals can decrease the impact of disease outbreaks. Though the clustering coefficient [27] is not the most efficient measure for finding the most influential nodes, it is useful for assessing the importance of nodes in scale-free networks of high clustering. Most previous studies of immunization only identified the nodes of high connectivity, owing to the hierarchical nature of infection [7], while the nodal clustering coefficient was seldom considered. Motivated by the importance of clustering [22–26], we propose an improved local immunization strategy which takes into account both the nodal degree and the clustering coefficient. Our strategy consists of randomly selecting an initial individual and then picking out the acquaintance (friend) of that individual who has many friends and a dispersive friend distribution; the selected acquaintance is immunized. This process is iterated, taking the immunized acquaintance as the next initial individual in turn, until the required number of immunized individuals is reached.

The rest of the paper is arranged as follows. We analyze the importance of the clustering coefficient and present an improved immunization strategy in Section 2. In Section 3, Monte Carlo simulations are performed on both real and artificial networks, and the advantages of the proposed strategy are demonstrated by comparison with two typical strategies. The conclusions are given in Section 4.

The clustering coefficient was introduced by Watts and Strogatz [27] in the context of social network analysis. Let G = (V, E) be an undirected, simple (no self-loops, no multiple edges) graph (network) with a set of nodes V and a set of edges E.
The total number of nodes is n = |V|, and the total number of edges is m = |E|. Suppose that a node i has $k_i$ neighbors; at most $k_i(k_i - 1)/2$ edges can exist among these neighbors, which happens when every neighbor of node i is connected to every other neighbor of i. The clustering coefficient $c_i$ of node i denotes the fraction of these allowable edges that actually exist, that is, $c_i = 2e_i/[k_i(k_i - 1)]$, where $e_i$ is the number of edges that exist among the neighbors of i. The clustering coefficient C(G) of a graph G is the average of the clustering coefficients of its nodes.

For social networks such as friendship networks, the clustering coefficient has an intuitive meaning. The value $c_i$ reflects the extent to which friends of i are also friends of each other, and C(G) measures the cliquishness of a typical friendship circle. When the degrees of two nodes are the same, the node with the lower clustering coefficient has the more dispersive social circle. In Figure 1a, the social circle of node b is particularly closely connected. In Figure 1b, node i and node j both have six neighbors, but the friends of node i are not very familiar with each other. The extent to which friends of j are also friends of each other is much greater than for node i; in other words, node i has a more dispersive social circle than node j. In general, a node with a low clustering coefficient has a dispersive neighbor distribution.

According to the above analysis, nodes with a lower clustering coefficient have a more dispersive social circle, and these nodes deserve more attention in immunization. Thus, combining the topological properties of the network with the covering strategy [20, 21], we propose a local strategy called Dispersion-based immunization for scale-free networks with a high degree of clustering. Our strategy can be described as follows.

Dispersion-based immunization: select a random node i as the source node.
Then, among a certain number of the most highly connected neighbors of i, find the neighbor j with the lowest clustering coefficient (an individual who has a great number of friends and a dispersive social circle) and immunize it. This number is closely related to the average degree of the network, and its optimal value $\langle k \rangle/2$ was obtained through many tests. The nodal degree and clustering coefficient are calculated on the basis of the original network G = (V, E). When the node with the lowest clustering coefficient has already been immunized in a previous step, we choose the node with the second lowest clustering coefficient as the new immunization target. Moreover, when all of the $\langle k \rangle/2$ highly connected neighbors have been immunized, we randomly choose another node from the original network as the new source. The procedure stops when the percentage g of immunized nodes is reached.

To make the immunization strategies comparable under the same condition of using only local topological knowledge, our strategy is compared with only two typical local strategies, which are briefly introduced as follows.

Degree-based immunization (proposed by Moreno and co-workers [20, 21]): select a random node i as the source node. Then find the neighbor j with the highest degree (the individual with the greatest number of friends) among all the neighbors of node i and immunize it. If more than one node has the highest degree, one of them is selected at random and immunized. When the node with the highest degree has already been immunized in a previous step, we choose the node with the second highest degree as the new immunization target. Moreover, when all the neighbors have been immunized, we randomly choose another node from the original network as the new source. The procedure stops when the percentage g of immunized nodes is reached.

ClusterRank-based immunization (proposed by Chen et al. [24]): select a random node i as the source node.
Then find the neighbor j with the highest ClusterRank score $s_j$ among all of i's neighbors and immunize it. If more than one node has the highest ClusterRank score, one of them is selected at random and immunized. In the calculation of the ClusterRank score, the degree and clustering coefficient are calculated on the basis of the original network G = (V, E). When the node with the highest ClusterRank score has already been immunized in a previous step, we choose the node with the second highest score as the new immunization target. Moreover, when all the neighbors have been immunized, we randomly choose another node from the original network as the new source. The procedure stops when the percentage g of immunized nodes is reached. Mathematically, the ClusterRank score $s_i$ of node i is given by [24]:

$$s_i = f(c_i) \sum_{j \in \Gamma_i} k_j \qquad (1)$$

where $\Gamma_i$ is the set of neighbors of node i and $k_j$ is the degree of node j ($j \in \Gamma_i$). This index quantifies the influence of a node by taking into account not only its direct influence (measured by its clustering coefficient and the number of its neighbors) but also the influences of its neighbors. In the later simulations, we plug $f(c_i) = 10^{-c_i} \in [0.1, 1]$ into the ClusterRank score $s_i$.

Remark. The proposed Dispersion-based immunization differs from the ClusterRank-based immunization [24] in two aspects. Firstly, the two differ in the detailed process of searching for immunization targets. In our algorithm, we first choose the $\langle k \rangle/2$ most highly connected neighbors of the source node i, and then take as the final immunization target the one with the lowest clustering coefficient among them. This selection process considers the node's degree first and the clustering second, because in scale-free networks the degree centrality varies enormously from node to node and can be regarded as the first consideration.
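The two-stage selection rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph is an adjacency dictionary of sets, the function names (`clustering`, `cluster_rank`, `dispersion_target`, `dispersion_immunize`) are hypothetical, the number of candidate neighbors is taken as the rounded value of half the average degree, ties are broken by node label rather than at random, and a simple stall counter is added so that the loop ends if the requested fraction cannot be reached. The ClusterRank-style score is included only for comparison and uses $f(c) = 10^{-c}$ times the sum of the neighbors' degrees.

```python
import random

def clustering(adj, i):
    """Local clustering coefficient c_i: fraction of possible edges among the
    neighbors of i that exist (taken as 0 when i has fewer than 2 neighbors)."""
    nbrs = sorted(adj[i])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a in range(k) for b in range(a + 1, k)
                if nbrs[b] in adj[nbrs[a]])
    return 2.0 * links / (k * (k - 1))

def cluster_rank(adj, i):
    """ClusterRank-style score f(c_i) * sum of neighbor degrees, f(c) = 10**(-c)."""
    return 10.0 ** (-clustering(adj, i)) * sum(len(adj[j]) for j in adj[i])

def dispersion_target(adj, source, immunized, m):
    """Stage 1: keep the m highest-degree neighbors of `source`.
    Stage 2: among those not yet immunized, return the one with the lowest
    clustering coefficient, or None if all m are already immunized."""
    top = sorted(adj[source], key=lambda j: (-len(adj[j]), j))[:m]
    free = [j for j in top if j not in immunized]
    if not free:
        return None
    return min(free, key=lambda j: (clustering(adj, j), j))

def dispersion_immunize(adj, g, seed=0):
    """Immunize up to a fraction g of the nodes; returns the immunized set."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    avg_k = sum(len(adj[v]) for v in nodes) / len(nodes)
    m = max(1, round(avg_k / 2))          # assumed integer version of <k>/2
    quota = int(g * len(nodes))
    immunized, stalls = set(), 0
    source = rng.choice(nodes)
    # Assumption: give up after many consecutive failed restarts.
    while len(immunized) < quota and stalls < 10 * len(nodes):
        target = dispersion_target(adj, source, immunized, m)
        if target is None:
            source = rng.choice(nodes)    # all candidates used: new random source
            stalls += 1
        else:
            immunized.add(target)
            source = target               # continue from the immunized neighbor
            stalls = 0
    return immunized

# Toy graph: hub 0 with four neighbors, of which only 1 and 2 are linked.
toy = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1}, 3: {0}, 4: {0}}
print(clustering(toy, 0), dispersion_target(toy, 3, set(), 1))
```

In this sketch the degrees and clustering coefficients are always read from the original adjacency structure, in line with the statement that both are calculated on the original network; the immunized set is tracked separately and never modifies the graph.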
However, in the ClusterRank-based immunization [24], the degree and clustering coefficient are considered simultaneously in the computation of the ClusterRank score $s_i = f(c_i) \cdot \sum_{j \in \Gamma_i} k_j$. Secondly, our algorithm does not need a multiplication operation and avoids dependence on the choice of the clustering function $f(c_i)$.

We compare our strategy with two typical immunization strategies by Monte Carlo simulations on eight networks. The networks used in this study include six artificial scale-free networks with a high degree of clustering [28] and two real networks: the GR-QC (General Relativity and Quantum Cosmology) collaboration subnet [29] and the Email subnet [30]. The ArXiv GR-QC collaboration network is taken from the e-print ArXiv and covers scientific collaborations between authors of papers submitted to the General Relativity and Quantum Cosmology category [29]. For the Email network, we focus on the largest component, with 1133 users (nodes), from Universitat Rovira i Virgili (Tarragona) [30]. The basic topological properties of the networks are shown in Table 1. N and L are the total numbers of nodes and links, respectively. $\langle k \rangle$ and $\langle d \rangle$ denote the average degree and the average path length, respectively. C is the clustering coefficient [27] of the whole network, where the clustering coefficient of a node whose degree is one is set to zero. The degree distributions of these networks are shown in Figure 2, where the scale-free feature is obvious for all networks except the Email subnet.

Extensive Monte Carlo simulations are carried out to assess the effectiveness of the immunization strategies in conjunction with the standard SIS epidemiological model on top of these underlying networks. In the SIS model, each susceptible (healthy) node is infected with rate ν if it is connected to an infected node. Infected nodes are cured and become susceptible again with rate δ, which defines an effective spreading rate λ = ν/δ (without loss of generality, we set δ = 1).
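The SIS dynamics described above can be illustrated with a short Python sketch. It is only a rough illustration under several assumptions that the text does not fix: time is discrete with synchronous (parallel) updating, ν and δ are treated as per-step probabilities, a susceptible node with n infected neighbors becomes infected with probability $1 - (1 - \nu)^n$, immunized nodes can neither be infected nor transmit, and the stationary density is estimated as a time average after a burn-in period. The function name `sis_prevalence` and the parameters `steps` and `burn_in` are hypothetical.

```python
import random

def sis_prevalence(adj, immunized, nu=0.25, delta=1.0,
                   steps=200, burn_in=100, seed=0):
    """Time-averaged density of infected nodes for a discrete-time SIS process
    on the graph `adj` (dict: node -> set of neighbors) with `immunized` removed."""
    if steps <= burn_in:
        raise ValueError("steps must exceed burn_in")
    rng = random.Random(seed)
    active = [v for v in sorted(adj) if v not in immunized]
    if not active:
        return 0.0
    infected = {rng.choice(active)}       # one random initial spreader
    samples = []
    for t in range(steps):
        nxt = set()
        for v in active:                  # parallel update: read `infected`, write `nxt`
            if v in infected:
                if rng.random() >= delta: # not cured in this step
                    nxt.add(v)
            else:
                n_inf = sum(1 for u in adj[v] if u in infected)
                if n_inf and rng.random() < 1.0 - (1.0 - nu) ** n_inf:
                    nxt.add(v)
        infected = nxt
        if t >= burn_in:
            samples.append(len(infected) / len(adj))
    return sum(samples) / len(samples)

# Example: complete graph on six nodes, with and without node 0 immunized.
k6 = {i: {j for j in range(6) if j != i} for i in range(6)}
print(sis_prevalence(k6, set()), sis_prevalence(k6, {0}))
```

Running the function with and without an immunized set gives two densities whose ratio can be compared across strategies; a single run is noisy, so in practice the result would be averaged over many seeds and initial spreaders.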
All these simulations are conducted at a fixed spreading rate λ = 0.25, and each immunization strategy is implemented by selecting and immunizing gN nodes on a network of fixed size N. Initially we choose a random node as an infected spreader in the network, and then iterate the rules of the SIS model with parallel updating. Under the same conditions, we analyze the behavior of the reduced prevalence $\rho_g/\rho_0$ for increasing immunization densities g, where $\rho_g$ is the stationary density of infected nodes when a fraction g of nodes is immunized and $\rho_0$ is the prevalence without immunization. In principle, a global targeted immunization algorithm produces the same set of immune nodes on every run, whereas a local immunization algorithm is likely to produce a different set on each run. Thus, to average out the randomness of the local immunization and of the spreading process, we run each local immunization algorithm 20 times, and for each run the spreading results are averaged over 500 realizations corresponding to different initially infected nodes.

In Figure 3, we show the behavior of the reduced prevalence $\rho_g/\rho_0$ as a function of the immunization density g and present the immunization threshold $g_c$ for the described immunization strategies applied to the eight networks. For the artificial networks (see Figs. 3a-3f) and the real GR-QC collaboration subnet (see Fig. 3g), all plots follow almost the same pattern: from top to bottom, the curves correspond to the ClusterRank-based immunization, the Degree-based immunization, and the Dispersion-based immunization, indicating that these local strategies based on the neighbor information of each individual are effective for scale-free networks with a high degree of clustering. In Figures 3a-3g, we can see that our Dispersion-based immunization strategy, which uses both the clustering coefficient and the nodal degree, is slightly better than the Degree-based immunization, which is based only on the degree of each neighbor.
As the clustering coefficient C of the network increases, the advantage of our strategy becomes more and more obvious, illustrating the importance of the nodal clustering coefficient in the immunization of scale-free networks of high clustering. The superior performance of our strategy comes at the cost of slightly more computation per iteration than the Degree-based immunization strategy, which takes only a single piece of information (the degree) into account. Although the ClusterRank-based immunization strategy also uses local information about the clustering coefficient and degree, it is clearly seen from the critical immunization threshold $g_c$ (indicated by the arrows in Figs. 3a and 3b) that our strategy works much better than the ClusterRank-based immunization and achieves a satisfactory effect. In the steps of the ClusterRank-based immunization, the ClusterRank score of every neighbor of i needs to be computed and sorted. In our strategy, however, only $\langle k \rangle/2$ clustering coefficients of neighbors need to be computed and sorted. Compared with the ClusterRank-based immunization, our strategy has the clear advantages of both less computation and a better immune effect, so it can be widely used in large-scale and dynamically evolving networks (e.g. P2P networks and social networks).

Since both of the above strategies (our strategy and the ClusterRank-based immunization strategy) use the clustering coefficient and the nodal degree, we now discuss why the immune effect of the ClusterRank-based immunization is worse than that of our strategy. As can be seen from equation (1), the calculation of the ClusterRank score $s_i$ depends on the clustering function $f(c_i)$, which means that the performance of the ClusterRank-based immunization is closely related to the design of $f(c_i)$. Thus, the lack of a unique way to construct the function $f(c_i)$ affects the performance of the ClusterRank-based immunization.
Given the above, our strategy, which uses the clustering coefficient and the nodal degree, requires a small computing burden compared to the ClusterRank-based immunization, and its performance is superior to that of the two typical strategies (the ClusterRank-based immunization and the Degree-based immunization); our strategy thus achieves an efficient trade-off between computational complexity and immune effect. The simulation results reveal that nodes with a lower clustering coefficient have a more dispersive social circle and cannot be ignored in immunization strategies. However, for the homogeneous Email subnet, the curves of the three strategies (see Fig. 3h) overlap almost completely, which shows that none of these strategies has a significant advantage in a homogeneous network. This further confirms that all these local immunization strategies are suited to scale-free networks of high clustering.

In fact, all these immunization strategies ultimately perform the same operation of removing edges, that is, cutting the paths through which most of the susceptible nodes catch the epidemic [31]. In Figure 4, we depict the number of reduced edges E(g) for the described immunization strategies applied to the eight networks above. The results in Figures 4a-4f indicate that our strategy leads to the largest E(g) compared with the two typical immunization strategies. Based on the results shown in Figures 3a-3f and 4a-4f, we can clearly see that the more edges an immunization procedure removes, the better its capacity for controlling the spreading of diseases. This conclusion can be used as guidance for developing immunization strategies that suppress the transmission of contagious diseases (e.g. SARS, H1N1, and MERS) in human contact networks. However, in Figures 3g and 4g, the simulations on the real GR-QC subnet show that our strategy not only removes the smallest number of edges E(g) but also obtains the best immune effect.
This finding in a real network may be a special case, but it offers new perspectives for empirical research on immunization strategies. Nowadays, in order to guarantee the transmission capacity of specific networks, such as computer networks, power networks, online social networks, and transportation networks, the fewer edges E(g) an immunization procedure removes, the better the network retains its ability to carry information. Therefore, different immunization strategies can be employed to suit different scenarios, such as different types of networks and different network sizes. From this angle, our strategy can also be viewed as a compromise.

The immunization of scale-free networks with a high degree of clustering has been investigated in this paper, and an improved local immunization strategy has been proposed. The main aim of the proposed strategy is to iteratively immunize the node with a high degree and a low clustering coefficient. Our strategy is local for two reasons: the decision to immunize a given node requires only the connectivity of the subnet composed of the node itself and its neighbors, and the strategy proceeds iteratively until the percentage g of immunized nodes is reached, so that only a certain number of such subnets are involved. The effectiveness of our strategy has been verified by comparison with two typical local immunization strategies on both real-world and artificial scale-free networks with a high level of clustering. In terms of application, the proposed strategy, regarded as a compromise between computational complexity and immune effect, can be widely applied in social networks and in technological networks such as the Internet and the WWW, where the number of links and the connected user (or hyperlink) categories for a given node are exactly known to the network administrator.

L.X., G.J. and Y.S. designed research, L.X. and B.S.
performed simulations and plotted the figures, L.X., G.J. and Y.S. analyzed the simulation results, L.X. and G.J. wrote the manuscript. All authors approved the manuscript.