key: cord-0000602-v5surcw0
authors: Hadidjojo, Jeremy; Cheong, Siew Ann
title: Equal Graph Partitioning on Estimated Infection Network as an Effective Epidemic Mitigation Measure
date: 2011-07-22
journal: PLoS One
DOI: 10.1371/journal.pone.0022124
sha: 23acad3bdcee7563f22da647aae639dbee1d56e6
doc_id: 602
cord_uid: v5surcw0

Controlling severe outbreaks remains the most important problem in the area of infectious diseases. With time, this problem will only become more severe as population density in urban centers grows. Social interactions play a very important role in determining how infectious diseases spread, and the organization of people along social lines gives rise to non-spatial networks over which infections spread. Infection networks are different for diseases with different transmission modes, but are likely to be identical or highly similar for diseases that spread the same way. Hence, infection networks estimated from common infections can be useful for containing epidemics of a more severe disease with the same transmission mode. Here we present a proof-of-concept study demonstrating the effectiveness of epidemic mitigation based on such estimated infection networks. We first generate artificial social networks of different sizes and average degrees, but with roughly the same clustering characteristics. We then start SIR epidemics on these networks, censor the simulated incidences, and use them to reconstruct the infection network. We then efficiently fragment the estimated network by removing the smallest number of nodes identified by a graph partitioning algorithm. Finally, we demonstrate the effectiveness of this targeted strategy in slowing down and reducing the size of advancing epidemics by comparing it against traditional untargeted strategies.

Understanding and containing the spread of infectious diseases has always attracted a lot of interest from the scientific community, and even more so after the recent SARS and H1N1 outbreaks.
Besides their social and healthcare impacts, severe infectious disease outbreaks also impose a significant burden on the economy through decreased productivity and the high cost of treatment. As denser populations promote faster spreading, this problem can only grow in severity and magnitude with the increasing world population density. Motivated by this, many studies have sought to predict [1, 2] or even contain [3-6] the spread of severe epidemics. In spite of these efforts, effective control of infectious disease outbreaks continues to elude us.

A recent important advancement in the field is the application of network theory to study epidemic dynamics. Network-based models have been shown to accurately explain complex phenomena in terms of the relatively simple interactions between their small constituents, and are therefore broadly applicable to many different fields [7]. In the area of infectious diseases, numerous studies have modeled epidemics using a network approach, mainly covering sexually-transmitted infections [8-11], respiratory and flu-like diseases [2], and general features of infectious disease dynamics [12-17]. From the network point of view, we speculate that different infectious diseases might have very similar infection networks if they share the same mode of transmission. In particular, we believe that severe respiratory infections such as H1N1 and SARS share an infection network similar to that of the less severe common cold. Hence, an infection network inferred from the latter can be useful in controlling rare outbreaks of the former. Estimating the infection network from common infections has the advantage of a large volume of daily incidences. As we will show in the Results section, the volume of incidence data gathered is critical for obtaining an accurate estimate of the network.
Traditional epidemic intervention procedures, such as quarantine and other social distancing measures, involve weakening or cutting links around the infected nodes. However, these procedures are not systematic from the network point of view. A more effective intervention strategy would use an understanding of the 'shape' of the infection network to efficiently tear the network apart. This can be done by targeting nodes that play important roles in the network (i.e. the 'hubs'), and the 'backbones' connecting one hub to another.

Based on the above ideas, we propose a targeted method to effectively contain infectious disease epidemics. This strategy involves estimating the infection network of a severe disease using incidence data from common infections sharing the same infection network, and fragmenting the network into disconnected pieces using a graph partitioning method. To test the proposed strategy in principle, we first generate artificial social networks of different sizes but with roughly the same clustering characteristics to serve as our infection networks. Then we simulate multiple SIR epidemics on the networks to play the role of common infections circulating in society. We censor the incidence data collected to emulate the low reporting rates of common infections, and use the censored incidences to construct estimates of the original infection networks. To mitigate new epidemics, we fragment the estimated infection networks. To do this, we apply a graph partitioning method to the estimated networks to identify the smallest sets of nodes that, when removed, will efficiently break the networks up into isolated pieces. Finally, we evaluate the effectiveness of this targeted strategy by comparing it against traditional untargeted methods.
While each of these problems has been studied independently (see for example [8, 18] for network reconstruction and [19-22] for graph partitioning), to the best of our knowledge no work has applied both methods together to control epidemics. At this point in time, we know of no available databases of common infections that (1) are comprehensive enough for reconstructing the infection network and (2) have relevant ongoing epidemics to test the proposed strategy. Hence, we used computer simulations to study the proposed method.

Many naturally-occurring networks like the Internet, the World Wide Web, and biological networks are scale-free and are thus well described by Barabási's preferential attachment model [23]. Others have argued, however, that social networks are different, as they show strong clustering of nodes (also called community structures) and their degree distributions are not power laws [24]. To specifically reproduce the community structures seen in social and social-like networks, Holme and Kim modified the preferential attachment model to incorporate clustering [25]. Newman et al., on the other hand, started out with random graphs and progressively built the social-like degree distribution [26], whereas Boguñá et al. and Jin et al. proposed friendship-formation dynamics models to generate social-like networks from scratch [27, 28].

For our proof-of-concept study, we generate social-like networks to act as our infection networks. We follow the three intuitive rules described by Jin, Girvan, and Newman (JGN) in Ref. [28]: (1) the probability of two individuals meeting is high if they have one or more mutual friends, and low otherwise; (2) the friendship between two individuals is reinforced by regular meetings, but decays with time if they rarely meet; and (3) there is a maximum number of friends one can have. Following the discrete algorithm presented in Ref. [28], we start with a random network of N nodes and m links.
The average degree of the network (average number of links per node) is given by $\langle k \rangle = 2m/N$. At each time step, we select $n_m$ pairs of nodes that have at least one mutual friend. This is done by first picking $n_m$ intermediate nodes at random, before choosing two neighbours of each to become the pairs. In addition to this, we choose $n_r$ other pairs of nodes uniformly at random, with $n_m > n_r$. For every pair that is not already connected, we form a new link between them provided that both nodes have not reached the maximum number of friends $L$. At the end of each time step, we randomly break $r$ existing links to simulate the friendship decay. We then calculate the clustering coefficient $c$ of the network using the method by Schank and Wagner [29]. After the clustering coefficient stops increasing and only fluctuates about a time-independent long-run average, we say the network has converged and stop the simulation. As expected, the clustering coefficients of the converged networks are much higher than $c_R \approx \langle k \rangle / N$ for random networks with the same $N$ and $\langle k \rangle$.

In generating the infection networks, we set $L = 50$ as the maximum number of friends a node can have. To produce high clustering coefficients, we ensure that the mutual friend formation is dominant over the random friend formation by choosing $n_m = 400$ and $n_r = 100$. For simplicity, we choose $r = n_r + n_m = 500$ such that the total number of links (hence the average degree $\langle k \rangle$) of the initial random network remains more or less constant. This way, we can generate social-like infection networks with arbitrary size and average degree by simply adjusting $N$ and $\langle k \rangle$ of the initial random network to the desired values.

After generating the infection network, we simulate $S$ susceptible-infected-recovered (SIR) epidemics. To start the epidemic, we initialize the network so that all nodes are susceptible. One random node is then infected to act as seed of the epidemic.
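The network-growth procedure described above (the step that precedes the SIR simulations) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`random_graph`, `try_link`, `growth_step`, `clustering`) are hypothetical, an intermediate node with fewer than two neighbours is simply skipped, and the clustering coefficient is computed exactly as the average local clustering rather than with the Schank and Wagner method cited in the text.

```python
import random
from itertools import combinations


def random_graph(n, m, rng):
    """Adjacency sets of a random graph with n nodes and m distinct links."""
    adj = {v: set() for v in range(n)}
    links = 0
    while links < m:
        u, v = rng.sample(range(n), 2)  # two distinct nodes
        if v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            links += 1
    return adj


def try_link(adj, u, v, max_friends):
    """Link u and v unless already linked or either is at the friend cap L."""
    if (u != v and v not in adj[u]
            and len(adj[u]) < max_friends and len(adj[v]) < max_friends):
        adj[u].add(v)
        adj[v].add(u)


def growth_step(adj, n_m, n_r, r, max_friends, rng):
    """One time step: mutual-friend meetings, random meetings, then decay."""
    nodes = list(adj)
    for _ in range(n_m):
        mid = rng.choice(nodes)  # intermediate node
        if len(adj[mid]) >= 2:   # assumption: skip if it has < 2 neighbours
            u, v = rng.sample(sorted(adj[mid]), 2)
            try_link(adj, u, v, max_friends)
    for _ in range(n_r):
        u, v = rng.sample(nodes, 2)
        try_link(adj, u, v, max_friends)
    # Friendship decay: break r existing links chosen at random.
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    for u, v in rng.sample(edges, min(r, len(edges))):
        adj[u].discard(v)
        adj[v].discard(u)


def clustering(adj):
    """Exact average local clustering (a stand-in for Schank-Wagner)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        closed = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * closed / (k * (k - 1))
    return total / len(adj)


if __name__ == "__main__":
    rng = random.Random(0)
    N, m = 200, 1000  # small demo values, so <k> = 2m/N = 10
    adj = random_graph(N, m, rng)
    print("random-graph estimate <k>/N:", 2 * m / N / N)
    for _ in range(50):
        growth_step(adj, n_m=40, n_r=10, r=50, max_friends=50, rng=rng)
    print("clustering after growth:", clustering(adj))
```

The demo uses a much smaller network than the parameter values quoted in the text, purely to keep the run time short. In practice one would track `clustering(adj)` over time and stop once it only fluctuates about a long-run average, as the text describes.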
At each time step, every infected node will transmit the disease to its susceptible neighbours with probability $q$. After a certain time interval $t_R$, the infected nodes will recover and become immune to subsequent infections. This way, the number of infected nodes grows from the single seed, peaks, and thereafter decreases as more and more nodes recover and become immune. When there are no more infected nodes in the network, the epidemic ends and all nodes are reset to the susceptible state for simulating the next epidemic.

In this SIR model, the probability $q$ of infecting susceptible neighbours reflects the characteristic infection rate of the particular disease over the simulation time step $\Delta t$. For a given time step size $\Delta t$, a more infectious disease will have a larger $q$, whereas a less infectious disease will have a smaller $q$. For a given infectious disease, $q$ will be smaller for a smaller $\Delta t$, and larger for a larger $\Delta t$. The simulation time step $\Delta t$ itself is chosen, for simplicity, to be roughly equal to the typical incubation and recovery period of the disease (about 3-5 days for common cold). This implies that $t_R$