key: cord-0427029-f613tjt1
authors: Jhun, Bukyoung
title: Effective epidemic containment strategy in hypergraphs
date: 2021-08-10
journal: nan
DOI: 10.1103/physrevresearch.3.033282
sha: 87af39454a2eb63e96b3b12213ebd2f64c5a4baf
doc_id: 427029
cord_uid: f613tjt1

Recently, hypergraphs have attracted considerable interest from the research community as a generalization of networks capable of encoding higher-order interactions, which commonly appear in both natural and social systems. Epidemic dynamics in hypergraphs has been studied by using the simplicial susceptible-infected-susceptible ($s$-SIS) model; however, efficient immunization strategies for epidemics in hypergraphs have not been studied despite the importance of the topic in mathematical epidemiology. Here, we propose an immunization strategy that immunizes hyperedges with high simultaneous infection probability (SIP). This strategy can be implemented in general hypergraphs. We also generalize the edge epidemic importance (EI)-based immunization strategy, which is the state of the art in complex networks. However, it does not perform as well as the SIP-based method in hypergraphs despite its high computational cost. We also show that immunizing hyperedges with high H-eigenscore effectively contains the epidemics in uniform hypergraphs. A high SIP of a hyperedge suggests that the hyperedge is a "hotspot" of the epidemic process. Therefore, SIP can be used as a centrality measure to quantify a hyperedge's influence on higher-order dynamics in general hypergraphs. The effectiveness of the immunization strategies suggests the necessity of scientific, data-driven, systematic policy-making for epidemic containment.

In the past two decades, extensive research has been devoted to spreading processes in complex networks [1] [2] [3] [4] [5] to model the spread of epidemic diseases [6] and innovations [7, 8] , opinion formation [9] [10] [11] [12] , and many other physical and social phenomena [13] [14] [15] [16] . Researchers now have access to large-scale datasets of interactions, such as mobility, collaborations, and temporal contacts that were unavailable in the past [17] [18] [19] , and complex network representations of interactions enable the researchers to effectively study various dynamical processes. The large body of research devoted to spreading processes on complex networks provided quantitative analysis for policy-making especially in the public-health domain. Furthermore, the epidemic processes provide deeper understanding of critical phenomena and phase transition behaviors, such as the effect of structural heterogeneity on the transition point [20, 21] and discontinuous phase transitions induced by cascade dynamics [22] [23] [24] .

A hypergraph is a generalization of a network that can describe higher-order interactions between more than two agents, which networks cannot encode and which widely appear in both natural and social systems [25] [26] [27] [28]. A hypergraph consists of nodes and hyperedges, and a hyperedge of size d connects d nodes simultaneously. The hyperedges of a hypergraph can have various sizes, but if all the hyperedges in a hypergraph have the same size d, it is called a d-uniform hypergraph. In a collaboration hypergraph [29, 30], for instance, a hyperedge of size d encodes a d-author paper, and the nodes of the hyperedge encode the authors of the paper. Hypergraphs have been used to describe neural and biological interactions [31, 32], evolutionary dynamics [33, 34], and other dynamical processes [35] [36] [37] [38]. Recently, the simplicial susceptible-infected-susceptible (s-SIS) model [39] was introduced to describe higher-order epidemic processes in hypergraphs. The model has attracted extensive interest from the research community due to its simplicity and novel phase transition behavior [40] [41] [42] [43] [44].

An important topic in epidemiology is immunization, and it has been studied for various epidemic models in complex networks [45] [46] [47] [48] [49] [50] [51] [52] [53]. If a node in the network is immunized, the node cannot turn into the infected state, and if an edge is immunized, the infection does not spread through the immunized edge. Edge immunization models epidemic containment measures such as travel regulation and social distancing. Immunizing a node or edge protects more than just the nodes directly connected to it. If a portion of nodes or edges greater than a threshold $p_c$ is immunized, the epidemic state in the network vanishes. This effect is called herd immunity, and the threshold is called the herd immunity threshold (HIT). The objective of an efficient immunization strategy is to achieve herd immunity by immunizing a minimal portion of nodes or edges, i.e., minimizing the HIT $p_c$. Such strategies can be used to vaccinate people with limited resources or prevent a pandemic by minimally regulating air traffic or social gatherings. The same theory can be used to promote spreading processes. If the spreading process models information flow, for instance, the objective is usually to optimize the spreading of information in a system. In such cases, we buttress the nodes or edges targeted by the efficient immunization strategies instead of immunizing them. Conversely, an adversarial attack can be made on such nodes or edges to hamper the information flow in the system. However, efficient immunization strategies for epidemic processes in hypergraphs have not been studied, despite the topic's importance in mathematical epidemiology.

Here, we propose an immunization strategy that targets hyperedges with high simultaneous infection probability (SIP), which is the probability that all the nodes in a hyperedge are in the infected state. This probability is calculated by the individual-based mean-field (IBMF) theory [54, 55]. This strategy can be implemented to contain epidemics of the s-SIS model in general hypergraphs. We also show that immunizing hyperedges with the highest H-eigenscores effectively achieves herd immunity in uniform hypergraphs. The H-eigenscore of a hyperedge is defined as the product, over all the nodes in the hyperedge, of the corresponding elements of the H-eigenvector of the adjacency tensor with the largest H-eigenvalue. This method generalizes the edge eigenscore in a complex network and can be implemented to contain epidemics in uniform hypergraphs. However, this method cannot be implemented in arbitrary hypergraphs with various hyperedge sizes. We also generalize the EI-based method [51], which is the state-of-the-art immunization strategy for complex networks. However, we find that this method does not perform as efficiently as the H-eigenscore and SIP-based strategies for hypergraphs despite its higher computational cost. If a hyperedge has a high SIP, it suggests that the hyperedge is a 'hotspot' of the epidemic process. Therefore, SIP can be used as a centrality measure to quantify a hyperedge's influence on higher-order dynamics in general hypergraphs. The effectiveness of the immunization strategies suggests the necessity of quantitative and systematic policies for epidemic containment measures.

This paper is organized as follows: First, we introduce the epidemic model in a hypergraph in Sec. II A. Next, we introduce the hypergraph static model in Sec. II B and the hypergraph popularity-similarity optimization (h-PSO) model in Sec. II C. We show that the h-PSO model generates a hypergraph with a power-law degree distribution and a tunable clustering coefficient. In Sec. III, we extend the individual-based mean-field (IBMF) and pair-based mean-field (PBMF) theories to general hypergraphs, which is required for the immunization strategies. The immunization strategies are illustrated in Sec. IV. The strategies' performance in complex networks and hypergraphs is tested in Sec. V. A summary and the final remarks are presented in Sec. VI.

A contagion process through dyadic interaction (represented by a network or a metapopulation model) is called a simple contagion process. The SIS model is one of the most extensively studied simple contagion models in complex networks, along with the susceptible-infected-recovered (SIR) model [6]. In the SIS model, each node in the network is in either the susceptible (S) or the infected (I) state. If a node is infected, it turns into the susceptible state with a constant recovery rate µ. If a node is susceptible, it is infected with infection rate β from each of its infected neighbors. The rates of infection and recovery depend only on the current configuration $\{X_1, X_2, \ldots, X_N\}$ of the epidemic states, where $X_i \in \{S, I\}$ is the state of node i, and not on past configurations, i.e., they are treated as Poisson processes. If the infection rate is higher than a certain value (i.e., the epidemic threshold), the system can reach a stationary state, allowing several theoretical approaches [51, 56, 57].

Many contagion phenomena that cannot be reduced to a simple contagion process have been observed, especially in social systems [58] [59] [60] [61]. More complicated models of contagion, namely complex contagion processes including the threshold model and the generalized epidemic model, have been proposed. Among them, the recently introduced s-SIS model has attracted extensive interest due to its simplicity, analytic tractability, and novel critical phenomena [39] [40] [41] [42].

In the model, the contagion occurs through hyperedges in hypergraphs, which have attracted considerable interest from the research community as a generalization of networks due to their capability of encoding higher-order interactions between more than two agents [25] [26] [27] [28]. The model is illustrated in Fig. 1. In the s-SIS model, a node in the hypergraph is in a susceptible or an infected state, as in the traditional SIS model. If a susceptible node belongs to a hyperedge of size d in which all the other d − 1 nodes are in the infected state, the node is changed to the infected state with rate $\beta_d$. Here, we study the discrete-time version of the model, where time t is an integer. If a susceptible node at time t has n hyperedges that satisfy the contagion condition, each hyperedge has a probability $\beta_d$ to turn the susceptible node to the infected state at time t + 1. Also, an infected node at time t turns into a susceptible node at time t + 1 with probability µ.
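The update rule described above can be sketched as a single synchronous time step. This is only an illustrative sketch: the function name `s_sis_step`, the representation of the state as a set of infected node ids, and the dictionary `beta` keyed by hyperedge size are assumptions made for this example and are not taken from the paper.

```python
import random

def s_sis_step(infected, hyperedges, beta, mu, rng=random):
    """One synchronous update of the discrete-time s-SIS model (sketch).

    infected   : set of node ids infected at time t
    hyperedges : iterable of tuples of node ids
    beta       : dict mapping hyperedge size d to infection probability beta_d
    mu         : recovery probability per time step
    Returns the set of node ids infected at time t + 1.
    """
    new_infected = set()
    # Recovery: an infected node stays infected with probability 1 - mu.
    for node in infected:
        if rng.random() >= mu:
            new_infected.add(node)
    # Contagion: a hyperedge can infect only when exactly one member is
    # susceptible and the other d - 1 members are infected at time t.
    for edge in hyperedges:
        susceptible = [v for v in edge if v not in infected]
        if len(susceptible) == 1 and rng.random() < beta[len(edge)]:
            new_infected.add(susceptible[0])
    return new_infected
```

Each hyperedge that satisfies the contagion condition draws its own random number, so a susceptible node with n such hyperedges is exposed to n independent infection attempts, as in the description above.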

Many real-world interactions, whether dyadic or higher-order, exhibit high heterogeneity characterized by power-law behavior. To model such highly heterogeneous hypergraphs, the hypergraph static model [40] was introduced as a hypergraph model with a degree (number of hyperedges connected to the node) distribution with a power-law tail, namely a scale-free hypergraph. It is a generalization of the static model of complex networks [62, 63], which has been used as a canonical method to generate scale-free networks due to its simplicity and analytical tractability [64] [65] [66] [67]. The hypergraph static model is constructed as follows:

(i) Parameter $p_i$ is assigned to each node in the hypergraph. This parameter controls the node's fitness to have a high degree.

(ii) Pick d nodes $i_1, \cdots, i_d$ with probability $p_{i_1} \cdots p_{i_d}$. If a hyperedge $\{i_1, \cdots, i_d\}$ is not already present in the hypergraph, add it to the hypergraph.

(iii) Repeat step (ii) until the number of hyperedges reaches NK.

If we set $p_i = i^{-\alpha} / \sum_{j=1}^{N} j^{-\alpha}$ with $0 < \alpha < 1$ (hence, $\sum_i p_i = 1$ and $0 < p_i < 1$), we obtain a hypergraph with a power-law degree distribution. Because the nodes chosen in step (ii) are independent and identically distributed (i.i.d.), node i is chosen with probability $p_i$ in each draw; hence each node i has expected degree $\langle k_i \rangle = N \langle k \rangle p_i$, and the distribution of the expected degree is $P_d(\langle k_i \rangle) \sim \langle k_i \rangle^{-\gamma}$ with $\gamma = 1 + 1/\alpha$. The minimum degree $k_m = N^{1-\alpha} \langle k \rangle / \sum_{j=1}^{N} j^{-\alpha}$ converges to a finite value $\frac{\gamma-2}{\gamma-1} \langle k \rangle$, and the maximum degree $k_{\max} = N \langle k \rangle / \sum_{j=1}^{N} j^{-\alpha}$ diverges in the thermodynamic limit $N \to \infty$. Thus, we obtain a scale-free hypergraph with mean degree $\langle k \rangle = dK$ and degree exponent $\gamma$. We obtain an Erdős-Rényi-type random hypergraph in the $\gamma \to \infty$ limit.
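Steps (i)-(iii) can be sketched as a short generator. The function name `static_hypergraph` and its arguments are hypothetical. The sketch also assumes that a draw in which the same node appears twice is simply discarded and redrawn, a detail the description above does not specify.

```python
import math
import random

def static_hypergraph(n, d, num_edges, alpha, seed=None):
    """Sketch of the d-uniform hypergraph static model.

    n         : number of nodes (labelled 0 .. n-1)
    d         : hyperedge size
    num_edges : target number of hyperedges (N*K in the text)
    alpha     : exponent with 0 < alpha < 1; gamma = 1 + 1/alpha
    """
    if num_edges > math.comb(n, d):
        raise ValueError("more hyperedges requested than distinct d-subsets")
    rng = random.Random(seed)
    # Step (i): weights proportional to i^(-alpha); random.choices
    # normalizes relative weights internally.
    weights = [(i + 1) ** (-alpha) for i in range(n)]
    edges = set()
    # Steps (ii)-(iii): draw d nodes independently until enough distinct
    # hyperedges have been collected.
    while len(edges) < num_edges:
        picked = rng.choices(range(n), weights=weights, k=d)
        if len(set(picked)) < d:
            continue  # assumption: redraw when a node is picked twice
        edges.add(frozenset(picked))
    return sorted(tuple(sorted(e)) for e in edges)
```

For example, `static_hypergraph(2000, 3, 4000, 0.5, seed=0)` would correspond to degree exponent γ = 3 with 2000 nodes and 4000 hyperedges.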

In addition to a highly heterogeneous degree distribution, agents in many real-world systems have a higher chance of being connected if they are similar. The similarity of two nodes is characterized by their closeness in their latent coordinates. The objective of graph node embedding algorithms [68] [69] [70] is to discover the latent coordinates of a network. For instance, hub airports are connected to a disproportionately large number of airports around the world (heterogeneous degree distribution), but two small airports can be connected by an airline if they are geographically close. Also, two researchers who are not particularly prolific can coauthor a paper if they are close. This effect is called homophily and results in non-vanishing clustering coefficients in both networks and hypergraphs. To account for such phenomena, a hypergraph model with a scale-free degree distribution and tunable non-vanishing clustering coefficient needs to be introduced. Furthermore, the immunization strategies need to be tested in clustered hypergraphs because it is known that epidemic dynamics and the performance of immunization strategies differ in clustered and unclustered networks [51] .

The clustering coefficient C(H) of a hypergraph H is defined as follows [71] :

where a hypertriangle is a set of three distinct nodes v 1 , v 2 , v 3 and three distinct hyperedges

The clustering coefficient can be greater than 1 in hypergraphs because an undirected 2-path can have multiple closures. If there are only size-2 hyperedges in the hypergraph (i.e., if the hypergraph is a network), $C_2$ becomes the transitivity coefficient [72], which is widely used in social network analysis. Note that there is another definition of the clustering coefficient, $C^{(i)}_d$, that generalizes the local clustering coefficient of graphs [33].

To generate a scale-free hypergraph with a nonvanishing clustering coefficient, we introduce the hypergraph popularity-similarity optimization model (h-PSO), which is a hypergraph version of the popularity-similarity optimization (PSO) model in complex networks [68, 73] . The d-uniform h-PSO model is generated as follows:

(i) Popularity parameter p i is assigned to each node in the hypergraph. If a node has a high p i , the node tends to have a high degree.

(ii) Latent coordinate $x_i$ is assigned to each node in the hypergraph. If two nodes i and j are close in the latent space (i.e., the distance between $x_i$ and $x_j$ is small), the two nodes are likely to be connected by hyperedges.

(iii) Pick a node i with probability p i .

If a hyperedge {i, j 1 · · · , j d−1 } is not already present in the hypergraph, add it to the hypergraph.

(v) Repeat steps (iii)-(iv) until the number of hyperedges reaches NK.

Here, we choose the latent coordinates on a ring; the latent coordinates are randomly chosen without replacement from θ ∈ {1, 2, · · · , N}, and the distance between two nodes i and j is defined as min

the resulting hypergraph is a scale-free hypergraph with degree exponent γ = 1 + 1/α. The clustering coefficient can be controlled by the scale parameter R and the temperature T; if R and T are large, the clustering coefficient is small. The degree distribution and the clustering coefficient of the h-PSO model with hyperedge size 3 are illustrated in Fig. 2. The degree distribution has a power-law tail with exponent γ, and the clustering coefficient can be controlled by adjusting R.

In this section, we explain the individual-based mean-field (IBMF) theory and the pair-based mean-field (PBMF) theory [51] for hypergraphs, which are used in the immunization strategies. IBMF tracks the probability of infection $p_i$ of each node in the network. By ignoring the statistical correlation between the states of two nodes [$P(X_i, X_j) = P(X_i)P(X_j)$, where $X_i, X_j \in \{S, I\}$], the IBMF equation for the SIS model can be expressed as (2) where $N(i)$ is the set of nodes connected to node i (nearest neighbors of i). For continuous phase transitions, where $p_i$ vanishes in the vicinity of the phase transition, the equation can be linearized as $p_i(t+1) = \sum_j [\beta a_{ij} + (1 - \mu)\delta_{ij}] p_j(t)$, and the epidemic threshold $\beta/\mu$ is the inverse of the largest eigenvalue of the adjacency matrix $a_{ij}$. Because IBMF ignores the positive correlations of the states in the actual system (neighbors of an infected node have a greater chance of being in the infected state), it tends to overestimate the density of infection. The theory can be straightforwardly extended to the s-SIS model:

IBMF is often employed to describe the dynamics and phase transitions in classical stochastic processes [54, 55], as well as driven-dissipative quantum dynamics [74]. The method predicts the properties of the epidemic states more accurately than homogeneous mean-field theory or degree-based mean-field theory [20], which is often referred to as heterogeneous mean-field theory. PBMF, often referred to as an epidemic-link equation, is known to predict the properties of the epidemic states more precisely than IBMF. In PBMF, we track the probability of infection $p_i(t)$ of each node, as in IBMF, and for each pair of nodes (i, j) connected in the network we additionally set up an equation for the probability that both nodes are infected, $\psi_{ij}(t) = P(X_i = I, X_j = I)$. The probabilities of the other cases for a node pair, $P(X_i = S, X_j = S)$, $P(X_i = S, X_j = I)$, and $P(X_i = I, X_j = S)$, can be expressed in terms of $p_i$ and $\psi_{ij}$:
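These relations follow from marginalizing the joint distribution of the pair (i, j). The display below is a reconstruction from elementary probability, written in the notation defined above. It is not a verbatim copy of the original display.

```latex
\begin{align}
P(X_i = I, X_j = S) &= p_i - \psi_{ij}, \\
P(X_i = S, X_j = I) &= p_j - \psi_{ij}, \\
P(X_i = S, X_j = S) &= 1 - p_i - p_j + \psi_{ij}.
\end{align}
```

The four pair probabilities sum to one, so only $p_i$, $p_j$, and $\psi_{ij}$ need to be stored for each connected pair.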

This method exploits the sparsity of the network (the number of variables and the equations in this method is proportional to the number of the nodes in the system); hence, it is scalable to large networks. Then, the equations for the nodes are expressed as

and the equations for the pairs are expressed as

where

q i (t) is the probability that node i is not infected during time step t → t + 1 given that the node i is not infected at time t; q i j (t) is the probability that the node i is not infected by a neighbor other than j during the time step t → t + 1 given that the node i is not infected at time t. Stationary states of the s-SIS model have been studied using both IBMF and PBMF in hypergraphs with hyperedges with sizes less than or equal to 3 [41] . Implementing the PBMF on general hypergraphs with arbitrary hyperedge sizes, the equations for the nodes are, again,

and the equations for the pairs that are connected by hyperedges are

where

where $P^{X_1 \cdots X_d}_{r_1 \cdots r_d}$ is the probability that nodes $r_1, \cdots, r_d$ are in states $X_1, \cdots, X_d$, respectively. $q_i(t)$ represents the same probability as in the network PBMF, $q_{ij}(t)$ is the probability that node i is not infected by hyperedges that do not contain node j during time step t → t + 1 given that node i is not infected at time t, and $u_{ij}(t)$ is the probability that node i is not infected by hyperedges that contain node j during time step t → t + 1 given that node i is not infected at time t. We have used the following equation for closure:

For d ≤ 3, we recover the identity in Ref. [41] .

An immunization strategy is defined as a specific rule that determines a set of nodes or edges that will be immunized to eliminate the epidemic from the network. Immunized nodes cannot be infected, and the infection cannot spread along immunized edges. The immunization of nodes or edges protects more than just the nodes directly connected to them. When a sufficiently large fraction $p > p_c$ of the nodes or edges is immune, the system cannot maintain the epidemic state with a non-vanishing density of infection. This effect is called herd immunity. The objective of an immunization strategy is to find an algorithm that minimizes $p_c$. Efficient immunization strategies that can be implemented in complex networks have been extensively studied for both the SIS [48] [49] [50] [51] and SIR [45] [46] [47] models. However, efficient immunization strategies for epidemics in hypergraphs have not been studied, despite the importance of the subject. Here, we develop a simultaneous infection probability (SIP)-based immunization strategy that can be used to efficiently eliminate epidemic states by immunizing edges in networks or hyperedges in hypergraphs. The strategy immunizes the edges or hyperedges in descending order of the SIP, which is the probability that all the nodes in the edge or hyperedge are infected at the same time. The probability is calculated by IBMF in both networks and hypergraphs. In networks, the strategy is as efficient as the EI-based strategy [51], which is the state-of-the-art immunization strategy, while incurring a lower computational cost. This method can be implemented in general nonuniform hypergraphs. We compare the efficiency of the strategy with several other methods in networks, uniform hypergraphs, and nonuniform hypergraphs. Among these methods, only the proposed SIP-based strategy can be efficiently implemented in general nonuniform hypergraphs.

The EI of an edge is defined as $I_{ij} = g_{ij} + g_{ji}$, where

The probabilities are calculated by means of PBMF; $\beta P(X_i = S, X_j = I)$ is the probability that the infection spreads from j to i along the edge (i, j), and $\sum_{r \in N(i)} \beta P(X_r = S \mid X_i = I)$ quantifies the impact of such an event. For the s-SIS model in hypergraphs, the epidemic importance is expressed as

where $S(\{i_1, \cdots, i_d\})$ is the set of all the permutations of the set $\{i_1, \cdots, i_d\}$, and

$\sum_{\ell=1} P(X_{j_1} = I, \cdots, X_{j_{\ell-1}} = I, X_{j_\ell} = S, X_{j_{\ell+1}} = I, \cdots, X_{j_{d'-1}} = I \mid X_i = I)$.

It was shown that immunizing edges with high EI efficiently eliminates epidemic states in various synthetic and empirical networks. Because we use PBMF, as the hyperedge size d increases, the number of pairs whose probability $\psi_{ij}$ must be tracked rapidly increases, and the computational cost of the method becomes prohibitive. The eigenscore [50], which is widely used as a centrality measure, of a node i is the element $e_i$ of the eigenvector of the adjacency matrix with the largest eigenvalue, and the eigenscore of an edge (i, j) is the product $e_i e_j$ of the eigenscores of the two nodes of the edge. By immunizing the edges with the highest eigenscore, the spectral radius of the network is effectively reduced, and the epidemics in the network can efficiently be contained. The eigenscore-based strategy can be generalized for implementation in hypergraphs; however, there are multiple types of eigenvectors and eigenvalues in a uniform hypergraph. We find that the H-eigenvector is more suitable than the Z-eigenvector [75, 76] for s-SIS dynamics. The H-eigenvector $e_i$ of a d-uniform hypergraph is defined as a vector that satisfies

where a is the hypergraph adjacency tensor. We define the H-eigenscore of the hyperedge $\{i_1, \cdots, i_d\}$ as the product of the elements of the H-eigenvector with the largest H-eigenvalue: $e_{i_1} \cdots e_{i_d}$. For networks where d = 2, the H-eigenscore becomes the traditional eigenscore. Because the adjacency tensor is symmetric and hence diagonalizable [77], the H-eigenvector with the largest H-eigenvalue can be computed by an iterative power method:

Then, e (m) converges to the H-eigenvector with the largest H-eigenvalue as m → ∞. We show that removing high H-eigenscore hyperedges leads to effective epidemic containment in uniform hypergraphs.

However, for nonuniform hypergraphs, the adjacency tensor is not defined, and the method cannot be implemented in general nonuniform hypergraphs.
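As an illustration of the H-eigenscore, the sketch below uses a shifted power iteration of the Ng-Qi-Zhou type, which may differ from the iteration used in the paper. It assumes the standard H-eigenpair condition $\sum_{i_2, \ldots, i_d} a_{i i_2 \cdots i_d} e_{i_2} \cdots e_{i_d} = \lambda e_i^{d-1}$ and an adjacency tensor normalized so that each hyperedge contributes the product of the other members' entries. The eigenvector does not depend on that normalization. The names `h_eigenvector` and `h_eigenscores` and the `shift` parameter are hypothetical.

```python
import numpy as np

def h_eigenvector(n, hyperedges, shift=1.0, tol=1e-12, max_iter=5000):
    """Approximate the leading H-eigenvector of a d-uniform hypergraph."""
    d = len(hyperedges[0])
    if any(len(e) != d for e in hyperedges):
        raise ValueError("the hypergraph must be uniform")
    x = np.ones(n) / n
    for _ in range(max_iter):
        # Shifted tensor-vector product (A + shift * I) x^{d-1}; the shift
        # is added here to help the iteration settle instead of oscillating.
        y = shift * x ** (d - 1)
        for e in hyperedges:
            vals = x[list(e)]
            for k, i in enumerate(e):
                y[i] += np.prod(np.delete(vals, k))
        x_new = y ** (1.0 / (d - 1))
        x_new /= x_new.sum()  # normalize so that the entries sum to 1
        converged = np.max(np.abs(x_new - x)) < tol
        x = x_new
        if converged:
            break
    return x

def h_eigenscores(x, hyperedges):
    """H-eigenscore of each hyperedge: product of its nodes' entries."""
    return [float(np.prod(x[list(e)])) for e in hyperedges]
```

For d = 2 the loop reduces to an ordinary shifted power iteration on the adjacency matrix, and the score of an edge (i, j) is $e_i e_j$, matching the network eigenscore.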

We introduce SIP as a measure of a hyperedge's contribution to the continuation of epidemics in the hypergraph. The SIP of a size-d hyperedge $\{r_1, \ldots, r_d\}$ is the probability that all nodes in the hyperedge are infected, which is calculated within IBMF as $P^{I \cdots I}_{r_1 \cdots r_d} \simeq P^{I}_{r_1} \cdots P^{I}_{r_d}$.

Each infection probability $P^{I}_{r_\ell}$ can be numerically calculated by solving Eq. (2) for its fixed point. Because this method uses IBMF, it incurs less computational cost than the EI-based strategy. This measure can be calculated in arbitrary nonuniform hypergraphs whose hyperedges have various sizes. We test the strategies in Sec. V.
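The SIP ranking can be sketched in a few lines. The update rule coded below is a common discrete-time IBMF form for the s-SIS model and is an assumption of this example: $p_i(t+1) = (1-\mu)\,p_i(t) + [1 - p_i(t)]\,[1 - \prod_{e \ni i} (1 - \beta_{|e|} \prod_{j \in e, j \neq i} p_j(t))]$. The paper's exact equation may differ. The function names, the initial value `p0`, and the stopping rule are likewise hypothetical.

```python
import math

def ibmf_fixed_point(n, hyperedges, beta, mu, p0=0.5, tol=1e-10, max_iter=10000):
    """Iterate an assumed s-SIS IBMF map until the probabilities settle."""
    incident = [[] for _ in range(n)]
    for e in hyperedges:
        for v in e:
            incident[v].append(e)
    p = [p0] * n
    for _ in range(max_iter):
        new_p = []
        for i in range(n):
            escape = 1.0  # probability that no incident hyperedge infects i
            for e in incident[i]:
                others = math.prod(p[j] for j in e if j != i)
                escape *= 1.0 - beta[len(e)] * others
            new_p.append((1.0 - mu) * p[i] + (1.0 - p[i]) * (1.0 - escape))
        change = max(abs(a - b) for a, b in zip(new_p, p))
        p = new_p
        if change < tol:
            break
    return p

def rank_by_sip(p, hyperedges):
    """Return (SIP, hyperedge) pairs sorted by descending SIP."""
    scored = [(math.prod(p[v] for v in e), e) for e in hyperedges]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored
```

Immunizing a fraction p of the hyperedges then amounts to removing the first `round(p * len(hyperedges))` entries of the ranked list. Because higher-order contagion can show bistability, the fixed point reached may depend on `p0`, so the initial value is a modeling choice in this sketch.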

Other centrality measures have been tested for immunization strategies; however, they were found to be inefficient. Immunizing high edge-betweenness edges is ineffective, sometimes less efficient than randomly immunizing edges [51] . The node-infectivity-based method has been tested as well, but it is not as efficient as the eigenscore or EI-based methods.

To test the immunization strategies, we implement the quasistationary method [78, 79], which is a standard simulation method used to study stationary states of stochastic processes with absorbing states. An absorbing state has zero probability of transitioning to other states. In this case, because both the contagion and recovery processes involve an infected node, if all the nodes are in the susceptible state, the system cannot turn into any other state: it is the absorbing state of the s-SIS model. The quasistationary method constrains the system to the active states. We keep track of a set of configurations of the system, which is referred to as the history. At each time step, with a certain probability, we replace a randomly selected configuration in the history with the current state of the system. When the absorbing state is reached, the state of the system is replaced by a configuration randomly selected from the history. Here, we track 50 configurations and update with probability 0.2 at each time step.
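A minimal sketch of this bookkeeping is given below. The function `quasistationary_run` and the callable `step(state, rng)`, which stands for one update of the epidemic model, are hypothetical. How the history is filled before it holds 50 configurations is not specified above, so the sketch simply appends states until it is full.

```python
import random

def quasistationary_run(step, initial_state, n_steps, history_size=50,
                        update_prob=0.2, seed=None):
    """Run a model with an absorbing empty state, restarting from history.

    step          : callable (state, rng) -> set of infected nodes at t + 1
    initial_state : non-empty iterable of initially infected nodes
    Returns the number of infected nodes recorded after each time step.
    """
    rng = random.Random(seed)
    state = set(initial_state)
    if not state:
        raise ValueError("the initial state must contain an infected node")
    history = [frozenset(state)]
    sizes = []
    for _ in range(n_steps):
        state = set(step(state, rng))
        if not state:
            # Absorbing state reached: restart from a stored configuration.
            state = set(rng.choice(history))
        if len(history) < history_size:
            history.append(frozenset(state))  # assumption: fill until full
        elif rng.random() < update_prob:
            # Overwrite a randomly chosen stored configuration.
            history[rng.randrange(history_size)] = frozenset(state)
        sizes.append(len(state))
    return sizes
```

Averaging `sizes` over the late part of a run, divided by the number of nodes, gives an estimate of the quasistationary density of infection.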

We first test the strategies in synthetic and empirical networks. These networks are selected as examples, and the relative effectiveness of the immunization strategies generally does not vary strongly from network to network. For the unclustered scale-free network, we use the static model [62, 63] with 5000 nodes, 15000 edges, and degree exponent 3. For the clustered scale-free network, we implement the model proposed in Ref. [80] with 5000 nodes, 15000 edges, degree exponent 3, and the parameter p = 0.8, which makes the clustering coefficient 0.6. For empirical networks, we use the largest connected component of the airline network [17], which has 3354 nodes and 19162 edges. Each node represents an airport, and if there exists an airline between two airports, they are connected by an edge in the network. Another empirical network we use is the largest connected component of the general relativity and quantum cosmology collaboration network [18]. There are 4158 nodes and 13428 edges in the network. Each node represents an author of a paper submitted to the General Relativity and Cosmology category in arXiv, and if two authors coauthored a paper in the arXiv category from January 1993 to April 2003, they are connected by an edge in the network.

The results of the strategies in the networks are illustrated in Fig. 3. We plot the density of infection versus the immunization rate p for β = µ = 0.2 [Figs. 3(a-d)]. The density of infection of efficient strategies is often higher than that of random edge immunization for small immunization rate p, but for sufficiently large p, the density of infection drops quickly and herd immunity is achieved at a lower $p_c$. The HIT $p_c$ is illustrated in Figs. 3(e-h). One way to calculate the effective HIT is to calculate the minimally required immunization rate to lower the density of infection below 1/N. However, when simulating the stationary states of epidemic processes, if the system reaches its absorbing state, we arbitrarily adjust the system by reverting it back to one of its histories (quasistationary method) or activating a single site [81]. Therefore, the state whose number of infected nodes is close to zero is highly influenced by the choice of the simulation method, which is not part of the epidemic model. To solve this problem, one can choose the herd immunity condition as a density of infection of 1%, which is sometimes used as a threshold to be considered as subextensive in networks [82]. However, in real-world situations, an epidemic prevalence of 1% is still an alarming scenario, and the epidemic cannot be considered under control. By choosing a density of infection of $\min(0.01, 1/\sqrt{N})$ as the herd immunity condition, this dilemma can be resolved. In the thermodynamic limit N → ∞, the epidemic density of the herd immunity condition converges to zero while the number of infected nodes approaches infinity.
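The herd immunity condition above can be applied to a simulated curve of density versus immunization rate as in the following sketch. The function name `herd_immunity_threshold` is hypothetical, and two choices are assumptions of this example: the HIT is read off as the smallest sampled p whose density falls strictly below the cutoff, and no interpolation between sampled points is attempted.

```python
import math

def herd_immunity_threshold(p_values, densities, n_nodes):
    """Smallest sampled immunization rate meeting the herd immunity condition.

    p_values  : immunization rates at which the density was measured
    densities : stationary density of infection at each rate
    n_nodes   : number of nodes N in the network or hypergraph
    Returns None when no sampled rate brings the density below the cutoff.
    """
    cutoff = min(0.01, 1.0 / math.sqrt(n_nodes))
    for p, rho in sorted(zip(p_values, densities)):
        if rho < cutoff:
            return p
    return None
```

For N = 5000 the cutoff is 0.01, because $1/\sqrt{5000} \approx 0.014$ exceeds 1%. For N above 10000 the $1/\sqrt{N}$ term becomes the binding condition.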

The recovery rate is fixed to µ = 0.2. The HITs of the three efficient strategies are almost identical. To compare the HITs of the efficient strategies more thoroughly, we plot the differences between the eigenscore strategy and the two other strategies in Figs. 3(i-l). The EI-based strategy is generally slightly more efficient than the eigenscore strategy, but it does not have an advantage over the SIP-based strategy, despite its higher computational cost. Rather, the SIP-based strategy has a small advantage in the networks studied here, although the differences are marginal.

[Figure caption fragment, apparently from Fig. 4: The efficient strategies generally exhibit a higher density of infection for small p, but herd immunity is achieved at lower $p_c$, which is the minimally required portion of hyperedges that needs to be immunized to eliminate epidemics. (c, d) HIT $p_c$ as a function of contagion rate $\beta = \beta_3$. The H-ES and SIP-based strategies outperform the EI-based strategy, despite their lower computational cost.]

Then, we test the strategies in 3-uniform hypergraphs. We use two synthetic models of 3-uniform hypergraphs: a static model with 2000 nodes, 4000 hyperedges, and degree exponent 3, and the h-PSO model introduced in Sec. II C with the same number of nodes, hyperedges, and degree exponent. The temperature T = 0.5 and R = 1 result in the clustering coefficient C(H) = 1.0430. We illustrate the results in Fig. 4. The density of infection ρ of the strategies versus the immunization ratio p for $\beta = \beta_3 = \mu = 0.2$ is depicted in Figs. 4(a, b). The H-eigenscore and SIP-based methods result in a higher density of infection for small immunization ratios, but eventually yield a smaller HIT $p_c$ for herd immunity [Figs. 4(c, d)]. The recovery rate is fixed to µ = 0.2.

We test the SIP-based strategy in two empirical hypergraphs with various hyperedge sizes. One is the congressional bill cosponsorship hypergraph [19, 30], which has 536 nodes and 2773 hyperedges whose mean size is 16.57 and maximum size is 323. Each node represents a US congressperson, and if a set of d congresspeople cosponsored a bill in the year 2000, they are connected by a hyperedge of size d. The other is the protein interaction hypergraph [32], which has 8243 nodes and 6688 hyperedges whose mean size is 10.12 and maximum size is 421. Each node in the hypergraph represents a protein, and each hyperedge represents a type of multiprotein complex. Due to the large and heterogeneous size of hyperedges, only the SIP-based strategy can efficiently be implemented in these systems. We compare the density of infection of the strategy with that of random immunization in Figs. 5(a, b). The recovery rate is µ = 0.2, and the contagion rate of the hyperedges is set to $\beta_d = \beta = 0.2$ independently of their sizes. While random immunization requires the majority of hyperedges to be immunized to eliminate the epidemics, the SIP-based strategy achieves it with a small $p_c$. The HITs are plotted for various contagion rates $\beta = \beta_d$ in Figs. 5(c, d).
The immunization rates of hyperedges of each size are shown in Figs. 5(e, f). Although removing a large hyperedge affects a large number of nodes, small hyperedges are primarily immunized, especially when the contagion rate β is low. This is because the nodes that are connected by a small hyperedge interact more strongly. It is interesting to point out that an epidemic containment strategy that immunizes groups in descending order of their size was effective in the localized regime [36] of higher-order epidemics [83].

FIG. 5. (a, b) The density of infection ρ versus the removed portion of the edges p. The recovery rate is µ = 0.2 and the contagion rate is $\beta = \beta_d = 0.2$ for all hyperedge sizes d. The efficient strategies generally exhibit a higher density of infection for small p, but herd immunity is achieved at a lower HIT $p_c$, which is the minimum portion of hyperedges that needs to be immunized to eliminate the epidemics. (c, d) HIT $p_c$ as a function of the contagion rate $\beta = \beta_d$. Efficient epidemic containment is achieved by the SIP-based method with low computational cost. (e, f) The immunization rate of hyperedges with size d plotted for various contagion rates. Small hyperedges are primarily targeted by the immunization strategy, especially when β is low.
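The size-resolved immunization rate discussed above can be computed from any hyperedge ranking. The following is a minimal sketch, assuming that a fraction p of the hyperedges is immunized in descending order of a precomputed score; the function name `immunization_rate_by_size` and the toy scores are hypothetical and are not data from this work.

```python
from collections import Counter

def immunization_rate_by_size(hyperedges, scores, p):
    """Fraction of hyperedges of each size d that fall in the top-p share by score."""
    order = sorted(range(len(hyperedges)), key=lambda k: scores[k], reverse=True)
    n_immunized = round(p * len(hyperedges))
    total = Counter(len(h) for h in hyperedges)
    immunized = Counter(len(hyperedges[k]) for k in order[:n_immunized])
    return {d: immunized[d] / total[d] for d in sorted(total)}

# Toy hypergraph with hyperedges of size 2, 3, and 4 (illustrative scores only).
toy_edges = [(0, 1), (1, 2), (0, 1, 2), (2, 3, 4), (0, 1, 2, 3)]
toy_scores = [0.5, 0.4, 0.3, 0.1, 0.05]
rates = immunization_rate_by_size(toy_edges, toy_scores, p=0.4)
# rates == {2: 1.0, 3: 0.0, 4: 0.0}
```

In this toy case the two highest-scoring hyperedges are both of size 2, so the whole immunization budget is spent on the smallest hyperedges.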

In summary, we proposed an effective immunization strategy that immunizes hyperedges with high SIP and can be used in general hypergraphs, including networks. Hyperedges with high SIP are "hotspots" of the epidemics, and they can be identified and immunized. In the case of information spreading processes, such hyperedges can be fostered to boost the information flow in the system. We also showed that the H-eigenscore is a natural generalization of the eigenscore for hypergraphs: if all the hyperedges in a hypergraph have size 2, the H-eigenscore becomes identical to the eigenscore used in networks. Immunizing hyperedges with a high H-eigenscore effectively contains the epidemics, but the method can only be implemented in uniform hypergraphs.
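The reduction of the H-eigenscore to the network eigenscore can be written out explicitly. As a sketch, assume the standard H-eigenpair condition for the adjacency tensor $a_{i i_2 \cdots i_d}$ of a d-uniform hypergraph, and assume that the score $s_h$ of a hyperedge h is the product of the eigenvector components of its nodes. The symbols $a$, $x$, $\lambda$, and $s_h$ are introduced here for illustration, and their normalization may differ from the convention of the main text.

```latex
\begin{align}
\sum_{i_2, \dots, i_d} a_{i i_2 \cdots i_d}\, x_{i_2} \cdots x_{i_d}
  &= \lambda\, x_i^{\,d-1},
  \qquad s_h = \prod_{i \in h} x_i
  \qquad \text{($d$-uniform hypergraph)}, \\
\sum_{j} a_{ij}\, x_j
  &= \lambda\, x_i,
  \qquad s_{(i,j)} = x_i x_j
  \qquad \text{($d = 2$, network)}.
\end{align}
```

For d = 2 the tensor is the adjacency matrix, the H-eigenpair condition becomes the ordinary eigenvalue equation, and the hyperedge score becomes the product of the two end-node components of the leading eigenvector.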

We tested the performance of the method and compared it with the EI-based method, the state-of-the-art immunization strategy, in networks and hypergraphs. In networks, the HIT $p_c$ of the SIP-based strategy is marginally smaller than that of the EI-based strategy, despite its lower computational cost. In hypergraphs, the SIP-based strategy yields a significantly smaller HIT $p_c$ with lower computational cost. This suggests that SIP can serve as a centrality measure for hyperedges in general hypergraphs. The large disparity between the $p_c$ of an efficient immunization strategy and that of random immunization calls for scientific, data-driven, systematic policy-making for containment measures to eliminate epidemics with the minimum use of resources for vaccination and minimal regulation of air traffic and social gatherings.

The IBMF used to calculate the SIP tends to overestimate the infection probability of the nodes (and, as a consequence, the global prevalence) because it ignores the correlations between neighboring nodes.
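The IBMF calculation of the SIP can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in this work: a susceptible node is assumed to be infected through a hyperedge at rate β only when all other members of the hyperedge are infected, nodes recover at rate µ, and the SIP of a hyperedge is approximated by the product of the stationary infection probabilities of its nodes. That product form is exactly where correlations between neighboring nodes are discarded. The function names, the Euler integration, and the toy hypergraph are hypothetical choices.

```python
from math import prod

def ibmf_stationary(n_nodes, hyperedges, beta, mu,
                    p0=0.9, dt=0.05, steps=20000, tol=1e-10):
    """Euler-integrate dp_i/dt = -mu*p_i + (1 - p_i) * sum over hyperedges h
    containing i of beta * prod_{j in h, j != i} p_j until it is stationary."""
    p = [p0] * n_nodes
    for _ in range(steps):
        force = [0.0] * n_nodes
        for h in hyperedges:
            for i in h:
                # Node i is exposed through h only if all other members are infected.
                force[i] += beta * prod(p[j] for j in h if j != i)
        new_p = [min(1.0, max(0.0, pi + dt * (-mu * pi + (1.0 - pi) * fi)))
                 for pi, fi in zip(p, force)]
        delta = max(abs(a - b) for a, b in zip(new_p, p))
        p = new_p
        if delta < tol:
            break
    return p

def simultaneous_infection_probability(hyperedges, p):
    """IBMF estimate of the SIP: product of node infection probabilities."""
    return [prod(p[j] for j in h) for h in hyperedges]

# Toy example: a dense core on nodes 0-3 plus one peripheral hyperedge (3, 4, 5).
hyperedges = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3), (3, 4, 5)]
p = ibmf_stationary(6, hyperedges, beta=1.0, mu=0.2)
scores = simultaneous_infection_probability(hyperedges, p)
# Hyperedges sorted from highest to lowest SIP (candidate immunization order).
ranking = sorted(range(len(hyperedges)), key=lambda k: scores[k], reverse=True)
```

In this toy hypergraph the peripheral hyperedge receives the lowest SIP, so a SIP-based ranking would immunize the hyperedges inside the dense core first. The initial condition `p0` matters because the stationary state of higher-order contagion can depend on the initial density of infection.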

The recently introduced microscopic epidemic clique equations (MECLE) [84], which generalize the epidemic-link equations to higher-order group interactions, predict the density of infection and epidemic thresholds by taking the dynamical correlations between neighboring nodes into account. An interesting direction for future work is to examine how the performance of the SIP-based immunization strategy is affected when the dynamical correlations are considered. Accounting for such correlations rapidly becomes infeasible as the size of the hyperedges grows; therefore, it should be studied in hypergraphs whose hyperedges are not too large.

Critical phenomena in complex networks

Coevolution of dynamical states and interactions in dynamic networks

Evolution of networks

Nonequilibrium transitions in complex networks: A model of social interaction

Theory of rumour spreading in complex social networks

Epidemic processes in complex networks

Network Effects and Personal Influences: The Diffusion of an Online Social Network

A Prospective and Retrospective Look at the Diffusion Model

Opinion Fluctuations and Disagreement in Social Networks

Ising-based model of opinion formation in a complex network of interpersonal interactions

Influentials, Networks, and Public Opinion Formation

Non-Markovian majority-vote model

Network-based research in entrepreneurship

The Structure and Function of Complex Networks

Complex networks: Structure and dynamics

Disease and information spreading at different speeds in multiplex networks

Random Spatial Network Models for Core-Periphery Structure

Graph evolution

Connecting the Congress: A Study of Cosponsorship Networks

Epidemic spreading in scale-free networks

Epidemic outbreaks in complex heterogeneous networks

Mixed-Order Phase Transition in a One-Dimensional Model

Mixed-order phase transition in a two-step contagion model with a single infectious seed

Universal mechanism for hybrid percolation transitions

From networks to optimal higher-order models of complex systems

Networks beyond pairwise interactions: Structure and dynamics

Message-passing approach to epidemic tracing and mitigation with apps

Centrality measures in simplicial complexes: Applications of topological data analysis to network science

The shape of collaborations

Simplicial closure and higher-order link prediction

Homological scaffolds of brain functional networks

Evolution of Cooperation in the Presence of Higher-Order Interactions: From Networks to Hypergraphs

Evolutionary dynamics of higher-order interactions in social networks

Random walks on hypergraphs

Master equation analysis of mesoscopic localization in contagion dynamics on higher-order networks

Phase transitions and stability of dynamical processes on hypergraphs

Homological percolation transitions in growing simplicial complexes

Simplicial models of social contagion

Simplicial SIS model in scale-free uniform hypergraph

Abrupt phase transition of epidemic spreading in simplicial complexes

The effect of heterogeneity on hypergraph contagion models

Simplicial SIRS epidemic models with nonlinear incidence rates

Efficient immunization strategies for computer networks and populations

Immunization and epidemic dynamics in complex networks

Finding a Better Immunization Strategy

Immunization of networks with community structure

Immunization of complex networks

Decreasing the spectral radius of a graph by link removals

Effective approach to epidemic containment using link equations in complex networks

Nonmassive immunization to contain spreading on complex networks

Optimal Allocation of the Limited COVID-19 Vaccine Supply in South Korea

Epidemic spreading in real networks: an eigenvalue viewpoint

Discrete-time Markov chain approach to contact-based disease spreading in complex networks

Epidemic dynamics and endemic states in complex networks

Analytical Computation of the Epidemic Threshold on Temporal Networks

Universal Behavior of Load Distribution in Scale-Free Networks

Intrinsic degree-correlations in the static model of scale-free networks

Skeleton and Fractal Scaling in Complex Networks

Two order parameters for the Kuramoto model on complex networks

Recent Advances of Percolation Theory in Complex Networks

Enhanced storage capacity with errors in scale-free Hopfield neural networks: An analytical study

Network Mapping by Replaying Hyperbolic Growth

Proc. 20th ACM SIGKDD Int. Conf. Knowl. Discov. data Min

Node2vec: Scalable feature learning for networks

Subgraph centrality and clustering in complex hyper-networks

Matrix measures for transitivity and balance

Popularity versus similarity in growing networks

Epidemic Dynamics in Open Quantum Spin Systems

Singular values and eigenvalues of tensors: a variational approach

Eigenvalues of a real supersymmetric tensor

Symmetric Tensors and Symmetric Tensor Rank

Quasistationary analysis of the contact process on annealed scale-free networks

Epidemic thresholds of the susceptible-infected-susceptible model on networks: A comparison of numerical and theoretical results

Growing scale-free networks with tunable clustering

Tricritical directed percolation with long-range interaction in one and two dimensions

Dismantling efficiency and network fractality

Social Confinement and Mesoscopic Localization of Epidemics on Networks

Network clique cover approximation to analyze complex contagions through group interactions

This research was supported by the NRF, Grant No. NRF-2014R1A3A2069005.