1 Introduction

Epidemics are among the most prominent and fundamental processes that take place in networks, since they represent a diverse set of phenomena, e.g., the spread of information in social media or of a biological virus among the global human population [4, 14, 15]. In a network epidemic, nodes have an epidemic state (infected or susceptible) that may change over time, while contagion occurs through network edges. It is not surprising that, over the past decades, many works have focused on developing models to identify the sources of real network epidemics for various kinds of phenomena. More importantly, the focus has also been on understanding the impact that different measures, such as quarantine and immunization, have on the epidemic process, to cite a recent example largely debated during the coronavirus pandemic [7].

An epidemic starts with the infection of a single node or a few nodes, known as the epidemic sources. From this initial set of nodes, the epidemic unfolds through the edges of the network, potentially reaching a large fraction of the network nodes. Precise information about the epidemic process, such as timing or contagion information, is often not available. Therefore, a fundamental problem in network epidemics is identifying the set of source nodes after the epidemic has reached a significant fraction of the network [11, 20]. This problem is known to be hard for simple epidemics, even when the epidemic starts at a single node and the network is infinite [18, 19].

The difficulty in identifying the source nodes strongly depends on the observation process, i.e., what is observed from the network epidemic. A common snapshot model consists of a single, instantaneous snapshot that reveals the epidemic state of all network nodes [11, 20]. If the snapshot is taken very early in the epidemic process, very few nodes will be infected and identifying the source nodes is intuitively easier. On the other hand, if it is taken very late, i.e., once all nodes have been infected, there is no information that can be used to identify the source nodes. The time the snapshot is taken can be translated into the moment the epidemic reaches a certain fraction of nodes, which is the model used in this paper.

The source identification problem can be posed as a classification problem: given an epidemic snapshot, every node is to be classified as being an epidemic source or not [3, 17]. Besides the epidemic state of the nodes provided in the snapshot, the network structure is also available to the classification model. Therefore, a natural classification model is one that takes advantage of the network over which the epidemic unfolds. Graph neural networks (GNNs) seem well-suited for this task since they integrate neighboring information. However, two nodes that have many neighbors in common are likely to have similar representations under classic GNN models, which is ill-suited for source identification, since neighbors of a source node are often not sources themselves (depending on how source nodes are chosen).

In order to tackle this issue, this work leverages the epidemic information within the k-hop neighborhood of the node. Two metrics are proposed: the fraction of nodes infected at the k-hop neighborhood and the average infection rate at the k-hop neighborhood (to be discussed in detail). These metrics are a function of the epidemic snapshot and are computed for every network node efficiently. The information, i.e., vector of rational numbers for each metric, is taken as node attribute, distinguishing neighboring nodes much beyond their epidemic state and hence providing better inputs to the GNN model.

The remainder of this paper is organized as follows. In Sect. 2, network epidemics and the multiple source detection problem are introduced. A brief discussion of the related work is presented in Sect. 3. In Sect. 4, the proposed framework is presented. The evaluation of the proposed framework under different scenarios is shown in Sect. 5. Finally, Sect. 6 presents a brief conclusion.

2 Network Epidemics and Problem Statement

Network epidemics reflect different phenomena, such as influence models that describe “social contagion” [2] among individuals of an offline or online social network, and infection models that describe the contagion of a biological disease among a population. While social and biological epidemics have fundamentally different underlying contagion mechanisms, their essence can be represented by simple epidemic models.

The simplest epidemic model is the compartmental model where each individual is in one of only two possible epidemic states: susceptible or infected. This model is known as the classic SI epidemic model [12], where individuals can only change from susceptible to infected. In the context of networks, individuals are represented by network nodes, and edges indicate the possibility of contagion. Thus, let \(G=(V,E)\) denote an undirected network (graph) where V and E denote the set of nodes and edges, respectively, and \(n= |V|\) denotes the number of nodes.

The SI network epidemic model here considered is a discrete time model. Let I(t) and S(t) denote the set of infected and susceptible nodes at time t. Note that \(I(t) \bigcup S(t) = V\) and \(I(t) \bigcap S(t) = \emptyset \) for all \(t=0,1,\ldots \). Let \(I_0 \subset V\) denote the set of epidemic sources, namely, the set of nodes that are infected at time zero, thus, \(I(0) = I_0\). At each time slot, an infected node infects each of its susceptible neighbors with probability p. Note that infection fails with probability \(1-p\), but the infected node is able to infect the same susceptible neighbor with probability p on subsequent time slots.

A susceptible node that has j infected neighbors in the current time step becomes infected in the next time step with probability \(1-(1-p)^j\). All infections within a time slot occur simultaneously, and nodes never recover, so I(t) is non-decreasing with t. Last, assuming the network is connected, it can be shown that there exists a finite, large enough \(t^*\) such that \(I(t^*) = V\) with high probability, and thus all nodes are infected.

The source set \(I_0\) is chosen uniformly at random among all nodes when the epidemic starts, and \(s = |I_0|\) is the parameter that determines the number of epidemic sources.
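The dynamics above can be sketched with a minimal discrete-time SI simulation (a hypothetical helper; the function name and interface are illustrative, not from the paper):

```python
import random

def simulate_si(adj, sources, p, f_o, seed=0, max_steps=100000):
    """Simulate a discrete-time SI epidemic until a fraction f_o is infected.

    adj: dict mapping each node to a list of its neighbors (undirected graph).
    sources: initial infected set I_0.
    p: per-edge, per-slot infection probability.
    Returns the infected set I(t_o) and the snapshot time t_o.
    """
    rng = random.Random(seed)
    n = len(adj)
    infected = set(sources)
    t = 0
    while len(infected) / n < f_o and t < max_steps:
        # One Bernoulli(p) trial per infected-susceptible edge; a susceptible
        # node with j infected neighbors is infected with prob. 1 - (1 - p)**j.
        newly = {v for u in infected for v in adj[u]
                 if v not in infected and rng.random() < p}
        infected |= newly  # all infections in a slot take effect together
        t += 1
    return infected, t
```

For instance, on a path 0–1–2–3 with \(p=1\), a single source 0 and \(f_o=0.5\), the snapshot is taken after one step, with \(I(t_o)=\{0,1\}\).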

2.1 Epidemic Observation

The snapshot observation model reveals the sets \(I(t_o)\) and \(S(t_o)\), i.e., the epidemic state of all nodes at a given observation time \(t_o\). Note that G (the network) is also assumed to be known, and it is sufficient to observe \(I(t_o)\), since \(S(t_o) = V \setminus I(t_o)\).

The epidemic snapshot is taken when a predefined number of nodes have been infected. Let \(f_o\) denote the fraction of infected nodes: the epidemic snapshot takes place the first time \(|I(t)|/n \ge f_o\). More precisely, \(t_o = \min _t \{ t \mid |I(t)|/n \ge f_o\}\). In what follows, \(f_o\) will be used to determine when the snapshot is taken.

2.2 Problem Statement

Consider an SI network epidemic with parameter p on the network \(G=(V,E)\), having started with the epidemic sources \(I_0\). A single epidemic snapshot is taken when the fraction of infected nodes reaches \(f_o\). Design a model M that classifies nodes in \(I(t_o)\) as epidemic sources, taking as input G and \(I(t_o)\). Ideally, the model should recover \(I_0\), namely

$$\begin{aligned} M_{G,I(t_o)}(v) = {\left\{ \begin{array}{ll} 1 \text{, } \text{ if } v \in I_0 \\ 0 \text{, } \text{ otherwise } \end{array}\right. } \end{aligned}$$
(1)

Note that the model M has no prior information about p, the parameter for the SI epidemic. M is trained on different networks and epidemics and it should classify nodes for networks and epidemics never seen before. In fact, obtaining such generalization is the main challenge in this problem. In addition, in most scenarios the number of epidemic sources is very small with respect to the number of nodes infected in the snapshot, often by a factor of one hundred or more, hence, this classification task deals with very imbalanced classes.

3 Related Work

For over a decade, efforts have been made to address the problem of identifying the source of a network epidemic [11, 20]. The pioneering methods considered single-source epidemics and simple networks while providing efficient algorithms to identify the source [19]. These approaches were based on simple probabilistic epidemic models and likelihood functions given the observation [5, 10, 23]. The rumor centrality algorithm proposed by Shah and Zaman is a prominent example that has been widely explored and extended [18, 19].

More recent works have tackled the multiple source detection (MSD) problem, where the epidemic starts simultaneously in different network nodes, motivated by applications in which the diffusion process originates at various nodes, e.g., information in an online social network [3, 17, 21, 24]. NetSleuth [17] is an approach to infer the multiple epidemic sources in the probabilistic SI infection model using a likelihood function approach and Minimum Description Length (MDL) scores. NetSleuth requires prior knowledge of the number of epidemic sources. Another approach is LPSI (Label Propagation based Source Identification) [21], where labels assigned to nodes are propagated in iterations to build a weight matrix that is used to identify the multiple epidemic sources (the top-ranked nodes according to a metric). The work of Zang et al. [24] considers SIR (Susceptible-Infected-Recovered) epidemics and a limited observation model by building “extended infected node sets” upon LPSI that are partitioned and ranked using a network centrality metric. Different numbers of epidemic sources are considered independently in order to determine the actual number of sources.

While the above approaches are based on likelihood functions and label propagation, learning-based approaches have been recently proposed to tackle the MSD problem. A prominent example is Graph Convolutional Networks based Source Identification (GCNSI) [3] where a GCN model is trained with supervision to generate node representations.

The approach uses LPSI as part of the node attributes and considers different epidemic models beyond SI (Susceptible-Infected), such as IC (Independent Cascade) and SIR. However, different configurations of GCN need to be trained for different networks. GCNSI is the closest approach to the framework designed in this paper, and results will be compared directly for scenarios that are available (see Sect. 5).

4 Proposed Framework

Recall that, in Graph Neural Networks, node attributes serve as the initial representation (input) to the neural network that generates the node representation. While the epidemic state of the node is the main node attribute, it alone is insufficient to adequately distinguish nodes during latent representation generation, whereas the representation of sources should be strongly distinguishable from that of the other nodes. The framework here proposed, as well as the evaluation scenarios, is publicly available (see Footnote 1).

4.1 Metrics and Attributes

The idea behind the proposed attributes is to reflect the epidemic state around a node in its k-hop neighborhood. Intuitively, the epidemic states of the k-hop neighborhoods of source nodes should be similar to one another but different from the epidemic states of the k-hop neighborhoods of nodes infected late in the epidemic.

Therefore, two metrics are proposed to capture the epidemic state of k-hop neighborhoods: Ring Infection (RI) and Depth Ring Infection (DRI). The former leverages the average epidemic state of nodes at distance k, while the latter leverages the average neighboring epidemic state of nodes at distance k. Given the snapshot \(I(t_o)\) and considering a node u, the RI metric is given by:

$$\begin{aligned} \alpha ^u_k=\frac{\sum _{v \in N_k(u)} I_{v}}{|N_k(u)|} , \text{ for } k=0,\ldots ,K , \end{aligned}$$
(2)

where \(N_k(u)\) denotes the k-hop neighborhood of node u, or equivalently, the set of nodes at distance k from node u, and \(I_v\) denotes the epidemic state of node v in the snapshot \(I(t_o)\). In particular,

$$\begin{aligned} I_v ={\left\{ \begin{array}{ll} 1 \text{, } \text{ if } v \in I(t_o) \\ 0 \text{, } \text{ otherwise } \end{array}\right. } \end{aligned}$$
(3)

Note that \(\alpha ^u_0 = I_u\) is simply the epidemic state of node u. Also, K is the parameter that determines the maximum distance to be considered in the attribute generation.

The DRI metric depends on the epidemic state of the neighbors of nodes that are at distance k from node u (and not just the epidemic state of the node). In particular, DRI is the average of RI at distance 1 among the nodes that are at distance k from node u. DRI is given by:

$$\begin{aligned} \beta ^u_k=\frac{\sum _{v \in N_k(u)} \alpha _1^v}{|N_k(u)|} , \text{ for } k=1,\ldots ,K. \end{aligned}$$
(4)

A single node can be counted multiple times in this metric, since it can be a neighbor of many nodes at distance k from u. Thus, DRI provides information that is significantly different from RI.

The calculation of RI and DRI requires determining the k-hop neighborhoods of all nodes in the network. For a given node u, a Breadth First Search (BFS) starting at u efficiently determines the k-hop neighborhoods in linear time, \(O(|V|+|E|)\). Since K is often small when compared to the network diameter, the BFS stops before reaching all network nodes. Moreover, RI and DRI for different nodes can be computed fully in parallel, as there are no dependencies. This allows for a time- and memory-efficient calculation.

Finally, RI and DRI are used to determine the attribute vector of each node. In particular, node u has an attribute vector that is the concatenation of \(\alpha ^u_k\) and \(\beta ^u_k\) over all k, whose dimension is \(2K + 1\) (since \(\alpha ^u\) has \(K + 1\) values and \(\beta ^u\) has K values). This is the initial representation (input) to the GNN model.
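As an illustration, the attribute computation can be sketched as follows (a dependency-free sketch with hypothetical names; the released code may differ):

```python
from collections import deque

def khop_layers(adj, u, K):
    """BFS from u, returning the lists of nodes at distance 0, 1, ..., K."""
    dist = {u: 0}
    layers = [[u]] + [[] for _ in range(K)]
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if dist[x] == K:
            continue  # do not expand beyond distance K
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                layers[dist[y]].append(y)
                queue.append(y)
    return layers

def node_attributes(adj, infected, u, K):
    """Concatenation of RI (alpha_0..alpha_K) and DRI (beta_1..beta_K) for u."""
    state = lambda v: 1.0 if v in infected else 0.0
    # alpha^v_1: fraction of infected direct neighbors of v (RI at distance 1)
    alpha1 = {v: sum(state(w) for w in adj[v]) / len(adj[v]) for v in adj}
    layers = khop_layers(adj, u, K)
    ri = [sum(state(v) for v in L) / len(L) if L else 0.0 for L in layers]
    dri = [sum(alpha1[v] for v in L) / len(L) if L else 0.0 for L in layers[1:]]
    return ri + dri  # dimension 2K + 1
```

On a path 0–1–2–3 with \(I(t_o)=\{0,1\}\) and \(K=2\), node 0 gets the attribute vector \([1.0, 1.0, 0.0, 0.5, 0.5]\).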

4.2 Graph Neural Network Model

Graph Neural Networks (GNNs) integrate structural information with attribute information (node labels) in order to generate representations for network nodes. A GNN is a stack of neural network layers whose input is the node representation along with an aggregated representation of its neighbors, where the aggregation can take several forms, such as averaging. In practice, the input to the first layer corresponds to the node attributes, while the output of the last layer corresponds to the node representation. The implementation discussed here adopts a small number of layers, e.g., three layers, to avoid diluting the signal of the node’s neighborhood.

While it is possible to make GNN models learn representations without supervision, GNNs may also be trained in a supervised fashion. In this case, the output of the last layer is used to compute a loss function that then drives the training of the model parameters. This work uses a GNN in a classification task whose final goal is to classify an epidemic network node as being an epidemic source. Therefore, a sigmoid function is applied to the output of the last (third) layer in order to generate values between 0 and 1. The first and second layers use ReLU as the activation function, together with a dropout layer to help avoid overfitting. The layers adopted are the SAGE layers proposed by GraphSAGE, given its wide applicability and efficiency [9].

4.3 Loss Function

Since the problem consists of binary classification, the binary cross entropy function was adopted as the loss function. In particular, for each node u:

$$\begin{aligned} \ell _u(I_0) = -\left[ c_u \log \sigma \left( y_u\right) +\left( 1-c_u\right) \log \left( 1-\sigma \left( y_u\right) \right) \right] , \end{aligned}$$
(5)

where \(c_u\) indicates if node u is an epidemic source (\(u \in I_0\), the ground truth), \(y_u\) is the output of the neural network for node u, and \(\sigma \) is the sigmoid function applied to the output of the neural network.

The average loss function across all nodes must account for the fact that classes are very imbalanced in this problem. To that end, a weighted average is considered, where \(w_0\) and \(w_1\) denote the weights for each class. The weights must consider the number of nodes in each class: \(w_0 = r_0/(1 - |I_0|/n)\) and \(w_1 = r_1/(|I_0|/n)\), where \(r_0\) and \(r_1\) are constants used to increase the class weight beyond the class balance ratio; in this work, \(r_0=1\) and \(r_1=100\) (chosen experimentally). This indicates to the model the importance of correctly classifying the epidemic source nodes. The average loss function is given by:

$$\begin{aligned} \ell (I_0)= \frac{w_0 \sum _{u \not \in I_0} \ell _u(I_0) + w_1 \sum _{u \in I_0} \ell _u(I_0)}{(n - |I_0|)w_0 + |I_0|w_1} \end{aligned}$$
(6)
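A minimal, dependency-free sketch of this weighted loss (treating Eq. (5) as the standard negative log-likelihood; function and variable names are illustrative):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def weighted_bce(outputs, sources, n, r0=1.0, r1=100.0):
    """Class-weighted binary cross entropy over all nodes.

    outputs: dict node -> raw model output y_u (before the sigmoid).
    sources: ground-truth source set I_0.
    n: total number of nodes.
    """
    s = len(sources)
    w0 = r0 / (1.0 - s / n)  # weight of the (large) non-source class
    w1 = r1 / (s / n)        # weight of the (rare) source class
    total = 0.0
    for u, y_u in outputs.items():
        c_u = 1.0 if u in sources else 0.0
        # per-node binary cross entropy (negative log-likelihood)
        l_u = -(c_u * math.log(sigmoid(y_u))
                + (1.0 - c_u) * math.log(1.0 - sigmoid(y_u)))
        total += (w1 if c_u else w0) * l_u
    return total / ((n - s) * w0 + s * w1)
```

Since in this setting \(n - |I_0| \gg |I_0|\), the resulting \(w_1 \gg w_0\), pushing the model to prioritize the rare source class.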

4.4 Datasets and Training

An arbitrarily large and diverse dataset with ground truth concerning epidemic sources is generated for this problem. For each sampled network, the set \(I_0\) (epidemic sources) is chosen uniformly at random, and an SI epidemic is simulated until a fraction \(f_o\) of the nodes is infected, giving rise to one epidemic network sample. This entire process is repeated to generate independent samples of network epidemics. Real networks are also covered by the dataset, in which case the network is always the same (but neither \(I_0\) nor the epidemic network is). The dataset is then split for training, validation and testing.

This work made use of PyTorch Geometric [6], which employs PyTorch [16] and provides a number of GNN layers. The Adam optimizer was used with a low learning rate of 0.001 and all network nodes were considered as a single batch at each learning epoch. As for sampling neighboring nodes for GNN training, all nodes within a distance of 3 from the target node were considered. This sampling strategy is used to balance computational cost and network coverage as nodes closer to the target carry more important information.

The model was trained for 500 epochs and an early stopping criterion is adopted: if no improvement in the loss function is observed on the validation network for 75 consecutive epochs, the training with the input network stops in order to avoid overfitting. The experiments were carried out on a computer with an Intel Core i7-11800H, an NVIDIA GeForce 3070 GPU with 8 GB, and 16 GB of RAM.

5 Evaluation

The performance of different approaches to identify epidemic sources depends on various factors, including network structure. In order to better assess the proposed framework, two random graph models and two real networks are considered. The performance of the proposed approach will be characterized using two different criteria: identifying epidemic source nodes and identifying neighbors of source nodes.

5.1 Network Models and Real Networks

The Erdős-Rényi (ER) random graph model, also known as G(n, p), is a classic model that has been widely studied [8]. The model yields a very homogeneous graph, with no special structure, in which each possible edge is present in the n-node graph with probability p. In contrast, the Barabási-Albert (BA) model follows an iterative process driven by preferential attachment and generates a scale-free network, better representing real networks such as the web and the AS graph [1]. Networks generated by these two models are strikingly different, and both will be used in the evaluation. To allow for a direct comparison, the parameters of the two random graph models are chosen such that the generated networks are connected and have the same number of nodes and edges, on average.
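For reference, the two generators can be sketched in a few lines (a simplified, dependency-free illustration; in practice a library such as NetworkX would typically be used):

```python
import random

def erdos_renyi(n, p, seed=0):
    """G(n, p): each of the n*(n-1)/2 possible edges is present with prob. p."""
    rng = random.Random(seed)
    adj = {u: [] for u in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj

def barabasi_albert(n, m, seed=0):
    """BA preferential attachment: each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    adj = {u: [] for u in range(n)}
    repeated = []             # node list weighted by current degree
    targets = list(range(m))  # seed targets: the first m nodes
    for u in range(m, n):
        for v in set(targets):
            adj[u].append(v)
            adj[v].append(u)
        # record the new endpoints so future draws are degree-proportional
        repeated.extend(adj[u])
        repeated.extend([u] * len(adj[u]))
        targets = [rng.choice(repeated) for _ in range(m)]
    return adj
```

Matching nodes and average edge counts between the two models then amounts to picking p such that \(p \binom{n}{2} \approx m \cdot (n - m)\).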

As for networks modeled after real-life data, the Facebook ego network is a social network consisting of the ego-networks of ten Facebook users [13]. The power grid network represents the electrical power grid of the western United States, with generators, transformers and substations acting as network nodes [22]. These two real networks are also very different in terms of node degrees and distances. Table 1 summarizes some details on the networks considered.

Table 1. Network information (\(^*\) indicates average values).

For each random graph model and set of parameters determining a scenario (such as the number of epidemic sources and the infection ratio), 30 independent epidemic networks are generated and split equally into training, validation and testing sets. When considering real networks, 50 independent epidemic networks are generated, from which 1/5 is used for training, 1/5 for validation and 3/5 for testing. The testing dataset is larger to ensure a better estimation of the average performance and allow a better comparison with prior works.

5.2 Evaluation Metrics

Accuracy is not an adequate metric to capture the performance of the framework, since there are very few epidemic sources (less than \(0.3\%\)) and thus a very imbalanced dataset. Metrics that capture the ability to correctly identify the sources are more adequate, as they take into account true positives, false positives and false negatives.

Recall that the proposed model outputs the probability that a node is an epidemic source. Therefore, given an epidemic network, each node is used as input to the trained model and nodes are ranked according to their probability in decreasing order (most probable first). From this list, the top-k nodes are taken as epidemic sources, where k is a parameter of the evaluation methodology. Using this methodology, the precision and recall values for the top-k are computed, and consequently the F-score metric. Note that k should be a function of the number of epidemic sources, which is given by \(s = |I_0|\).

Two different sets are considered as true positives when assessing the performance of the framework: i) the set \(I_0\), namely the epidemic source nodes; ii) the set \(O = I_0 \cup \bigcup _{u \in I_0} N_1(u)\), namely the epidemic sources and the neighbors of the epidemic sources (\(N_1(u)\) denotes the neighbors of node u). Note that O considers the neighbors of the epidemic source nodes as nodes classified correctly by the framework. Obviously, identifying nodes that are neighbors of an epidemic source has much more value than identifying nodes that are not neighbors.
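For concreteness, the top-k evaluation can be sketched as follows (hypothetical helpers, assuming per-node scores are already available):

```python
def topk_metrics(scores, positives, k):
    """Precision, recall and F-score of the top-k ranked nodes.

    scores: dict node -> predicted probability of being a source.
    positives: set of true positives (either I_0 or the extended set O).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    tp = sum(1 for v in ranked if v in positives)
    precision = tp / k
    recall = tp / len(positives)
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

def extended_positives(adj, sources):
    """The set O: epidemic sources plus all of their neighbors."""
    out = set(sources)
    for u in sources:
        out.update(adj[u])
    return out
```

For example, with scores \(\{0{:}\,0.9,\ 1{:}\,0.8,\ 2{:}\,0.1,\ 3{:}\,0.05\}\), true positives \(\{0, 2\}\) and \(k=2\), precision, recall and F-score are all 0.5.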

5.3 Results

Fig. 1. F-score for all four networks for both evaluation scenarios (w/ and w/o neighbors), for different numbers of epidemic sources (\(s \in \{3, 5, 10, 15\}\)) and different top-k values (as a function of s). In all scenarios, \(f_o=20\%\).

Figure 1 shows the performance of the proposed framework on all four networks (one network per subfigure). Each graph shows the F-score for the two scenarios, considering just the epidemic sources (set \(I_0\), w/o neighbors) and the epidemic sources and their neighbors (set O, w/ neighbors). Each curve corresponds to a different number of sources, \(s = |I_0|\), and the x-axis is the value of k in the top-k nodes, as a multiple of s. In all scenarios, \(f_o =\) 20%.

Figure 1 also shows that increasing k improves performance on all networks in the w/ neighbors evaluation scenario in most cases; one exception is Facebook Ego, where performance decreases when s is 5 or 10. Additionally, increasing the number of epidemic sources also improves performance on all networks in both scenarios, with the exception of BA networks. Note that for ER networks, performance clearly improves as s increases. In BA networks, having fewer epidemic sources yields better performance when considering the w/o neighbors scenario. This occurs possibly because of the distinctive network topology, where having a few epidemic sources allows for easier identification.

Fig. 2. Precision for all four networks for both evaluation scenarios (w/ and w/o neighbors), for different numbers of epidemic sources (\(s \in \{3, 5, 10, 15\}\)) and different top-k values. In all scenarios, \(f_o=20\%\).

Interestingly, Fig. 1 shows that performance is not always better in the w/ neighbors scenario. On the Power Grid network, leveraging neighbors clearly improves performance, while for the BA network the opposite is observed, and for ER networks results are mixed. Intuitively, the w/ neighbors scenario yields greater precision but potentially smaller recall values, as the set of true positives has increased significantly. This intuition is verified in Fig. 2.

Figure 2 shows precision values for the same scenarios as Fig. 1. Note that in all networks, precision for a fixed number of sources is always higher when considering the set O (w/ neighbors) in comparison to the set \(I_0\) (w/o neighbors). Moreover, for the ER, Facebook and Power Grid networks, having more epidemic sources improves precision for the O set in all scenarios. When considering \(s=15\), precision for the ER and Facebook networks is above 20% and 45%, respectively, for all top-k values considered, while for the Power Grid it is always below 8%. This indicates the importance of the network structure in identifying epidemic sources.

5.4 Results with Mixed Datasets

The previous results considered specialized datasets where all samples were independent and identically distributed, all having the same parameters. Such samples were used for training, validation, and testing. However, it is also important to consider the performance of the framework when the datasets are more diverse and not composed of identically distributed samples.

A heterogeneous dataset was generated for each network model. The samples in each dataset have different networks (from the same model), different numbers of sources (\(s \in \{3, 5, 10, 15\}\)) and different infection rates (\(f_o \in \{ 0.1, 0.2, 0.3 \}\)). The exception is real networks, where the network is always the same (but everything else is simulated to generate an epidemic network). This heterogeneous dataset is used for training, validating and testing a general model.

Results comparing the performance of the framework on the homogeneous and heterogeneous datasets are shown in Fig. 3. Each subfigure shows a different infection rate, \(f_o\), and each pair of bars corresponds to a different number of sources, s. Performance on the homogeneous (specialized) dataset is almost always superior to performance on the heterogeneous dataset. However, when the infection rate is small, e.g., \(f_o = 0.1\), the performance of the models is comparable for both ER and BA networks for different numbers of sources. This indicates that the framework can be effectively trained on a more general dataset when the infection rate is small. Indeed, when the infection rate is small, increasing the number of sources will still allow them to have a distinctive signature in the epidemic network.

Fig. 3. Performance of network models using the specialized/homogeneous dataset (Individual) and the generalized/heterogeneous dataset (General) for different numbers of sources (s) and different infection rates (\(f_o\)).

Figure 3 also shows that as the infection rate increases, performance degrades for both the homogeneous and heterogeneous datasets, but falls faster for the heterogeneous one. In fact, performance is zero for \(f_o = 0.3\) and \(s \in \{3,5\}\) for ER networks, showing that the model trained on the heterogeneous dataset cannot adequately identify source nodes in more difficult scenarios. A curious exception was BA for \(f_o = 0.2\), where performance for the heterogeneous dataset was in some cases (\(s \in \{3,5\}\)) higher.

Figure 4 shows the performance of using the homogeneous and heterogeneous datasets when considering the two real networks. Results clearly indicate that using a heterogeneous dataset is significantly worse than using a homogeneous dataset, even in the case \(f_o = 0.1\). Results also corroborate that, for a given infection rate, increasing the number of sources improves performance, for both the homogeneous and heterogeneous datasets. Moreover, results also corroborate that increasing the infection rate reduces the performance, again for both datasets. Note that performance under the homogeneous dataset is in some cases twice as large as under the heterogeneous dataset, indicating that generalization is much harder for real networks.

5.5 Comparison with Baselines

Only a few prior works have evaluated the multiple epidemic source detection (MSD) problem in scenarios that are reproducible and comparable. In fact, a contribution of this work is the evaluation of the proposed framework under well-studied network models and epidemic models that can be easily reproduced.

An exception is GCNSI [3], which performed numerical evaluations on real-world networks and reported their scenarios with sufficient detail, using the F-score metric, while also reporting results for LPSI and NetSleuth in the same scenarios. Table 2 compares F-score results for 3, 5 and 10 epidemic sources when 30% of the network is infected (\(f_o = 0.3\)) for the GCNSI, LPSI and NetSleuth approaches. Recall that only GCNSI uses Graph Convolutional Networks. Results for the proposed framework are shown in the row SAGE, with maximum and average values across different values of k in the top-k ranking.

Fig. 4. Performance of real networks using the specialized/homogeneous dataset (Individual) and the generalized/heterogeneous dataset (General) for different numbers of sources (s) and different infection rates (\(f_o\)).

Table 2. F-score values of different frameworks on real networks with 3, 5 and 10 epidemic sources. GCNSI, LPSI and NetSleuth results extracted from [3].

Note that results are generally poor across all table entries, since the reported F-score values are much closer to zero than to one, indicating that MSD is a rather difficult problem. However, results in general improve with an increasing number of epidemic sources. Note that SAGE stands out, showing the best performance across the table for all scenarios, for both maximum and average performance. Also, the gap between SAGE and the other models increases with the number of sources: SAGE’s F-score on Facebook Ego with 10 sources is five times larger than that of the second-best model.

6 Conclusion

Identifying multiple epidemic source nodes from the observation of an epidemic network is a challenging problem that has recently been explored in the literature. This work proposed a framework based on graph neural networks that starts by generating features for nodes from the observed epidemic network. These features capture the infection state of the k-hop neighborhoods around a node. Finally, a supervised SAGE model is used to rank the nodes of a (never before seen) epidemic network with respect to their probabilities of being an epidemic source for that observation. The proposed model is generalizable and can be applied to any network, any SI contagion model, and any number of epidemic sources.

The performance of the proposed framework was evaluated on two random network models and two real networks, along with two evaluation scenarios (source nodes and source nodes with their neighbors). Results highlight the importance of the network structure on the framework’s performance. Besides, results indicate that performance improves when the number of sources is larger, and decreases as the fraction of infected nodes increases in the observation. Results also indicate that a single model does not perform well when considering various scenarios (with the exception of low infection rates and random network models). Finally, the proposed framework significantly outperformed recent prior works in all scenarios where a direct comparison was possible.

In general, the reported values for the F-score and other metrics (precision and recall) are low both here and in prior works, but this is an indication that detecting multiple epidemic sources is a challenging problem. Future works can explore node features and classification models in search of more promising approaches.