key: cord-0172059-u0ve61kj
authors: Ódor, Gergely; Vuckovic, Jana; Ndoye, Miguel-Angel Sanchez; Thiran, Patrick
title: Source Detection via Contact Tracing in the Presence of Asymptomatic Patients
date: 2021-12-29
journal: nan
DOI: nan
sha: 1a08e4d9aa52a3cd97bab4954452a76fbeffd985
doc_id: 172059
cord_uid: u0ve61kj

Inferring the source of a diffusion in a large network of agents is a difficult but feasible task, if a few agents act as sensors revealing the time at which they got hit by the diffusion. A main limitation of current source detection algorithms is that they assume full knowledge of the contact network, which is rarely the case, especially for epidemics, where the source is called patient zero. Inspired by recent contact tracing algorithms, we propose a new framework, which we call Source Detection via Contact Tracing Framework (SDCTF). In the SDCTF, the source detection task starts at the time of the first hospitalization, and initially we have no knowledge about the contact network other than the identity of the first hospitalized agent. We may then explore the network by contact queries, and obtain symptom onset times by test queries in an adaptive way. We also assume that some of the agents may be asymptomatic, and therefore cannot reveal their symptom onset time. Our goal is to find patient zero with as few contact and test queries as possible. We propose two local search algorithms for the SDCTF: the LS algorithm is more data-efficient, but can fail to find the true source if many asymptomatic agents are present, whereas the LS+ algorithm is more robust to asymptomatic agents. By simulations we show that both LS and LS+ outperform state-of-the-art adaptive and non-adaptive source detection algorithms adapted to the SDCTF, even though these baseline algorithms have full access to the contact network. Extending the theory of random exponential trees, we analytically approximate the probability of success of the LS/LS+ algorithms, and we show that our analytic results match the simulations. Finally, we benchmark our algorithms on the Data-driven COVID-19 Simulator developed by Lorch et al., which is the first time source detection algorithms are tested on such a complex dataset.

During the COVID-19 pandemic, we have seen a revolution in contact tracing technology, which helped track and contain the epidemic [6, 19]. Some contact tracing programs were conducted by governmental/health agencies [30], while others relied on decentralized approaches [41]. Most contact tracing approaches work by notifying people who could have received the infection from known infectious patients, i.e., they trace "forward" in time. However, some advocate that "bidirectional" tracing, where the past history of the infection is also tracked, can be more effective [5, 9, 18]. In this paper we focus on the "backward" direction of the problem: the task of identifying the first patient who carried the disease, also called patient zero, or the source of the epidemic. The identification of patient zero can either be limited to a smaller population cluster, in which case it can be a first step towards "bidirectional" tracing, or it can be more ambitious: finding the first patient who developed the mutation of a certain disease can help us understand how the mutation occurred, which in turn can help us prevent, or better prepare for, future epidemics.

Authors' addresses: Gergely Ódor, gergely.odor@epfl.ch, EPFL, Lausanne, Switzerland; Jana Vuckovic, jana.vuckovic@epfl.ch, EPFL, Lausanne, Switzerland; Miguel-Angel Sanchez Ndoye, miguel-angel.sanchezndoye@epfl.ch, EPFL, Lausanne, Switzerland; Patrick Thiran, patrick.thiran@epfl.ch, EPFL, Lausanne, Switzerland.

Surprisingly, given the importance of the problem and the relatively large literature on the topic, we are not aware of any instance where source detection algorithms have been applied in real situations, including during the COVID-19 pandemic. Our goal in this paper is to examine the applicability of the source detection models in the literature (which we call frameworks from now on), and then propose a new framework, which improves on them in several aspects. Originally, source detection was introduced in the context of rumor spreading instead of epidemics by Zaman and Shah in their pioneering Sigmetrics paper [32, 33]. Translating to the language of epidemics for clarity, in the framework of [33], an epidemic spreads over a network of agents that is completely known to us, and we observe a snapshot of the network, which means that every agent reveals whether they are infected or not at some given time (not too early, because then the problem is trivial, nor too late, because then the problem is impossible). Shortly after [33], Pinto et al. proposed a different framework, in which agents reveal, in addition to their state, the time when they became infected, but only a few agents, called sensors, do so [31]; indeed, the problem is trivial if all agents are sensors. This framework is better tailored to epidemics, as it is reasonable to assume that obtaining any information from all the agents is much harder than asking only some of them one more question about the starting time of their symptoms. Pinto et al. found that in their framework, if the sensors are already selected, the maximum likelihood estimator of the source has a closed-form solution when the underlying network is a tree, and the time it takes for an agent to infect one of its susceptible contacts follows a Gaussian distribution.
For general graphs, it is difficult to find an algorithm with any theoretical guarantees. Inspired by implementations of contact tracing algorithms, we propose a new framework for source detection, which we call Source Detection via Contact Tracing Framework (SDCTF). In the SDCTF, algorithms can make two types of queries: contact queries, which can be used to explore the network, and sensor (test) queries, after which agents reveal their symptom onset time as before. The goal of the algorithm is to find the source as accurately as possible, while minimizing the number of contact and sensor queries. The SDCTF is a way to formalize the source detection task; it determines the goal of the algorithm and how information can be gained about the epidemic, but it does not specify the underlying epidemic and mobility data models (simulated or real). In this paper, we analyse different algorithms in the SDCTF with various epidemic and mobility models.

Besides specifying the possible queries that algorithms can make, the SDCTF also determines the way the outbreak is detected, which marks the starting time of the source detection task. In sensor-based source detection, the source detection task often starts long after the outbreak, when essentially all agents in the network are infected [31], which can be seen as a limitation of source detection frameworks. The SDCTF is also closely related to contact tracing frameworks, where it is standard to assign a probability that each node spontaneously self-reports after developing symptoms, which triggers the activation of contact tracing algorithms [5, 19]. In the SDCTF, we adopt the idea of self-reporting with a slight modification. We believe that the most interesting time to perform the source detection task is when a new disease (or a new mutation of the disease) appears, and therefore we tie these self-reporting events to hospitalizations, where infections are properly diagnosed by healthcare professionals. In particular, this means that the SDCTF can only be applied to epidemic data (and models) where hospitalizations are well-defined. In this paper, we use the datasets generated by the Data-driven COVID Simulator (DCS) introduced in [23], which is, to our knowledge, one of the most realistic toolboxes for generating datasets that model COVID-19 (notably, hospitalizations are part of the model). We also propose synthetic approximations for the epidemic and mobility models in the DCS: the Deterministically Developing Epidemic model and the Household Network Model, which improve the interpretability of our results since they have fewer parameters.

We propose a simple algorithm called LocalSearch (LS), which adaptively traces back the transmission path from the first hospitalized patient to the source. The LS algorithm is quite efficient at finding the source; the number of contact and sensor queries that it uses does not depend on the size of the network, but only on the local neighborhood of the source. Moreover, the LS algorithm provably finds the source with 100% accuracy, because of our assumption that every contact and sensor query is answered without noise. However, it is well-known that data availability is a major issue in contact tracing [3], either because the agents do not comply with contact tracing efforts, or possibly (and in particular in the current COVID-19 epidemic) because they do not develop symptoms, and are unaware that they have the disease. In this paper, we model the effect of asymptomatic agents. When queried and tested, these agents do not reveal their time of infection, only whether they have or had the disease at some point. We show that the accuracy of the LS algorithm drops in the presence of asymptomatic agents, because the algorithm can get stuck while tracing back the transmission path from the first hospitalized patient to the source. Therefore, we propose an improved version of LS called LS+, which accounts for the presence of asymptomatic agents by placing more sensors. We are not aware of any previous work in the source detection literature that models the effect of asymptomatic patients, but the resulting model can be seen as a mix between the snapshot and the sensor-based models. We mention that non-complying agents or agents who provide noisy observations have been studied by [1, 12, 24]. Non-complying agents could also be included in our framework by treating them as asymptomatic agents (even though in this case we have no information about whether the agent had the disease or not), without jeopardizing the correctness of our algorithms.
Figure 1 (caption, partial). Orange (resp., red; black) nodes mark symptomatic non-hospitalized (resp., asymptomatic; symptomatic hospitalized) nodes. Panels (d)-(f) show the LS source detection algorithm introduced in Section 2.2, which succeeds in this example because there are no asymptomatic nodes on the transmission path between the first hospitalized node and the source. Black edges show the queried edges, and a black stroke marks nodes already discovered by the algorithm. A node with a black X marks a negative test result, and a red-stroked node marks the node currently maintained as source candidate by the LS algorithm.

We benchmark the LS and LS+ algorithms in both our data-driven and our synthetic epidemic and mobility models, and we compare them to state-of-the-art adaptive [37] and non-adaptive [16, 22] algorithms tailored to the SDCTF, whenever possible. We find that both LS and LS+ outperform these baseline algorithms in accuracy (probability of finding the correct source).

While the LS/LS+ are designed to be simple algorithms, their theoretical analysis is quite challenging. Nevertheless, we are able to provide rigorous results about the success probability of both algorithms after a series of simplifications to the epidemic and mobility models, by extending some recent results on the theory of exponential random trees [11, 25] , which have previously not been connected to the source detection literature. We present these theoretical results in Section 4, after formally introducing the SDCTF, our models and the LS/LS+ algorithms in Section 3. By simulations, we show that our analytic results approximate the accuracy of the algorithms well, even in the most realistic setting in Section 5. Our analytic results provide additional insight into how the parameters of the epidemic and mobility models affect the performance of the algorithms. We discuss these insights along with some non-rigorous computations that mirror our main proof ideas in Section 2. Reading Section 2 before Sections 3-5 is useful to build intuition, but is not necessary to understand the paper.

Let us consider a time-dependent network model, where each agent meets $d$ new agents each day in such a way that the contact network is an infinite tree (ignoring the label of the edges giving the propagation time along the edge). This network models homogeneous mixing in a very large population; we consider more realistic network models in Section 3. On this network, we consider an epidemic model that starts at time $t = 0$ with one infected agent, and then progresses as infected agents infect their susceptible contacts, each independently with probability $p_i$ each day. Since our goal is to study the epidemic process, it is sufficient to track only the agents who are already infectious (also called internal nodes), and the agents who are in contact with infectious agents at time $t$ (also called external nodes), as shown in Figure 1 (a)-(c). For $d = 1$, the spread of the infection is then equivalent to the growth of a random tree $\mathcal{T}_t$ rooted at the source of the infection, known under the name of Random Exponential Recursive Tree (RERT) and recently introduced in [25]. Because of the similarities of the models, we refer to the model with general $d$ as RERT in the remainder of this section. We point out that the standard literature on elementary branching processes such as Galton-Watson trees or random recursive trees [8] is not applicable in our scenario, because these branching processes have no notion of global time (i.e., a node in such processes becomes infectious immediately after receiving the infection), whereas patients commonly go through an exposed, non-infectious period before becoming infectious, which is well captured by the RERT model. We mention that there is literature on more advanced branching processes that do have a notion of global time, e.g., Crump-Mode-Jagers trees [15]; however, we opt for the RERT because of its simple definition.
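The growth process described above can be simulated in a few lines. The following is a minimal sketch, assuming that every infected agent meets exactly `d` fresh contacts per day and infects each of them independently with probability `p_i`; the function name `grow_profile` and its arguments are illustrative choices, not notation from the paper.

```python
import random


def grow_profile(d, p_i, days, rng=None):
    """Simulate the infection tree and return its profile.

    profile[l] is the number of infected (internal) nodes at distance l
    from the source after `days` daily steps.
    """
    rng = rng or random.Random()
    profile = [1]  # day 0: only the source is infected
    for _ in range(days):
        new_infections = [0] * (len(profile) + 1)
        for level, count in enumerate(profile):
            # Every infected node at this level meets d new agents today;
            # each of these contacts is infected independently w.p. p_i.
            for _ in range(count * d):
                if rng.random() < p_i:
                    new_infections[level + 1] += 1
        profile = [old + new for old, new in zip(profile + [0], new_infections)]
        if profile[-1] == 0:
            profile.pop()  # nobody reached the deepest level yet
    return profile
```

With `p_i = 1` the process is deterministic, which gives a quick sanity check: for `d = 1` the profile after three days is the row `[1, 3, 3, 1]` of Pascal's triangle, and the tree has 2^3 = 8 nodes.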

After a node (patient) becomes infected, the disease can take three courses (which for now do not affect $\mathcal{T}_t$): with probability $p_a$ the patient is asymptomatic, with probability $(1 - p_a) p_h$ the patient is hospitalized, and with probability $(1 - p_a)(1 - p_h)$ the patient recovers without hospitalization. The governmental/health agency learns about the outbreak when the first hospitalization occurs (see Figure 1 (c)) and starts the source detection process right away. It can inquire about the contacts of each agent and it can test the agents. From patients that were symptomatic (at any point in time in the past), the agency learns about their symptom onset time (which, in this simple model, is always one day after the infection time), but from asymptomatic patients it only learns that they had (or have) the disease at some point when they are tested. The framework introduced in this paragraph (including both the detection of the outbreak through the first hospitalization, and the possible actions the agency can take) is a simplified version of the SDCTF (Source Detection via Contact Tracing Framework), introduced in Section 3.3.
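The three disease courses can be written down directly. This is a small sketch under the assumption that `p_a` denotes the probability of an asymptomatic course and `p_h` the probability of hospitalization given a symptomatic course; the function and label names are hypothetical.

```python
import random


def course_probabilities(p_a, p_h):
    """Probabilities of the three disease courses of the simplified model."""
    return {
        "asymptomatic": p_a,
        "hospitalized": (1 - p_a) * p_h,
        "recovered_without_hospitalization": (1 - p_a) * (1 - p_h),
    }


def sample_course(p_a, p_h, rng=None):
    """Draw one disease course for a newly infected patient."""
    rng = rng or random.Random()
    if rng.random() < p_a:
        return "asymptomatic"
    # The patient is symptomatic; decide whether they are hospitalized.
    if rng.random() < p_h:
        return "hospitalized"
    return "recovered_without_hospitalization"
```

The three probabilities sum to one, and each infected node is hospitalized with probability `(1 - p_a) * p_h` independently of the others.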

The network and epidemic models introduced in this section have four parameters: $d$, $p_i$, $p_a$, $p_h$, and it is important to understand how each of them affects the difficulty of source detection in the SDCTF. We distinguish two important factors. First, if the outbreak is not detected rapidly enough, the transmission path to the first hospitalized agent is long, and source detection then becomes difficult, because a lot of information needs to be recovered. Therefore, a low $p_i$, a low $p_h$ and/or a high $p_a$ can hinder source detection (recall that the probability of hospitalization is $p_h (1 - p_a)$). The second factor is related to the difficulty of recovering information about the transmission path. If $p_a$ is high, then there are a lot of nodes that are asymptomatic and therefore do not reveal their symptom onset time, making source detection very difficult. Since $p_a$ affects both the length of the transmission path and the amount of collected information, it is safe to expect that, of all parameters, $p_a$ has the largest effect on the difficulty of source detection. The parameter $d$ is interesting, because a large $d$ can reduce the length of the transmission path, but it also makes the information about the transmission path less accessible as more agents need to be tested. Since in this paper we do not set a hard constraint on the total number of available tests, the advantage of a shorter path outweighs the drawback of additional tests, and a large $d$ increases the success probability.

To say anything quantitative about source detection in the SDCTF, we must discuss specific algorithms that solve the source detection task. In this paper we propose a simple algorithm called LocalSearch (LS), shown in Figure 1 (d)-(f). The LS algorithm maintains one candidate node at each iteration (initially, the first hospitalized node), which is always symptomatic, and it updates it in a greedy way: all edges incident to the candidate at the time of its infection are queried, and all its neighbors are tested. Then the agent with the lowest reported infection time becomes the new candidate. The algorithm stops when the candidate does not change between two consecutive iterations. For simplicity, we assume that the infection does not spread any further during these iterations; however, this assumption does not affect whether the algorithm finds the source. Indeed, it is not difficult to see that on tree networks, LS succeeds if and only if there are no asymptomatic nodes on the transmission path from the source to the first hospitalized agent. This observation leads us to enhance the LS algorithm by also searching within the neighbors of asymptomatic nodes; we explore this idea in the LS+ algorithm introduced in Section 3.5. We are not aware of this simple greedy algorithm being studied in the context of source detection, although similar ideas were implemented for non-adaptive source detection to lower the runtime of the algorithms [28].
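The greedy update can be sketched as follows. This is an illustrative reading of the description above on a static tree, not the authors' implementation: `neighbors` and `onset_time` are hypothetical dictionaries, and an asymptomatic or uninfected contact is represented by a missing or `None` onset time.

```python
def local_search(first_hospitalized, neighbors, onset_time):
    """Greedy LS sketch.

    neighbors:  dict mapping a node to the list of its contacts.
    onset_time: dict mapping a node to its reported time, or None when the
                node is asymptomatic or tests negative.
    Returns (final candidate, number of edges queried, number of nodes tested).
    """
    candidate = first_hospitalized
    edges_queried = 0
    tested = set()
    while True:
        best = candidate
        for contact in neighbors[candidate]:  # contact query
            edges_queried += 1
            tested.add(contact)  # test query
            reported = onset_time.get(contact)
            if reported is not None and reported < onset_time[best]:
                best = contact
        if best == candidate:
            return candidate, edges_queried, len(tested)
        candidate = best


# Toy transmission tree: 0 -> 1 -> 2 -> 3, plus a side branch 1 -> 4.
toy_neighbors = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}
toy_onset = {0: 1, 1: 2, 2: 3, 3: 4, 4: 3}
```

In the toy tree, starting from the hospitalized node 3 the candidate moves along 3, 2, 1, 0 and stops at the source. If node 1 is made asymptomatic (its onset time set to `None`), the search gets stuck at node 2, which illustrates the failure mode on the transmission path.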

Now, we have all the tools to estimate the probability of success of the LS algorithm. First we condition on the course of the disease in the source. With probability $p_a$, the source is asymptomatic and LS can never succeed. With probability $(1 - p_a) p_h$, the source itself becomes hospitalized, and LS always succeeds. Finally, with probability $(1 - p_a)(1 - p_h)$ the source is symptomatic but not hospitalized, which we call event $A$. If event $A$ happens, then LS may or may not succeed depending on whether there are any asymptomatic nodes on the transmission path. More precisely, conditioned on event $A$ and on the transmission path having length $l$, the probability of success is $(1 - p_a)^{l-1}$ (since there are $l - 1$ nodes on the path which can be asymptomatic), which implies

$$P(\text{LS succeeds}) = (1 - p_a) p_h + (1 - p_a)(1 - p_h) \sum_{l \ge 1} P(\text{transmission path has length } l \mid A)\,(1 - p_a)^{l-1}. \tag{1}$$

The difficult part is to compute the distribution of the transmission path conditioned on event $A$; indeed we already saw that all four parameters $d$, $p_i$, $p_a$, $p_h$ affect this distribution in a non-trivial way. Let us perform a back-of-the-envelope computation to get more insight into the effect of these parameters. The exact structure of the infection tree will not matter for this computation, only its profile does. It is denoted by $\mathcal{T}_t(l)$ and defined as the number of (internal) nodes of $\mathcal{T}_t$ at level $l$ (i.e., at distance $l$ from the source of the infection). Remember that by definition the RERT has $d \cdot \mathcal{T}_{t-1}(l-1)$ external nodes on level $l$, and that at time $t$ each external node is promoted to be internal with probability $p_i$ to form $\mathcal{T}_t$. Consequently, the level of a node $h$ added at time $t > 0$ is distributed (conditioned on the tree $\mathcal{T}_{t-1}$ at the previous step) proportionally to the profile $\mathcal{T}_{t-1}(l-1)$ (the number of internal nodes one level higher), that is,

$$P(\text{level of } h = l \mid \mathcal{T}_{t-1}) = \frac{\mathcal{T}_{t-1}(l-1)}{|\mathcal{T}_{t-1}|}. \tag{2}$$

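Working on the RERT directly can be a daunting task, therefore we propose to approximate the numerator and the denominator of equation (2) by $E[\mathcal{T}_{t-1}(l-1)]$ and $E[|\mathcal{T}_{t-1}|]$, respectively. It can be shown by a simple inductive argument, or by generating functions as in [25], that for RERTs we have $E[\mathcal{T}_t(l)] = \binom{t}{l} (d p_i)^l$ and $E[|\mathcal{T}_t|] = (1 + d p_i)^t$, which suggests a binomial distribution for the level of $h$. And indeed, we can approximate the distribution of the level of a node $h$ added at time $t$ as

$$P(\text{level of } h = l) \approx \frac{\binom{t-1}{l-1} (d p_i)^{l-1}}{(1 + d p_i)^{t-1}} = P\left(\mathrm{Bin}\left(t - 1, \frac{d p_i}{1 + d p_i}\right) = l - 1\right).$$

One of the main challenges of this calculation is that we do not know the day $t$ of the first hospitalization conditioned on event $A$; we only know that each node is hospitalized with probability $(1 - p_a) p_h$, which means that the index of the first hospitalized node follows a geometric distribution with mean $1/((1 - p_a) p_h)$. We approximate $t - 1$ by the first time that the expected size of the infection tree (excluding the source since we condition on event $A$) exceeds the expected index of the first hospitalized node. Therefore we solve

$$(1 + d p_i)^{t-1} - 1 = \frac{1}{(1 - p_a) p_h}$$

for $t$ (relaxing the constraint that $t$ is an integer), which gives

$$t - 1 = \frac{\log\left(1 + \frac{1}{(1 - p_a) p_h}\right)}{\log(1 + d p_i)}.$$

Consequently, we approximate $P(\text{transmission path has length } l \mid A)$ by $P\left(\mathrm{Bin}\left(t - 1, \frac{d p_i}{1 + d p_i}\right) = l - 1\right)$. Continuing equation (1), and using the well-known expression of the probability generating function of the binomial distribution, we get

$$P(\text{LS succeeds}) \approx (1 - p_a) p_h + (1 - p_a)(1 - p_h) \left(1 - \frac{p_a\, d p_i}{1 + d p_i}\right)^{t-1}, \qquad t - 1 = \frac{\log\left(1 + \frac{1}{(1 - p_a) p_h}\right)}{\log(1 + d p_i)}.$$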
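To see how the four parameters interact, the back-of-the-envelope estimate can be evaluated numerically. The sketch below assumes the following reading of the derivation: the path length minus one is approximated by a binomial variable with `t - 1` trials and success probability `d*p_i / (1 + d*p_i)`, and `t - 1` solves `(1 + d*p_i)**(t - 1) - 1 = 1 / ((1 - p_a)*p_h)`. The function name and parameter names are illustrative.

```python
import math


def approx_ls_success(d, p_i, p_a, p_h):
    """Heuristic success probability of LS (requires d*p_i > 0 and (1-p_a)*p_h > 0)."""
    p_hosp = (1 - p_a) * p_h  # probability that a given node is hospitalized
    growth = 1 + d * p_i  # expected daily growth factor of the tree
    # Day on which the expected tree size passes the expected index
    # of the first hospitalized node (real-valued relaxation).
    t_minus_1 = math.log(1 + 1 / p_hosp) / math.log(growth)
    q = d * p_i / growth  # binomial parameter of the path length
    # Binomial generating function E[s^X] = (1 - q + q*s)^n at s = 1 - p_a.
    no_asymptomatic_on_path = (1 - q * p_a) ** t_minus_1
    return p_hosp + (1 - p_a) * (1 - p_h) * no_asymptomatic_on_path
```

Under these assumptions the estimate equals 1 when `p_a = 0`, decreases as `p_a` grows, and increases with `d`, in line with the qualitative discussion of the parameters.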

One can check that this expression agrees with our qualitative intuition. However, it is not at all clear whether it is valid, because of the strong approximations made in some steps of the above computation. In Section 4, we prove a rigorous upper bound on the success probability, and we also provide much more careful approximations by proving exact theorems about the simplified models that we use. Then, in Section 5 we compare our results with simulation results on synthetic data, as well as with data generated by the DCS model.

We call DCS the model implemented by [23]. The DCS model is fairly complex, and we only give a brief overview.

Each agent in the agent set can be in one of 8 states: susceptible, exposed, asymptomatic infectious, pre-symptomatic infectious, symptomatic infectious, hospitalized, recovered or dead.

Transitions between different states are characterized by counting processes described by stochastic differential equations with jumps. The most important, and also most complicated of these counting processes is the exposure counting process ( ), which is modeled by a Hawkes process for each agent . Hawkes processes are point processes with a time-dependent, self-exciting conditional intensity function * ( ). * ( ) = ∑︁

where the kernel , ( ) indicates whether has been at time at the same site where is at time , and whether is in the infectious state. Parameters and are the decay of infectiousness at sites and the non-contact contamination window, respectively, and they account for the fact that can infect even if they are never at the same site, as can leave some pathogens behind (airborne for instance). Parameter is the transmission rate for symptomatic and asymptomatic individuals, and it comes in two versions: accounts for infections outside the household and ℎ accounts for infection in the household. Parameters and ℎ are fitted to the COVID-19 infection data of Tubingen from 12/03/2020 to 03/05/2020 using Bayesian Optimization. The model also has a parameter for the relative asymptomatic transmission rate built into the function , ( ), which scales down the infectiousness of asymptomatic agents (to 55% of the infectiousness of symptomatic agents by default).

Once a susceptible agent becomes infected, the disease can take three possible courses (see Figure 2 (a)). With probability , the agent becomes asymptomatic infectious after time , and then recovers after time . With probability 1 − , the agent becomes pre-symptomatic infectious after time , next symptomatic infectious after time , and then recovers with probability 1 − ℎ after time − , or becomes hospitalized with probability ℎ after time . Agents in the DCS are also assigned age values based on demographic data, and the hospitalization probability ℎ of each agent is determined based on its age (following COVID-19 infection data). The times , , and are drawn from an appropriately parametrized (using values from the COVID-19 literature) lognormal distribution as shown in Table 3 .

3.1.2 The DDE Model. We start from the DCS model [23], which we simplify to enable its theoretical analysis. In the Deterministically Developing Epidemic (DDE) model, continuous time (used in DCS) is replaced by discrete time-steps: we refer to one time-step in the DDE as one day. Instead of modelling the infection propagation as a Hawkes process, an infectious agent (symptomatic or asymptomatic) infects each of its susceptible neighbors with probability $p_i$ each day. Thereafter, the disease progresses the same way as in the DCS, except that in the DDE model the transition times are deterministic, while the infection events and the severity of the disease (i.e., the (a)symptomatic and hospitalized states) are still determined randomly. We also have a single parameter $p_h$ for the hospitalization probability (agents in this model do not have an age parameter). We discuss how we set the parameters of the DDE model in Section 3.4.

We briefly review the mobility model introduced in [23], and illustrated in Figure 2 (b). The population is partitioned into households of possibly varying size (usually between 1 and 5). The households are assigned a location, and we also place some external sites (shops, offices, schools, transport stations, recreational sites) on the map, which the agents may visit. The location of the households and the number of agents in them are sampled randomly based on demographic datasets. Initially, each agent is assigned a few favorite sites (randomly based on distance), and will only visit these throughout the simulation. Each agent decides to leave home after some exponentially distributed time, visits one of its (randomly chosen) favorite sites, and comes back home after another (usually much shorter) exponentially distributed time. If two agents visit the same site at the same time, or within some time window of each other, we record them as a contact, which gives an opportunity for the infection to propagate. We denote the Tübingen mobility model as TU, and the DCS epidemic model that runs on the TU mobility model as DCS+TU.

Model. The Household Network Model (HNM) was inspired by [23]; however, we note that similar models have been studied in the theoretical community by [2]. As in the Tübingen mobility model, in the HNM nodes are assigned to households, but of constant size $d_h + 1$. Every pair of nodes in the same household is connected by an edge, so that each household forms a clique of size $d_h + 1$. Additionally, each node is assigned $d_c$ half-edges, which are paired uniformly at random with other half-edges in the beginning. Some half-edge pairings can result in self-loops or multi-edges, which are discarded. This construction defines a random graph generated by a configuration model, which shares a lot of similarities with Random Regular Graphs (RRG) [42]. In fact, if we join nodes in the same household into a single node in the HNM (which we refer to as the network of households of the HNM), then the resulting graph is equivalent to the pairing model of RRGs with degree $d_c (d_h + 1)$. It is well-known that in the pairing model of RRGs of degree $k$, the local neighborhood (of constant radius, as the number of nodes tends to infinity) of a uniformly randomly chosen vertex is a $k$-regular tree (with probability tending to 1), which implies that locally there are asymptotically almost surely no self-loops, multi-edges or any cycles in the graph. This result has various names; in random graph theory the result is usually proved by subgraph counting [42], in probability theory it is the basis of branching process approximations [2], and in graph limit theory it is called the local convergence to the infinite $k$-regular tree [4]. In our theoretical analysis, this result motivates the approximation of the neighborhood of the source in the network of households of the HNM by an infinite $d_c (d_h + 1)$-regular tree. The HNM itself is then approximated by replacing each (household) node of the infinite $d_c (d_h + 1)$-regular tree of households by a $(d_h + 1)$-clique, and by setting the edges so that each (individual) node has degree exactly $d_c + d_h$, while keeping the connection between cliques unchanged (see Figure 2 (c) for a visualization).
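The construction of the household network can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: `d_h` is the number of other household members, `d_c` the number of half-edges per node, and a pairing is discarded when it would create a self-loop or duplicate an existing edge (including a household edge). If the total number of half-edges is odd, the last one is simply left unpaired.

```python
import random


def household_network(num_households, d_h, d_c, rng=None):
    """Return (adjacency dict, household id per node) of an HNM-like graph."""
    rng = rng or random.Random()
    size = d_h + 1
    n = num_households * size
    adj = {v: set() for v in range(n)}
    household = {v: v // size for v in range(n)}

    # Each household is a clique of size d_h + 1.
    for h in range(num_households):
        members = range(h * size, (h + 1) * size)
        for u in members:
            for v in members:
                if u != v:
                    adj[u].add(v)

    # Configuration-model step: d_c half-edges per node, paired at random.
    half_edges = [v for v in range(n) for _ in range(d_c)]
    rng.shuffle(half_edges)
    for u, v in zip(half_edges[0::2], half_edges[1::2]):
        if u != v and v not in adj[u]:  # discard self-loops and multi-edges
            adj[u].add(v)
            adj[v].add(u)
    return adj, household
```

Because discarded pairings are not redrawn, node degrees lie between `d_h` and `d_h + d_c` rather than being exactly `d_h + d_c`; on large networks with small `d_c` only a few pairings are expected to be discarded.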

Since the HNM is a time-independent graph, we adopt the standard notations from graph theory. Formally, the HNM is given by the graph $G = (V, E)$, with node set $V$ and edge set $E$. Let us denote by $H(v)$ the set of nodes that are in the same household as node $v$. The distance between two nodes $u, v \in V$ (denoted by $d(u, v)$) is defined as the number of edges of the shortest path between $u$ and $v$. We denote the DDE epidemic model that runs on the HNM network as DDE+HNM.

We present the Source Detection via Contact Tracing Framework (SDCTF), which can be applied to both epidemic and mobility models presented so far. The framework determines how the government/health agency, which conducts the source detection task, learns about the outbreak, and how it can gather further information to locate the source. In the SDCTF, as in Section 2.1, the agency learns about the outbreak when the first hospitalization occurs, and it also learns the identity of nodes when they become hospitalized (including the identity of the first hospitalized node).

After the outbreak is detected, the agency can make three types of queries. The first type of query, the household query with parameter $v$, reveals the agents that live in the same household as $v$. The household query works the same way in both the TU and the HNM models, and we do not limit the number of times it can be called (these queries are considered cheap in the SDCTF). The second type of query, the contact query, works differently in the TU and the HNM models. For the TU model, a contact query has two parameters: an agent $v$ and a time window $[t_1, t_2]$. As a result, all agents that have been in contact with $v$ (and therefore could have infected $v$ or could have been infected by $v$) at an external site between $t_1$ and $t_2$ are revealed. In the HNM, no time window is needed for the contact query (which we also call edge query), and all neighbors of $v$ in graph $G$ are revealed. Contact (and edge) queries are considered expensive in the SDCTF. While in this paper we do not limit the number of available queries, we track the number of contacts and edges that are revealed as the algorithm runs. Note that in the TU model if two agents $v_1$ and $v_2$ have been in contact during the time window $[t_1, t_2]$ and also during a different time window $[t_3, t_4]$, then those are counted as separate contacts, whereas in the HNM an edge between $v_1$ and $v_2$ is only counted once. Although contact queries are considered expensive, both household and contact queries are answered instantly in the SDCTF.
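The household and contact queries for a TU-style contact log can be mimicked with a small oracle object. This is a hypothetical sketch: the class name `QueryOracle`, the representation of contacts as `(agent_a, agent_b, time)` tuples, and the counter `contacts_revealed` are assumptions made for illustration only.

```python
class QueryOracle:
    """Answers household and contact queries and counts revealed contacts."""

    def __init__(self, household_of, contacts):
        # household_of: dict agent -> household id
        # contacts: list of (agent_a, agent_b, time) events at external sites
        self.household_of = household_of
        self.contacts = contacts
        self.contacts_revealed = 0  # cost tracked for the expensive queries

    def household_query(self, v):
        """Cheap query: the other agents living in the household of v."""
        home = self.household_of[v]
        return {u for u, h in self.household_of.items() if h == home and u != v}

    def contact_query(self, v, t1, t2):
        """Expensive query: agents in contact with v during [t1, t2]."""
        revealed = set()
        for a, b, t in self.contacts:
            if t1 <= t <= t2 and v in (a, b):
                self.contacts_revealed += 1  # repeated meetings count separately
                revealed.add(b if a == v else a)
        return revealed
```

For a static graph such as the HNM, the analogous edge query would simply return the neighbor set of `v` and count each edge once.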

The third kind of query is the test query with parameter $v$, which reveals information about the course of the disease in the queried agent $v$ (see Figure 2 (a)). Symptomatic patients reveal the time of their symptom onset (which exactly determines their time of infection in the DDE due to the deterministic transition times) if they are past the pre-symptomatic state (i.e., if they are either infectious or recovered). Asymptomatic and pre-symptomatic patients do not reveal any information about their infection time; they just reveal that they have the disease or had the disease at some point and have recovered. For all algorithms we assume that asymptomatic patients do not reveal whether they have the infection at the time they are queried. Finally, agents who have not been exposed, or are still in their exposed state, give a negative test result. Test queries are again considered expensive in the SDCTF; we even limit the population that can be tested on any given day to at most 1% of the total population, due to the capacity of testing facilities. However, since in this paper we do not limit the number of days that the algorithm can use to locate the source, the limit on the number of tests does not play an important role. As opposed to household and contact queries (and the model in Section 2.1), test results are only answered the next day in the SDCTF, which means that the algorithms must operate in "real-time", while the epidemic keeps propagating.
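The outcome of a test query can be summarized as a small decision function. This is a sketch of the rules stated above, with two labeled assumptions: hospitalized agents are treated like other symptomatic agents past the pre-symptomatic state, and any state not covered by the text (for instance dead) falls back to a positive result without an onset time. The state strings and the function name are illustrative.

```python
def answer_test_query(state, symptomatic_course, onset_time):
    """Return (result, revealed onset time or None) for a tested agent.

    state:              current state of the agent (string label).
    symptomatic_course: True if the agent follows the symptomatic course.
    onset_time:         symptom onset time, if the agent has one.
    """
    if state in ("susceptible", "exposed"):
        return ("negative", None)
    past_presymptomatic = state in (
        "symptomatic infectious",
        "hospitalized",  # assumption: handled like symptomatic infectious
        "recovered",
    )
    if symptomatic_course and past_presymptomatic:
        return ("positive", onset_time)
    # Asymptomatic or pre-symptomatic agents: positive, but no timing.
    return ("positive", None)
```

In the full framework the answer would only become available on the following day, which this function does not model.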

The DCS+TU model has many parameters, most of which are fitted to COVID-19 datasets of Tübingen from 12/03/2020 to 03/05/2020 by [23] (we show the most relevant parameters in Table 3 ). We determined the parameters of the DDE+HNM model so that they fit the parameters of the DCS+TU as closely as possible (see the precise values in Table 3 ). We determine the values of , , in the DDE+HNM by rounding the expected value of the corresponding distribution in the DCS+TU to the nearest integer. Since is simply a constant in both models, we keep the same numerical value in the DDE+HNM. The parameter ℎ is more complicated, because in the DCS+TU model there is a different hospitalization probability for each age group. We take the average hospitalization probability across the population to be ℎ . The most complicated parameter to fit is , because in the DCS+TU model, infections are modelled by a Hawkes process, which depends on many parameters, including whether the infectious agent is symptomatic or asymptomatic, the length of the visit, the site where the infection happens, etc. (see equation (4)). We empirically observe the probability of infection in every contact in several simulations, and we find that an agent has on average 15 contacts outside the household each day, and that the average probability of infection during such a contact is around 0.02. However, since we use smaller networks for the DDE+HNM ( = 400 or 1000, because running the baselines on larger networks is not feasible) than the DCS ( = 9054), setting to be as high as 15 would violate the assumption that the network of households of the HNM can be locally approximated by a tree (see Section 3.2.2). Therefore we chose = 3 for the HNM and we scale so that (the expected number of external infections caused by a single agent each day) is the same in the DCS+TU and the DDE+HNM models. Finally, we choose ℎ in the DDE+HNM by rounding the average household connections in the DCS+TU. 
Note that the average number of household connections is not the same as the average number of household members, because the number of connections grows quadratically in the size of the households, and thus fitting to the number of connections results in a higher (due to the Quadratic Mean-Arithmetic Mean inequality).

Finding the default values for the parameters is useful to create a realistic model. However, we are also interested in the effect of each of the parameters on the performance of our algorithms. Therefore, in the DDE+HNM, we vary the parameters , ℎ , , ℎ and , while keeping the other ones unchanged. For the DCS+TU model, we also keep the mobility model fixed and we focus on varying the parameters , ℎ and . As noted above, there is no single parameter ℎ or in the DCS+TU model, therefore we change all hospitalization probabilities and all intensities of the Hawkes processes so that the hospitalization probability averaged across the population and the infection probability averaged across contacts equal the desired values.

The LS algorithm finds patient zero by local greedy search. It keeps track of a candidate node, which is always the node with the earliest reported symptom onset time. We denote the candidate of the algorithm at iteration > 0 by , . We think of as a list, which is updated in each iteration of the algorithm, and we use the notation ,−1 for the last element of the list (i.e., the current candidate). In each iteration of the algorithm, we compute a new candidate denoted by ′ , and we append it at the end of the list at the beginning of the next iteration, unless ′ = ,−1 , in which case the algorithm terminates.

Fig. 4. Pseudocode and graphical explanation for the LS and LS+ algorithms. We use the same coloring as in Figure 2 (a). Black edges show the queried edges, a node with black X marks a negative test result, and red stroked node marks the node currently maintained as source candidate by the LS algorithm. We denote by the symptom onset time of symptomatic node and by ( ) the household of a node similarly to the main text.

Since we consider the SDCTF, the outbreak is detected when the first hospitalized case is reported. At that time, ′ is initialized to be the hospitalized patient, the test queue is initialized to be empty, and the algorithm is started. In the beginning of an iteration, if the test queue is empty, the household members and the "backward" contacts of the current candidate ,−1 are queried and are added to the test queue (see Figure 4 (a)). We define "backward" contacts as the set of nodes that have been

where ,−1 is the symptom onset time of current candidate ,−1 . The terms and model the standard deviation of the transition times, and they are set to zero for the DDE and to = 2 and = 1 for the DCS based on Table 3 . We note that the notion of "backward" contacts is only meaningful in the case of time-dependent network models; for the HNM, all neighbors are counted as backward contacts.

After the test queue is initialized, the agents inside the queue are tested (see Figure 4 (b)). Not all nodes can be tested on the same day because of the limitation on the number of tests available per day in the SDCTF; however, this has little effect because we do not proceed to the next iteration until the test queue becomes empty. Once the test results come back to the agency, if any of the (symptomatic) nodes reports an earlier symptom onset time than the current candidate ,−1 , then we update our next candidate ′ to be (see Figure 4 (c)). We note that the iteration does not stop immediately after ′ is first updated; the iteration runs until the test queue becomes empty, and until then, ′ can be updated multiple times. This is important in the theoretical results to prevent the algorithm from getting sidetracked (see Figure 9 ). We also experimented with a version of the LS and LS+ algorithms where the iteration stops immediately once ′ is updated; we call these algorithms LSv2 and LS+v2.

The main drawback of the LS algorithm is that it gets stuck very easily if there is even one asymptomatic node on the transmission path. For this reason, we introduce the LS+ algorithm, in which we enter the backward contacts of the asymptomatic household members of ,−1 , and the household members of any asymptomatic node into the testing queue (see Figure 4 (d)-(f)). Since the symptom onset times of asymptomatic nodes are not revealed, we define backward contact in this case as any contact in the time window

where ,−1 is still the symptom onset time of the current candidate ,−1 . Indeed, in the DDE model, since ,−1 was infected at ,−1 − ( + ), if infected ,−1 , agent must have been infectious at that time, which implies that could not have been infected later than ,−1 − ( + 2 ) or earlier than ,−1 − ( + 2 + ). In the DCS model, the terms and can be subtracted and added to the two ends of the queried time window to account for the randomness in the transition times.

Both algorithms stop if the testing queue becomes empty before a node with an earlier symptom onset time than ,−1 is discovered, and both algorithms return ,−1 as their inferred source. The high level pseudocode and an illustration of the LS and LS+ algorithms are given in Figure 4 .
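The LS iteration described above can be sketched in Python for the static (HNM-like) setting, in which every neighbor counts as a backward contact. This is only a minimal illustration under simplifying assumptions: the names `local_search`, `contacts`, `household` and `onset` are hypothetical, the query answers are passed in as precomputed dictionaries, and the one-day test delay, the daily test limit and the LS+ extension are omitted.

```python
from typing import Dict, Hashable, List, Optional, Set

Node = Hashable


def local_search(
    first_hospitalized: Node,
    contacts: Dict[Node, Set[Node]],     # hypothetical answers to edge queries
    household: Dict[Node, Set[Node]],    # hypothetical answers to household queries
    onset: Dict[Node, Optional[float]],  # symptom onset time, None if not revealed
) -> List[Node]:
    """Return the list of candidates; its last element is the inferred source."""
    candidates = [first_hospitalized]
    while True:
        current = candidates[-1]
        # Query the household members and the contacts of the current candidate.
        queue = (household.get(current, set()) | contacts.get(current, set())) - {current}
        best = current
        # Test the whole queue; the next candidate may be updated several times.
        for node in queue:
            t = onset.get(node)
            if t is not None and t < onset[best]:
                best = node
        if best == current:
            # No tested node reported an earlier onset time: terminate.
            return candidates
        candidates.append(best)
```

On a path of symptomatic nodes the sketch walks back to the earliest onset time, whereas a single node with no revealed onset time on the path stops it early, which mirrors the drawback of LS noted above.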

In this section we present theoretical results for the LS and LS+ algorithms described in Section 3.5. We follow a similar approach as in the non-rigorous computation in Section 2.2, which is useful but not necessary for understanding this section. All the statements are rigorously established, and whenever we reach a point where the computations would become intractable, we propose a simpler approximate model to study. One of the main contributions of this paper is to identify which computations can be done on more general models, and which computations need more simplified ones (see Figure 5 for an overview of the different models used for the computations in this section).

We compute the success probability of the LS and LS+ algorithms in two steps. In Section 4.1, we first assume that the length of the transmission path is known. This computation is then made possible by a tree approximation of the HNM, called the Red-Blue (RB) tree (defined in Section 4.1.1), and a slightly modified version of the DDE model called DDE NR (defined in Section 4.1.2). The RB tree preserves some of the household structure in the HNM, and therefore allows us to gain insight into the difference between the LS and LS+ algorithms, which would be difficult to obtain if we had worked on trees without taking the household structure into account.

For the second step, we would need to compute the distribution of the transmission path on the RB tree. However, finding a closed form expression is intractable. Instead, we combine the network and epidemic models into a growing random tree model, and we consider a -ary Random Exponential Tree (RET). The -ary RET model has only been studied for = 2 [11] ; we extend the results on their expected profile for general in Section 4.2.1. Nevertheless, working on -ary RETs still remains difficult, and therefore, in our last modeling step, we introduce a Deterministic Exponential Tree (DET) model, whose profile is close to the expected profile of the RET, and we compute the distribution of the transmission path on this model in Section 4.2.2.

To summarize all models considered in this paper, we have a data-driven and a synthetic model for simulations (DCS+TU and HNM+DDE), and an analytically tractable model (RB-tree+DDE NR ) where we can compute the success probability if the length of the transmission path is known. In a second stage, we compute the distribution of a transmission path on a deterministic tree (DET), which has a similar profile as a random tree (RET) that approximates our analytically tractable model. We visualize these five different models in Figure 5 (a), and we show by simulations in Figure 5 (b) that the distribution of the transmission path is similar in all of the considered models with appropriately scaled parameters. We compare our analytic results on the success probabilities of the LS and LS+ algorithms with our simulation results in Section 5.3 in Figure 7 .

In this section we introduce the Red-Blue (RB) tree model (which is a tree approximation to the HNM), and we calculate the exact probability that the LS and LS+ algorithms succeed, if the length of the transmission path is known.

4.1.1 Red-Blue tree models. In short, an RB tree is a two-type branching process with a deterministic offspring distribution that depends on ℎ and . The lack of randomness in this distribution makes us adopt the formalism of deterministic rooted trees.

Definition 4.1. Let a rooted tree, denoted by ( ), be a tree graph with a distinguished root node . Let and be two nodes connected by an edge in ( ). If ( , ) < ( , ), we say that is a parent of , otherwise is a child of . Moreover, if ( , ) = we say that is on level . An RB tree with parameters ( , ℎ ) is an infinite rooted tree, such that the nodes also have an additional color property. The root is always colored red and the rest of the nodes are colored red or blue. The root has red and ℎ blue children. Every other red node has − 1 red and ℎ blue children, and every blue node has red children and no blue children. Red nodes and their ℎ blue children partition the nodes of the RB tree ( ) into subsets of size ℎ + 1, which we call households.

Remark 4.1. In the RB tree, each blue node has degree + 1, and each red node has degree + ℎ , including the root of the tree (which is the source of the epidemic, when the RB tree is combined with an epidemic model).
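To make the offspring rules of Definition 4.1 concrete, the following Python sketch builds an RB tree truncated at a finite depth and computes node degrees, so that the degree statement of Remark 4.1 can be checked on the non-leaf nodes. The function names and the parameter names `d_c` (number of red children of the root) and `d_h` (number of blue children of a red node) are our own shorthand for this illustration.

```python
from typing import List, Tuple


def build_rb_tree(d_c: int, d_h: int, depth: int) -> Tuple[List[str], List[Tuple[int, int]], List[int]]:
    """Build the RB tree truncated `depth` levels below the root.

    Returns the node colors, the (parent, child) edges and the level of each node.
    """
    colors, edges, levels = ["red"], [], [0]
    frontier = [0]  # nodes on the current level; node 0 is the root
    for level in range(1, depth + 1):
        next_frontier = []
        for parent in frontier:
            if colors[parent] == "red":
                # The root has d_c red children, every other red node has d_c - 1.
                n_red = d_c if parent == 0 else d_c - 1
                children = ["red"] * n_red + ["blue"] * d_h
            else:
                # Blue nodes have d_c red children and no blue children.
                children = ["red"] * d_c
            for color in children:
                child = len(colors)
                colors.append(color)
                levels.append(level)
                edges.append((parent, child))
                next_frontier.append(child)
        frontier = next_frontier
    return colors, edges, levels


def node_degrees(n_nodes: int, edges: List[Tuple[int, int]]) -> List[int]:
    """Count the number of incident edges of every node."""
    deg = [0] * n_nodes
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg
```

In the truncated tree, only the nodes on the last level have fewer neighbors than in the infinite RB tree, so the degree check should be restricted to nodes on levels strictly below `depth`.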

The RB tree can be seen as a local tree approximation of the HNM. Let = ( , ) be an HNM with parameters ( , ℎ ), and let ∈ be the distinguished source node. In Section 3.2.2 we noted that the HNM can be approximated locally around the source node by replacing each node of an infinite ( ℎ + 1)-regular tree by a ( ℎ + 1)-clique, and setting the edges so that each node has degree exactly + ℎ , while keeping the connection between cliques unchanged. Let us call this infinite graph * . Although * is not a tree, all cycles in * must be contained entirely inside the households, which implies that in each household there exists exactly one node that has the minimal distance to the source. We will refer to these nodes with minimal distance to the source as the red nodes, and we color the rest of the nodes blue. In other words, the red nodes will be the first ones in their households to be infected. Let us now delete the edges between the blue nodes in * to obtain graph ′ . We claim that ′ is isomorphic to the RB tree ( ) rooted at the source .

Indeed, since the edges between blue nodes have been deleted in * to form ′ , each blue node has + 1 red neighbors and no blue neighbor, and since the edges incident to red nodes have been unchanged, each red node has red and ℎ blue neighbors, exactly as in the definition of RB tree above.

Note that a household in * is completely characterized by only specifying the colors of the nodes: a household always consists of one red node and of its ℎ blue children. We use this characterization as a definition for households in the RB tree ′ , because it does not depend on the edges from that are deleted in * , whereas this deletion makes the original definition of a household as a clique in unusable.

Next, we make some important observations about the behaviour of the LS and the LS+ algorithms on RB trees, which we prove in Appendix A.1. We start by formalizing the notion of transmission path.

Definition 4.2. Let ℎ be the first hospitalized node and be the source. We call the path ( = 0 , 1 , ... = ℎ), where is the infector of +1 for 0 ≤ < , the transmission path. Also we call the path ( , −1 , ... 1 ) the reverse transmission path.

Remark 4.2. Note that in an RB tree, each household traversed by a transmission path shares one (the red node in the household) or two (the red node of the household and one of its ℎ children in the household) nodes with this path. Moreover, the red node of a household traversed by a transmission path is followed by another red node on the path (in another household) if it is the only node of that household on the transmission path, whereas it is followed by a blue node (in the same household) if two nodes of that household are on the transmission path. We note that the statement for LS+ in Lemma 4.3 cannot be reversed, i.e., it is possible that LS+ succeeds even if among the nodes of the transmission path, there is a household with no symptomatic node (see Figure 9 (a)). Also, the proof of Lemma 4.3 does not hold if the LS+ algorithm proceeds to the next iteration at the first time ′ is updated (see Figure 9 (b)). Finally, in the proof of Lemma 4.3, we do not make any assumptions about asymptomatic patients having had the disease previously or not, which implies that we could treat non-complying agents as asymptomatic patients without jeopardizing the correctness of the algorithms.

The DDE NR Model. Focusing on tree networks is an important step towards making our models tractable for theoretical analysis, but it will not be enough; we will make two minor simplifications to the DDE model as well: we eliminate (i) the pre-symptomatic state and (ii) the recovered state, and we call the new model DDE NR (where NR stands for No Recovery). (i) The first assumption can be made without loss of generality, because the pre-symptomatic state does not have any effect on the disease propagation, nor on the success of the source detection algorithm. Indeed, according to Lemma 4.3, the success of the LS and LS+ algorithms depends only on the information gained about the transmission path, and by the time of the first hospitalization, every node on the transmission path must have left the pre-symptomatic state (since we always have < + ), even if we include it in the model. (ii) The second assumption on the absence of recovery states amounts to taking → ∞, which does have a small effect on the disease propagation; however, this effect is minimal because = 14 is already quite large, and because only the very early phase of the infection is interesting for computing the success probabilities of the algorithms. Finally, this last assumption has no effect on the information gained by the algorithm since we assumed that recovered patients (who were symptomatic) can remember and reveal their symptom onset time in the same way as symptomatic infectious patients.

Assuming that the distribution of length of the transmission path is provided for us (we give an approximation in Section 4.2), the success probability of LS can be computed succinctly. We need a short definition before stating our result.

Definition 4.4. Let be the probability that a node is asymptomatic conditioned on the event that it is not hospitalized.

A simple computation shows that

Lemma 4.5. For the DDE NR epidemic model with parameters ( , , ℎ ) on the RB tree with parameters ( , ℎ ), and with computed in equation (5), we have

Proof. Let us reveal the randomness that generates the epidemic in a slightly modified way than in the definition (Sections 3.1.2 and 4.1.2). As before, at the beginning only the source is infectious, and depending on the course of the disease, the source can be symptomatic and hospitalized, symptomatic but not hospitalized, or asymptomatic with probabilities (1− ) ℎ , (1− ) (1− ℎ ), , respectively. In each moment, each infectious node infects each of its susceptible neighbors with probability . If a node is infected, we reveal the information whether it will become hospitalized (which happens with the probability (1 − ) ℎ ), but if it does not become hospitalized, we do not reveal whether the node is asymptomatic or symptomatic yet. Indeed, this information is not necessary for continuing the simulation of the epidemic since we assumed that there is no difference between the infection probabilities of symptomatic and asymptomatic nodes. Thereafter, when the first hospitalized case occurs, we reveal for each infected node on the transmission path (except the last node, which we know is hospitalised; see Definition 4.2) whether it is asymptomatic or not. The only information we have about these nodes is that they are not hospitalized, which implies that the probability that a node is revealed to be asymptomatic on the transmission path is exactly the probability from Definition 4.4 computed in (5) .

By Lemma 4.3, LS succeeds if and only if each node on the transmission path is symptomatic. Conditioning on the length of the transmission path, we can compute the probability of each node being symptomatic by equation (5) as

from which (6) follows immediately. □
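The conditional probability of Definition 4.4 can be obtained directly from the three outcome probabilities listed in the proof. The Python sketch below is a minimal illustration, assuming the shorthand `p_a` for the probability of being asymptomatic and `p_h` for the probability that a symptomatic node is hospitalized (both names, and the function names, are our own). The second function gives the probability that a given number of non-hospitalized nodes are all symptomatic, where the count is left to the caller so that no indexing convention for the transmission path is assumed.

```python
def asymptomatic_given_not_hospitalized(p_a: float, p_h: float) -> float:
    """P(asymptomatic | not hospitalized), from the outcome probabilities in the proof."""
    # A node is not hospitalized if it is asymptomatic (probability p_a) or
    # symptomatic but not hospitalized (probability (1 - p_a) * (1 - p_h)).
    not_hospitalized = p_a + (1.0 - p_a) * (1.0 - p_h)
    return p_a / not_hospitalized


def all_symptomatic_probability(p_a: float, p_h: float, n_nodes: int) -> float:
    """Probability that `n_nodes` independent non-hospitalized nodes are all symptomatic."""
    q = asymptomatic_given_not_hospitalized(p_a, p_h)
    return (1.0 - q) ** n_nodes
```

Note that the denominator equals one minus the hospitalization probability `(1 - p_a) * p_h`, so the two ways of writing the conditional probability agree.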

Computing the success probability of the LS+ algorithm is far more challenging compared to the LS algorithm, even if the distribution of the length of the transmission path is provided to us. Indeed, since the LS+ algorithm does further testing on the contacts and household members of asymptomatic nodes, it is essential to have additional information about the number of households on the transmission path. We give our main result on the LS+ in the next theorem, which we prove in Appendix A.2.

Theorem 4.6. Let be as in (5) and let S( , , ) be the set of integer values such that and have different parity and + 1 − 2( + ) ≥ ≥ 2 − ( + ). Then, for the DDE NR epidemic model with parameters ( , , ℎ ) on the RB tree with parameters ( , ℎ ), we have

Section 4.1 was dedicated to the success probability of the LS and LS+ algorithms; however, in these results, we are still missing the distribution of the transmission path length. In this subsection we address this problem by introducing simpler approximate models.

4.2.1 ( , )-ary Random Exponential Tree. When we introduced the DDE NR model in Section 4.1.2, we removed both parameters and from the DDE model (by removing the presymptomatic and the recovered states, respectively), but we kept the parameter . In this step we will rescale the time parameter to make ′ = 1 by changing ′ to be 1 − (1 − ) . Since we had = 3 by default, using ′ and ′ instead of and means that we choose 3 days to be our time unit, and the probability of infection is scaled to be the probability that the infection is passed in at least one of three days (since the RB tree is time-independent, if two nodes are connected, the infection can spread on it every day). We drop the prime from ′ and ′ for ease of notation. As a second approximation, instead of keeping track of two types of nodes (red and blue) as it is done in the RB tree, we propose to change our network model to an infinite -regular tree, where is set to be the average degree of an RB tree.

By making these two changes (tracking time at a coarser scale and simplifying the network topology to a -regular tree), the growth of the epidemic becomes equivalent to a known model, the -ary Random Exponential Tree ( -RET). Binary RETs have been introduced in [11] . We give the definition below for completeness. Definition 4.7. A -ary Random Exponential Tree ( -RET) with parameters , at time day , denoted by ( ), is a random tree rooted at node . At day 0, the tree ( ) only has its root node . Let¯( ) be the closure of ( ), which is obtained by attaching external nodes to ( ) until every internal node (a node that was already present in ( )) has degree exactly in the graph ( ). Then, +1 ( ) is obtained from¯( ) by retaining each external node with probability , and dropping the remaining external nodes.

Indeed, each node of a -RET infects a new node with probability each day, and after a sufficiently long time, the -RET becomes close to a large -ary tree. Of course, we do not want to let the -RET grow for a very long time; we only want it to grow until the first hospitalization occurs. So far we have not talked about the course of the disease of the nodes in the -RET model because we could define the spread of the infection without it. Since we still need to do one final simplification to compute the distribution of the transmission path, we defer the discussion about hospitalizations, and how the parameters and ℎ are part of the model, to Section 4.2.2. Note that by considering the -RET, we deviate from the idea of separating the epidemic and the network models; we only have a randomly growing tree, which is stopped at some time, when the tree is still almost surely finite.

So far we only did simplifications to the model, which resulted in further and further deviations from the original version. Now we will make a small modification that brings our model back closer to the RB tree, without complicating the computations too much. We still make almost all maximum degrees of the RET uniform , but we make an exception with the root, which will have maximum degree = + ℎ . This makes the maximum degree of the root the same as the degree of the root of the RB tree. We call the resulting model a ( , )-RET with parameter . Since the close neighborhood of the source has a high impact on the success probability, we found that this solution gives the best results while keeping the computations tractable.
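The growth rule of the RET with a distinguished root degree can be illustrated by a short simulation. The Python sketch below is a minimal example under our own naming: `d_root` is the maximum degree of the root, `d` is the maximum degree of every other node (so a non-root node has at most `d - 1` children), and `p` is the probability that an external node is retained on a given day. It returns the profile of the tree, i.e., the number of nodes on each level.

```python
import random
from typing import List, Optional


def simulate_ret_profile(d_root: int, d: int, p: float, days: int,
                         rng: Optional[random.Random] = None) -> List[int]:
    """Grow a RET for `days` days and return the number of nodes on each level."""
    rng = rng or random.Random()
    # free_slots[l] stores, for each node on level l, its number of unused child slots.
    free_slots: List[List[int]] = [[d_root]]
    for _ in range(days):
        born_per_level = [0] * (len(free_slots) + 1)
        for level, slots in enumerate(free_slots):
            for i, free in enumerate(slots):
                # Each external node (free slot) is retained with probability p.
                born = sum(rng.random() < p for _ in range(free))
                slots[i] = free - born
                born_per_level[level + 1] += born
        # Nodes added today become internal nodes only from the next day on.
        for level, born in enumerate(born_per_level):
            if born:
                while len(free_slots) <= level:
                    free_slots.append([])
                free_slots[level].extend([d - 1] * born)
    return [len(slots) for slots in free_slots]
```

For `p = 1` the tree is filled level by level, which gives a quick sanity check: after `t` days, level `l` holds `d_root * (d - 1) ** (l - 1)` nodes for every `1 <= l <= t`. Averaging the returned profiles over many runs gives an empirical estimate of the expected profile.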

In our computations, only the profile of the infection tree will be important, which motivates the next definition. 

with −1, = 0 for all , and its expectation = E[ ].

As noted earlier, the -RET model has only been analysed for = 2 to this date. We provide the expected number , of nodes at level in day for the general case in the next theorem and corollary, which we prove in Appendices A.3 and A.4. Theorem 4.9. In the ( , )-RET with parameter , let , be as in Definition 4.8. Then

, = 0, for l > t.

Corollary 4.10. In the RET( , , ), let be the expectation of (12), as in Definition 4.8. For ≥ 0,

Tree with Parameters , ℎ and ( , ) , ∈N . In the ( , )-RET model it is still complicated to calculate the distribution of the depth of the first hospitalized node. For this reason, we approximate the RET model by a deterministic time-dependent tree with a prescribed profile.

Definition 4.11. Let ( , ) ∈N {−1}, ∈N be a two-dimensional array with , = 0 for ∈ {−1, 0} and ∈ N, except for 0,0 = 1, and with , ≥ , −1 for any and any ≥ 1. Additionally, if we define = , , then the array ( , ) must satisfy > −1 for ≥ 0. Then, we define the Deterministic Exponential Tree (DET) with parameter ( , ) ∈N {−1}, ∈N , as a time-dependent rooted tree, that has exactly , nodes on level at time . The edges between the adjacent levels are drawn arbitrarily so that the tree structure is preserved.

The formal assumptions on the array ( , ) are simply made to ensure that the DET starts with a single node at = 0, that it never shrinks on any level ( , ≥ , −1 ), and that it grows by at least one node in each time step ( > −1 ).

We have defined the DET at any given time ; however, to determine the length of the transmission path, we are not interested in the DET at any given time, but only when the first hospitalization occurs. To compute the distribution of the first hospitalized node, we would like to have an absolute order on the times when the nodes are added, which we do by randomization. We say that on day , nodes are added one by one to the DET, their order given by a uniformly random permutation, and each node is hospitalized with probability (1 − ) ℎ (as in the original DDE model). When the first hospitalization occurs, we stop growing the tree, and we call the resulting (now random) model a stopped DET with parameters ( , ), , ℎ . We find the transmission path length distribution on the stopped DET in the next lemma, which we prove in Appendix A.5. 
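The randomized stopping rule of the stopped DET can be sketched as a sampling procedure. The Python example below is an illustration under stated assumptions rather than the computation used in the lemma: `profile[t][l]` is a hypothetical array holding the number of nodes on level `l` at day `t`, `p_a` and `p_h` are our shorthand for the asymptomatic and hospitalization probabilities, and we assume that the root, added on day 0, is subject to the same hospitalization draw as every other node. The function returns the level of the first hospitalized node, or `None` if no hospitalization occurs within the supplied horizon.

```python
import random
from typing import List, Optional, Sequence


def sample_first_hospitalized_level(profile: Sequence[Sequence[int]],
                                    p_a: float, p_h: float,
                                    rng: Optional[random.Random] = None) -> Optional[int]:
    """Sample the level of the first hospitalized node in a stopped DET."""
    rng = rng or random.Random()
    p_hosp = (1.0 - p_a) * p_h  # hospitalization probability of a newly added node
    previous = [0] * len(profile[0])
    for today in profile:
        # One entry per node added today, labelled by its level.
        new_levels: List[int] = [
            level
            for level, (now, before) in enumerate(zip(today, previous))
            for _ in range(now - before)
        ]
        # Nodes of the same day are added in a uniformly random order.
        rng.shuffle(new_levels)
        for level in new_levels:
            if rng.random() < p_hosp:
                return level  # the tree stops growing here
        previous = list(today)
    return None
```

Repeating the sampling many times and tabulating the returned levels gives an empirical distribution that can be compared against a closed-form expression for the transmission path length.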

We would like to set , so that the DET is close to the RET described in Section 4.2.1. For equation (17) to make sense, we should substitute integer values for , ; however, for an approximation the equation can also be evaluated for fractional values.

Remark 4.4. If we substitute , = , and = in equation (17), where , is given in Theorem 4.9 and is computed in Corollary 4.10, then we get the expression

which approximates the distribution of the transmission path length in the ( , )-ary RET stopped at the first hospitalization.

5.1.1 Non-adaptive Baseline: Dynamic Message Passing. There are few sensor-based source detection algorithms that are compatible with time-varying networks in the literature [10, 14, 16] . The most promising one among these algorithms [16] has a close resemblance to a previous work of [22] on Dynamic Message Passing (DMP) algorithms. Given the initial conditions on the identity of the source node and its time of infection, the DMP algorithm approximates the marginal distribution of the outcome of an epidemic at some later time . The algorithm is exact on tree networks, and it computes a good approximation when there are not too many short cycles in the network. Therefore, the DMP algorithm can be used to approximate the likelihood of the observed symptom onset times for any (source,time) pair. Due to its flexibility, we were able to adapt the DMP algorithm to the SDCTF (see Appendix B for more details).

Originally, the DMP was applied to the source detection problem by computing the likelihood values for all possible (source,time) pairs, and then choosing the source node from the most likely pair as the estimate [22] . However, testing all (source,time) pairs increases the time complexity of the algorithms potentially by a factor of 2 , which makes the algorithm intractable in many applications. Jiang et al. [16] proposed a very similar algorithm to the DMP equations (which is unfortunately not exact even on trees), and solved the issue of intractability by a heuristic preprocessing step to the DMP algorithm. This preprocessing step identifies a few candidate (source,time) pairs by spreading the disease backward from the observations in a deterministic way (called reverse dissemination). Since we already approximate our data-driven model (DCS) by an epidemic model with deterministic transition times (DDE), it is natural for us to also implement the deterministic preprocessing step proposed by [16] . We produce 5 (source,time) pairs which are feasible for the 5 earliest symptom onset time observations (see Appendix B.3 for more details). It would have been ideal to run the algorithms for more than 5 pairs, but this was made impossible by the runtimes becoming very high. We therefore run our implementation of the DMP algorithm with the previously computed feasible (source,time) pairs as initial conditions to find the most likely source candidate.

The source estimation algorithms developed using the DMP algorithm do not specify how the sensors should be selected, and therefore place these non-adaptive sensors randomly. We refer to the resulting algorithm as random+DMP. The number of sensors is set so that it always exceeds the number that LS/LS+ would use. The simulation results are shown in Figure 6 for the DDE+HNM model. Importantly, the deterministic preprocessing step of [16] is compatible with time-varying networks, which allows us to run the algorithm for the DCS+TU model as well (see Figure 8 ).

Baseline: Size-Gain. The Size-Gain (SG) algorithm was developed for epidemics which spread deterministically [45] , and has been later extended to stochastic epidemics [37] . It works by narrowing a candidate set based on a deterministic constraint. If 1 , 2 are symptomatic observations, then is in the candidate set of SG if and only if

where is the standard deviation of the infection time of a susceptible contact. If one of the observations, say 2 , is negative, then SG uses a condition almost identical to equation (19) , except that the absolute value is dropped, since a negative observation at time 2 is only a lower bound on the true symptom onset time of 2 . These deterministic conditions are checked for every symptomatic-symptomatic or symptomatic-negative pair ( 1 , 2 ) to determine if can be part of the candidate set. Next, SG places the next sensor adaptively at the node which reduces the candidate set by the largest amount in expectation (assuming a uniform prior on the source and its infection time), and it terminates when the candidate set shrinks to a single node. Note that the SG algorithm can fail if at least one of the deterministic conditions in equation (19) is violated for some ( 1 , 2 ) because of the randomness of the epidemic.

We use the existing implementation of the SG algorithm by [37] , and adapt it to the SDCTF. We incorporate asymptomatic-symptomatic and asymptomatic-negative observations ( 1 , 2 ) the same way as symptomatic-negative observations are incorporated; we drop the absolute value sign in equation (19) , because an asymptomatic observation at time 1 is only an upper bound on the true symptom onset time of 1 . We impose the same daily limit to the number of sensors that can be placed by the SG algorithm in a single day as for the LS/LS+ algorithm, and if the candidate set size does not shrink to one on the day when both LS and LS+ have already provided their estimates, then the SG algorithm must make a uniformly random choice from the current candidate set as its source estimate. The simulation results are shown in Figure 6 for the DDE+HNM model. We do not implement the SG algorithm for the DCS+TU model, because its runtime is too high, and because it is not clear how it should be implemented for time-varying networks.

Fig. 6. The performance of the algorithms LS, LS+, R and SG if the metric is the probability of finding the source (solid curves) or the first symptomatic patient (dashed curves). The simulations were computed on a population of = 400 individuals in the DDE model on the HNM, and each datapoint is the average of 4800 independent realizations except for the SG algorithm, which was run with 192 independent realizations. The confidence intervals for the success probabilities are computed using the Wilson score interval method, and for the tests and the queries using the Student's t-distribution.

We show our simulation results comparing the random+DMP, SG, LS and LS+ algorithms in Figure 6 . In the first row of Figure 6 , we show the accuracy of the algorithms with solid curves. Since the LS/LS+ algorithms cannot detect an asymptomatic source, we also show what the accuracy would look like if the goal of the SDCTF was to detect the first symptomatic agent with dashed lines. It is clear that in both metrics and across a wide range of parameters, the LS+ algorithm performs best, followed by LS, next random+DMP, and finally SG. The only exception is for high values of , where SG performs best. The good performance of SG for these parameters is expected, because SG was originally developed for deterministically spreading epidemics (i.e., = 1).

In the second row of Figure 6 , we show the number of test/sensor queries used by the algorithms. LS uses the fewest tests, followed by LS+. The random+DMP and SG algorithms always use more tests than LS/LS+, except for large values of . Finally, in the last row of Figure 6 we show the number of contact (or in this case edge) queries used by the algorithms. Again, LS uses fewer queries than LS+, while both the random+DMP and SG algorithms query essentially the entire network. Figure 6 shows that the LS/LS+ algorithms are fairly robust to changes in the parameters of the model, except for the parameter . Indeed, if there are many asymptomatic nodes in the network, then source detection becomes very challenging. It may be surprising that as grows, the number of tests that LS uses decreases, contrary to LS+. This is because as grows, the LS algorithm gets stuck more rapidly, while the LS+ algorithm compensates for the presence of asymptomatic nodes by using more test/sensor queries. The analytic results from Section 4 are in good agreement with the simulation results in Figure  7 . We also experiment with changing the parameters ℎ , while keeping all the parameters fixed, and with changing while keeping the product fixed. We observe that LS is not affected by the parameter ℎ , whereas LS+ performs better with a higher , which is expected because LS+ leverages the household structure of the network to improve over LS. Somewhat surprisingly, we also observe that a higher also improves the performance of both algorithms. This can be explained by the fact that a larger implies that there are more nodes in the close neighborhood of the source, which results in shorter transmission paths, making source detection less challenging. 
Finally, if we increase but keep fixed, the performance of the algorithms does not change as much, which confirms the intuition that it is the number of infections caused by an infectious node in a single day that matters the most (as we discussed in Section 2).

We show our simulation results on our most realistic DCS+TU model in Figure 8. We make very similar observations on this model as the ones that we have made on the DDE+HNM model in Sections 5.2 and 5.3, which shows that the LS/LS+ algorithms and our analysis of their performance are robust to changes in the epidemic and network models.

In the DCS+TU model, we used a fixed limit on the number of sensors that the random+DMP algorithm selects, instead of setting the limit based on the LS+ algorithm. As a result, for a few parameters the LS+ algorithm used more tests than the random+DMP algorithm. However, we note that by updating the candidate node immediately after an earlier symptom onset time is revealed (see Section 3.5), we can essentially cut the number of required tests for the LS+ algorithm by half (LSv2 and LS+v2), without sacrificing the performance of the algorithms.

We have introduced the LS and LS+ algorithms in the SDCTF, and we have used a sequence of models on which we can compute their accuracy (probability of finding the correct source) rigorously. We find that both LS and LS+ outperform baseline algorithms, even though the baselines essentially query all contacts on a transmission path between agents, while LS and LS+ query only a small neighborhood of the source. One could argue that LS and LS+ beat the baseline algorithms only because we benchmark them in our own framework, which is different from the framework for which the baseline algorithms were developed. However, we argue that the LS/LS+ algorithms are robust to changes in the framework due to their simplicity, and we support our argument by simulation results. The runtimes of the LS/LS+ algorithms are also much lower than those of the baselines and do not depend on the network size since they are local algorithms, as opposed to the baselines, which have quadratic or even larger dependence on the network size. The "low-tech" approach in the design of the LS/LS+ algorithms increases their potential to be implemented in real-world scenarios, possibly even in a decentralized way, similarly to contact tracing smartphone applications [41], which is an interesting direction for future work.

(a) [...] the middle household contains no symptomatic node (only the asymptomatic node 3 ), but the LS+ algorithm still succeeds. Indeed, at iteration 0 we set ,0 = 5 , after which we find that 3 is asymptomatic, and next that 2 is asymptomatic and 4 is symptomatic, with a lower symptom onset time than 5 . Hence, in iteration 1 we set ,1 = 4 , and we find that 3 , 2 are asymptomatic and 1 is symptomatic, with a lower symptom onset time than 4 . Finally, in iteration 2 we set ,2 = 1 , and we find ′ = 1 = ,2 , which implies that the algorithm stops, and returns the correct source 1 .
(b) An example for an epidemic where the LS+ algorithm would fail if we updated the candidate before the test queue becomes empty. Similarly to subfigure (a), in iteration 0 the algorithm first learns about asymptomatic node 3 and next about asymptomatic node 2 and symptomatic node 4 . If the algorithm updates the candidate to 4 and continues further, instead of scheduling the tests of the household members of 2 , then it is not hard to check that 4 will be the final estimate and the algorithm fails. However, if the algorithm waits until the test queue becomes empty and tests the household members of 2 , then 1 becomes the next candidate and the algorithm succeeds.

We start by restating the lemma here for convenience. Lemma A.1. In the RB tree network, the LS algorithm succeeds if and only if all nodes on the transmission path are symptomatic, and the LS+ algorithm succeeds if among the nodes of the transmission path, there exists a symptomatic node in each household, and the source is symptomatic.

Proof. Throughout the proof we assume that there is no limitation on the number of available tests. We can make this assumption because in the SDCTF there is only a daily limit on the number of tests, there is no limitation on the number of days, and neither the LS nor the LS+ algorithm proceeds to the next iteration until the test queue becomes empty, which implies that all nodes that enter the test queue eventually get tested.

Suppose that the LS algorithm succeeds. Then the list of candidate nodes at different iterations forms a path that consists entirely of symptomatic nodes between the source and the first hospitalized node. In tree networks, the transmission path is the only path between the source and the first hospitalized node, which yields the "only if" part of the statement on the LS algorithm.

Next, suppose that all nodes on the transmission path are symptomatic. Then, we claim that the candidate node , computed in the ℎ iteration of the LS algorithm is − , the ℎ node of the reverse transmission path. Our claim is definitely true for = 0, because ,0 is initialized to be the first hospitalized node . Then, the proof proceeds by induction. By the induction hypothesis, in the ℎ step, , = − , and since we are on a tree, the symptom onset time of −( +1) (which is revealed because all nodes on the transmission path are symptomatic by assumption) is the only symptom onset time among the neighbors of , that have a lower symptom onset time than , itself. Therefore ′ = −( +1) , and , +1 is updated to be −( +1) in the beginning of the next iteration, which proves that the induction hypothesis holds until the source is reached.

Finally, suppose that among the nodes of the transmission path, there exists a symptomatic node in each household, and the source is symptomatic. Let us denote by the ℎ symptomatic node of the reverse transmission path. Then, we claim that the candidate list , computed in the ℎ iteration of the LS+ algorithm equals . Similarly to the case of the LS algorithm, the = 0 case holds by definition, and we proceed by induction. Suppose that , = . It will also be useful to define the index of on the forward transmission path (without skipping asymptomatic nodes). Let be this index, for which therefore = . Now we distinguish 3 cases: (i) −1 = +1 is symptomatic, (ii) −1 is asymptomatic and −2 = +1 is symptomatic, and (iii) −1 and −2 are asymptomatic and −3 = +1 is symptomatic. We claim that there are no more cases, and that in all three cases +1 is tested in the ℎ iteration of the LS+ algorithm. Case (i) is immediate because all neighbors of , are tested. Case (ii) is only possible if either −1 ∈ ( , ) or −2 ∈ ( −1 ), otherwise −1 would be a lone asymptomatic node in a household, which contradicts the assumption that there is a symptomatic node in each household. Since all the contacts of asymptomatic nodes in ( , ) (see Figure 4 (d)) and all nodes in the household of asymptomatic nodes are tested in the LS+ algorithm (see Figure 4 (e)), −2 must be tested too. Finally, case (iii) is possible only if −1 ∈ ( , ) and −3 ∈ ( −2 ) both hold, otherwise −1 or −2 would be a lone asymptomatic node in a household. Similarly to the previous case, −3 must be tested (see Figure 4 (f)). There are no more cases because, by Remark 4.2, on the RB tree a transmission path can only have two nodes in each household, and we assumed that there exists a symptomatic node in each household among the nodes of the transmission path.

After we proved that +1 is tested in the ℎ iteration of the LS+ algorithm, we must still show that it will be the next candidate , +1 for the induction hypothesis to hold. This is true because once the symptom onset time of +1 is revealed, none of its neighbors are scheduled for testing, and therefore all tested nodes have +1 on their path to the source, which means that +1 must have the lowest revealed symptom onset time, and therefore that it will be the next candidate , +1 . □
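The two conditions of Lemma A.1 are easy to check mechanically once a transmission path is given. The sketch below is only an illustration: the function names and the encoding of a path as two parallel lists (index 0 is the source, the last index is the first hospitalized node) are our own assumptions. Note that the LS condition is an equivalence, whereas the LS+ condition is only sufficient, so `ls_plus_sufficient` returning `False` does not imply that LS+ fails.

```python
from typing import Dict, Hashable, Sequence


def ls_succeeds(symptomatic: Sequence[bool]) -> bool:
    """LS condition of Lemma A.1: every node on the transmission path is symptomatic."""
    return all(symptomatic)


def ls_plus_sufficient(symptomatic: Sequence[bool], household: Sequence[Hashable]) -> bool:
    """Sufficient LS+ condition of Lemma A.1.

    symptomatic[i] tells whether the i-th path node is symptomatic, and
    household[i] is a (hypothetical) household label of that node.
    """
    if len(symptomatic) != len(household):
        raise ValueError("both sequences must describe the same path")
    if not symptomatic or not symptomatic[0]:
        return False  # the source itself must be symptomatic
    # For each household visited by the path, record whether it contains
    # at least one symptomatic path node.
    has_symptomatic: Dict[Hashable, bool] = {}
    for is_sym, hh in zip(symptomatic, household):
        has_symptomatic[hh] = has_symptomatic.get(hh, False) or is_sym
    return all(has_symptomatic.values())
```

For example, a path with flags `[True, False, True]` fails the LS condition, but it satisfies the LS+ condition if the first two nodes share a household.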

We need to prove a few intermediate results before proving Theorem 4.6. A first step is to count all the possible paths from the source with a given length.

Definition A.2. Let ( ) be the RB tree with parameters ( , ℎ ), and let be the source. A Red-Blue (RB) path of length is any path of nodes in ( = 0 , 1 , . .. ) such that ( , +1 ) ∈ ′ for 0 ≤ < . Let C be the set of RB paths of length . Lemma A.3. In the RB tree with parameters ( , ℎ ), | 0 | = 1, while for ≥ 1,

Proof. Let us keep track of the number of RB paths of length depending on the color of the last node in the path. Let and be the numbers of RB paths of length such that the last node is red and blue, respectively. A RB path of length 0 consists only of the source, which implies that 0 = 1 and 0 = 0. The source has red and ℎ blue neighbours, which implies that 1 = and 1 = ℎ . Suppose that is an RB path of length ≥ 2. If the last node of is red, then the node before the last node can be both blue or red. Red nodes other than the source have − 1 red children, while blue nodes have red children, yielding

If the last node of is blue, then the node before has to be red. Since every red node, including the source, has ℎ blue children, we have

By substituting equation (25) into equation (24), we obtain the recurrence

We solve this recurrence equation by calculating the characteristic equation

whose roots are

yielding the general solution

where 1 , 2 are given by the initial conditions for = 0, 1

which are

From equations (24) and (25) we conclude that for ≥ 1,

and therefore

Inserting the values for 1 , 2 , 1 , 2 we obtain the desired result. □
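The counting argument in the proof of Lemma A.3 can be checked numerically. The sketch below follows the verbal description of the recurrences: the source has `d_c` red and `d_h` blue children, every other red node has `d_c - 1` red and `d_h` blue children, and a blue node has `d_c` red children and no blue children. The symbol names `d_c` and `d_h` are assumptions (the symbols are not legible in this text), and the closed form is our own solution of the resulting second-order recurrence, so it may be arranged differently from the statement of the lemma.

```python
import math
from typing import List, Tuple


def rb_counts(d_c: int, d_h: int, n_max: int) -> Tuple[List[int], List[int]]:
    """Numbers of RB paths of length n ending in a red / blue node, for n = 0..n_max."""
    red, blue = [1], [0]  # the path of length 0 is the (red) source alone
    for n in range(1, n_max + 1):
        if n == 1:
            red.append(d_c)  # the source has d_c red children
        else:
            red.append((d_c - 1) * red[n - 1] + d_c * blue[n - 1])
        blue.append(d_h * red[n - 1])  # a blue node is always preceded by a red node
    return red, blue


def red_closed_form(d_c: int, d_h: int, n: int) -> float:
    """Solution of r_n = (d_c - 1) r_{n-1} + d_c d_h r_{n-2} with r_0 = 1, r_1 = d_c.

    Assumes the two characteristic roots are distinct (true whenever d_c * d_h > 0).
    """
    disc = math.sqrt((d_c - 1) ** 2 + 4 * d_c * d_h)
    x1, x2 = ((d_c - 1) + disc) / 2, ((d_c - 1) - disc) / 2
    a1, a2 = (d_c - x2) / (x1 - x2), (x1 - d_c) / (x1 - x2)
    return a1 * x1 ** n + a2 * x2 ** n
```

The total number of RB paths of length n is then `red[n] + blue[n]`, which for n ≥ 1 equals `red[n] + d_h * red[n - 1]`.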

Since LS+ improves on LS by making use of the household structure of the network, we need further information about the household structure of the transmission paths. Recall that by Remark 4.2, households on transmission paths on an RB tree were characterized either by a single red node (that is followed by a red node), or a pair of consecutive red and blue nodes. The following definition and lemma refine our previous result on counting the number of RB paths by taking the household structure into account.

Definition A.4. Let = { = 0 , 1 , . . . , = ℎ} be a RB path of length . We say that a node on the path is in a -single-household if no other node from is in the same household as . Otherwise, we say is in a -multi-household. Given a path , let : C → {0, 1} be the indicator function that the source is in a -multi-household. Similarly, let : C → {0, 1} be the indicator function that the last node of path is in a -multi-household. 

The set , , , depends on 4 parameters, but only some combinations of these parameters make it non-empty. The following definition will be useful in this regard. 

In all other cases | , , , | = 0.

Proof. Since there are + 1 nodes on path , with in -single households and thus + 1 − of them in -multi-households, we must have

2 households along path in total. Clearly, the numbers and cannot be of the same parity for any RB path , which is thus assumed for the rest of the proof (this assumption is also part of Condition 1).

If = 0, then the source is also the first hospitalized node, and it is in a -single-household, which implies that | 0,1,0,0 | = 1. If = 1, then there are two cases: either the source is in the same -multi-household with the first hospitalized node, or both of them are in -single-households. The former case is possible via ℎ edges from the source, which gives | 1,0,1,1 | = ℎ , while the latter case is possible via edges, and gives | 1,2,0,0 | = . Since these are the only possible RB paths of length ≤ 1, we must have | 0, , , | = | 1, , , | = 0 for any other choice of parameters , and .

Let us assume that ≥ 2. Then, the source and the first hospitalized node are not in the same household. Let us denote the household of the source by and the household of the first hospitalized node by ℎ . Note that (1 − ) and (1 − ) are the indicators of and ℎ being -singlehouseholds, and therefore ≥ (1 − ) + (1 − ). If this inequality (which is also part of Condition 1) does not hold, then clearly |C , , , | = 0. Similarly, the number of -multi-households is − +1 −2+ + ways. Once we know the color of each node along the path, the number of RB paths can be computed by multiplying the numbers of children with the appropriate color of each node. -single-households have no blue nodes, and -multi-households have exactly one, which implies that there are − +1 2 blue nodes. Since blue nodes are preceded by red nodes that have ℎ blue children, they give the multiplicative factor − +1 2 ℎ . Blue nodes, except from the first hospitalized node (if it is blue), have red children. So far we have accounted for all of the nodes in -multi-households and none of the nodes in -single-households. If the source is in a -single-household, then we must count its red children, whose number is . This implies that there exist − +1 2 − + (1 − ) nodes with red children. Finally, each -single-household, except and/or ℎ in case they are -single households, has − 1 red children. There are − (1 − ) − (1 − ) such -single-households, which gives the final term in equation (40) . □

The sets C , , , define equivalence classes on the transmission paths based on their household structure. In the next lemma we show that once we know which equivalence class we are in, it is possible to compute the success probability of the LS+ algorithm.

Lemma A.6. Let be the transmission path in the DDE NR epidemic model with parameters ( , , ℎ ) on the RB tree with parameters ( , ℎ ), and let be as computed in (5) . Then, it holds that P( + succeeds| ∈ C 0,1,0,0 ) = 1 and P( + succeeds| ∈ C 1,0,1,1 ) = P( + succeeds|C 1,2,0,0 ) = 1 − .

Let , ∈ {0, 1}, let ≥ 2 and let ∈ N satisfy Condition 1. Then, it holds that P( + succeeds| ∈ C , , , ) ≥ (1 − )

In all other cases P( + succeeds| ∈ C , , , ) is not defined.

Proof. If = 0, then = 1 and = = 0. In that case, the source is the first hospitalized node and LS+ always succeeds. If = 1, then the first hospitalized node is in the neighbourhood of the source, and LS+ succeeds if and only if the source is symptomatic, which happens with probability 1 − .

Let us assume that ≥ 2 and that satisfies Condition 1 (otherwise |C , , , | = 0 and P( + succeeds| ∈ C , , , ) is not defined). By Lemma 4.3 the LS+ algorithm succeeds in the DDE NR model on the RB tree if, among the nodes of the transmission path, there exists a symptomatic node in each household, and the source is symptomatic, which means that we can prove a lower bound on the success probability of LS+. Let us assume that the source is indeed symptomatic. Since the first hospitalized node is symptomatic by definition, the households of the source and of the first hospitalized node cannot make the LS+ algorithm fail. Let us denote these two households by and ℎ , respectively. Also, let and be the sets of all -multi-and -single-households, respectively, excluding and ℎ . Then, LS+ succeeds if all nodes in the households of are symptomatic, and if at least one node in the households of is symptomatic, which has probability 1 − and 1 − 2 for each type of household, respectively, by equation (5). These observations yield that

□

Finally, we are ready to state and prove Theorem 4.6 on the success probability of LS+, which we restate here for convenience.

Theorem A.7. Let be as in (5) and let S( , , ) be the set of values that satisfy Condition 1. Then, for the DDE NR epidemic model with parameters ( , , ℎ ) on the RB tree with parameters ( , ℎ ) we have

where , 1 and 2 are terms depending on parameters and ℎ and are computed explicitly in Lemma A.3.

Proof 

Unlike P( + succeeds| ∈ , , , ), is defined for every 4-tuple of parameters ( , , , ) ∈ N × N × {0, 1} × {0, 1}. By the law of total probability we expand the success probability by conditioning on the path being of length as

( , , , )P( ∈ , , , )

( , , , )P( ∈ , , , | ∈ )P( ( , ℎ) = ).

Next, we exchange the sums over , and . This allows us to sum over only those values that satisfy Condition 1, which implies that P( + succeeds| ∈ , , , ) is well-defined. As in Lemma A.5, we need to treat the = 0 and = 1 cases separately. Continuing equation (45), we arrive at

Substituting in the results from Lemmas A.3, A.5 and A.6 into equation (46) gives the desired result. □

We start by restating Theorem 4.9 for convenience.

Theorem A.8. In the ( , )-RET with parameters , , ℎ , let , be as in Definition 4.8. Then

Proof. Similarly to [11, 25] , the proof relies on generating functions. We start by addressing the boundary cases. For all ≥ 0, it holds that ,0 = 1, and therefore ,0 = 1. Similarly, for all , such that > , it holds that , = 0, and therefore , = 0. Suppose that ≥ = 1. During day − 1, on the first level, there are −1,1 infected (internal) nodes and − −1,1 (external) nodes that may be infected with probability during day . Thus,

Taking the expectation of both sides in equation (50) yields

By subtracting the appropriate recurrence equations for ,1 and −1,1 for ≥ 2 we obtain the homogeneous recurrence equation 

and boundary conditions 0,1 = 0 and 1,1 = . We solve for ,1 using the same methods as in the proof of Lemma A.3 and obtain

Next, let us consider the general case ≥ > 1. On day − 1, there are −1, −1 nodes on level − 1. Since, each node on level − 1 has children, there are −1, −1 nodes on level that have an infectious parent on level − 1. However, −1, of them are already infected. Therefore −1, −1 − −1, nodes of level may be infected on day , each with probability , which implies

Taking the expectation of both sides in equation (54) yields

For convenience, let us introduce = 1 − and = , and also let

be the generating function for , with , ≥ 1. By multiplying (55) by and summing it over , ≥ 2 we obtain

(57)

and by inserting (58) into (57), we obtain

Now, we can also decompose the sum (58) using geometric series as

By plugging (60) into (59), we obtain the expression

Expanding the fractions in (61) into power series and then applying the binomial theorem, we arrive at

Let = 1 + + and = + 1. In order to obtain an expression for , , we must change the variables in the sums of equation (62) from ( , , ) to ( , , ). Changing the inner sum from variable to is simple. Changing the variables in the two outer sums is more challenging because , and depend on each other in a nontrivial way. More precisely, since , ≥ 0 we have ≥ 1 and also ≤ − 1, which means that we have to set the lower limit of and the upper limit of accordingly. As for the remaining limits, variable can be arbitrarily large, and can take any integer value starting from 0 independently of , which yields the expression

For the values of with ≥ + 1, the binomial coefficient −1 is 0, which implies that we can increase the upper limit of the inner sum from + 1 to in equation (63). Then,

Finally we can read off the value of , from equation (64) as

□
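The recurrences used in this proof can be iterated directly, which gives a simple numerical check of the closed form. The sketch below assumes the reading of the proof in which every node has `d` children and each not-yet-infected child of an infected node becomes infected independently with probability `p` on each day; the names `d`, `p` and `expected_profile` are our own. Taking expectations in the two recurrences gives x[n][k] = (1 - p) x[n-1][k] + p d x[n-1][k-1], with x[n][0] = 1 and x[0][k] = 0 for k ≥ 1.

```python
from typing import List


def expected_profile(d: int, p: float, n_days: int, max_level: int) -> List[List[float]]:
    """x[n][k]: expected number of infected nodes on level k after day n."""
    x = [[0.0] * (max_level + 1) for _ in range(n_days + 1)]
    for n in range(n_days + 1):
        x[n][0] = 1.0  # the root (source) is always infected
    for n in range(1, n_days + 1):
        for k in range(1, max_level + 1):
            # Nodes already infected stay infected; each of the
            # d * x[n-1][k-1] - x[n-1][k] exposed nodes is infected w.p. p.
            x[n][k] = (1.0 - p) * x[n - 1][k] + p * d * x[n - 1][k - 1]
    return x
```

On the first level the recurrence has the elementary solution x[n][1] = d (1 - (1 - p)^n), which agrees with the boundary conditions x[0][1] = 0 and x[1][1] = p d used in the proof, and x[n][k] = 0 whenever k > n.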

We start by restating Corollary 4.10 for convenience.

Corollary A.9. In the RET( , , ), let be the expectation of (12), as in Definition 4.8. For ≥ 0,

Proof. By using linearity of expectation, equation (12) and Theorem 4.9 we obtain:

Before we use the binomial theorem, we need to swap the sums. The boundaries from (67) are equivalent to − 1 ≥ ≥ − 1 ≥ 0, so we can rewrite them as two conditions: + 1 ≥ ≥ 1 and ≥ ≥ 0.

Finally, by applying the binomial theorem and summing the geometric series, we obtain the desired equation:

□

A.5 Proof of Lemma 4.12

We restate Lemma 4.12 here for convenience.

Lemma A.10. Let us consider the stopped DET model with parameters ( , ), , ℎ , and let ℎ denote the first hospitalized node. Then

Proof. Recall that a node added at day is uniformly distributed among the − −1 > 0 nodes added that day, and that the number of nodes added to level is , − −1, on day . If we condition on the time of the first hospitalized case, denoted by ℎ , then 

The DMP equations were first derived by [22] for the SIR model in the context of source detection. Their goal is to compute the marginal probabilities that node is in a given state at time (denoted by ( ), ( ) and ( ) for the susceptible, infected and recovered states, respectively), given initial conditions ( 0 ), ( 0 ) and ( 0 ) at some initial time 0 . To solve this problem in tree networks, we may consider a dynamic programming approach, where we delete a node , we compute the marginal probabilities of ( − 1) for all neighbors of in the remaining subtrees, and use this information to compute ( ) (as the marginals are independent in each of the subtrees conditioned on the state of ). The DMP equations make the dynamic programming intuition explicit. Originally, the DMP equations were developed for static networks, but since the generalization to time-varying networks is straightforward, and has already been foreshadowed in a similar heuristic algorithm [16] , we include it in this preliminary section. For time-varying networks, we define ( ) as the set of neighbors of node in the time-window [ , + 1).

To formalize the dynamic programming approach, [22] introduces some new notation. Let be the probability that an infectious node infects a susceptible neighbor, and let be the probability that an infectious node recovers. Let be the auxiliary dynamics, where node receives infection signals, but ignores them, and thus remains in the state at all times. Let → ( ) be the probability that node is in the state at time in the dynamics , and let → ( ) be the probability that the infection signal has not been passed from node to node up to time in the dynamics . Finally, let → ( ) be the probability that the infection signal has not been passed from node to node up to time , and that node is in the state at time , in the dynamics . With these definitions, the dynamic programming approach is formalized by the following equations for ≥ 0 :

The marginal probabilities that node is in a given state at time are then given by

These equations are only exact on trees, but they can also be applied to networks with cycles as a heuristic approach. The heuristic gives good approximations to the true marginals if the network is at least locally tree-like [17] .
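To make the message-passing recursion concrete, the sketch below implements a standard form of the SIR dynamic message-passing update on a static graph, written from the verbal definitions above. Since the displayed equations (72)-(77) are not reproduced in this text, the exact correspondence to them is an assumption, as are the names `lam` (infection probability), `mu` (recovery probability), and the restriction to deterministic initial conditions given by a set `infected0`.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]


def dmp_sir(neighbors: Dict[int, List[int]], infected0: Set[int],
            lam: float, mu: float, t_max: int):
    """Return lists p_s, p_i, p_r of per-node marginals for t = 0..t_max."""
    nodes = list(neighbors)
    ps0 = {i: 0.0 if i in infected0 else 1.0 for i in nodes}
    edges: List[Edge] = [(k, i) for i in nodes for k in neighbors[i]]  # message k -> i
    theta = {e: 1.0 for e in edges}            # no signal passed k -> i so far
    phi = {(k, i): 1.0 - ps0[k] for (k, i) in edges}  # k infectious, no signal passed
    ps_cav = {(k, i): ps0[k] for (k, i) in edges}     # k susceptible when i is "deaf"
    p_s, p_i, p_r = [dict(ps0)], [{i: 1.0 - ps0[i] for i in nodes}], [{i: 0.0 for i in nodes}]
    for _ in range(t_max):
        theta = {e: theta[e] - lam * phi[e] for e in edges}
        new_cav = {}
        for (k, i) in edges:
            prod = ps0[k]
            for m in neighbors[k]:
                if m != i:  # exclude the target node i (cavity construction)
                    prod *= theta[(m, k)]
            new_cav[(k, i)] = prod
        phi = {e: (1 - lam) * (1 - mu) * phi[e] - (new_cav[e] - ps_cav[e]) for e in edges}
        ps_cav = new_cav
        s_next = {}
        for i in nodes:
            prod = ps0[i]
            for k in neighbors[i]:
                prod *= theta[(k, i)]
            s_next[i] = prod
        r_next = {i: p_r[-1][i] + mu * p_i[-1][i] for i in nodes}
        i_next = {i: 1.0 - s_next[i] - r_next[i] for i in nodes}
        p_s.append(s_next); p_i.append(i_next); p_r.append(r_next)
    return p_s, p_i, p_r
```

On a single edge with one initially infected endpoint, the recursion reproduces the exact probability that the other endpoint is still susceptible after two steps, namely (1 - lam)(1 - lam (1 - mu)), which is a useful sanity check before applying the heuristic to graphs with cycles.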

There are several differences between the SIR model on locally tree-like networks and the DDE+HNM model (see Figure 2 (a)). First, the DDE model has additional compartments (exposed nodes, asymptomatic nodes), which motivates the introduction of several new variables. Let ( ) (resp., ( ) ) be the probability that an asymptomatic (resp., symptomatic) node infects a susceptible node. Let → ( ) ( ) (resp., → ( ) ( ) ) be the probability that the infection signal has not been passed from node to node up to time , and that node is asymptomatic (resp., symptomatic) infectious at time , in the dynamics . The second important difference is that in the DDE model, the transition times between different compartments are deterministic instead of following a geometric distribution as in the standard SIR model. While deterministic transition times sound simpler at first, it turns out that they make the DMP equations more complex, because the Markovian property that each marginal probability depends only on the previous timestep is lost if the transition times are larger than 1. Recall that the times for the transitions → and → (with their default values) are = 3 and = 14.

Let us incorporate these two differences into equations (72)-(74) to derive the DMP equations for the DDE model. Equation (78) is essentially a copy of (72). Equation (79) follows equation (73), but we incorporate the two different variants of infected (asymptomatic and symptomatic) patients with their respective infection probabilities ( ) and ( ) . Equation (80) is a new equation, which is necessary because recovery times are no longer geometric random variables; instead we need to check the probabilities of infection + timesteps earlier than the current time . Finally, equation (81) (resp., (82)) is the asymptomatic (resp., symptomatic) version of equation (74) 

We note that for early values of , equations (80)-(82) depend on → before 0 , which we initialize to be 1 (all nodes are susceptible before the first node develops the infection). The marginal probability that node is susceptible at time is still computed by equation (75) as before. Equations (76)-(77) do not apply anymore; we explain in Appendix B.4 how to take into account observations for nodes in the infectious compartments.

The third difference between the SIR model on locally tree-like networks and the DDE+HNM model is that the HNM model contains many short cycles inside the households. Short cycles can cause unwanted feedback loops in the DMP equations where, loosely speaking, nodes are treated as if they could reinfect themselves. We solve this issue by modifying the underlying graph to be locally tree-like (only for the computation of the DMP equations). Specifically, we introduce a new central household-node for each household, and we replace the cliques inside the households by a star graph centered at this new household-node. Introducing such a central household-node does of course alter the epidemic process; in particular, it makes household infections less independent and slower (all household infections need to pass through an extra node). To mitigate this issue, we assume that central household-nodes have = 1 and that they are infected with probability 1 by any node in the same household. We tested the validity of the resulting DMP equations against simulations of the epidemic progressions and we found the results to be quite accurate, in particular, more accurate than the version without the introduction of these central household-nodes.
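The graph modification described above can be sketched in a few lines. This is an illustrative sketch only: the function name, the representation of the network as a list of undirected edges, the `household_of` mapping, and the labelling of the new hub nodes as `("hh", household_id)` are all hypothetical choices, and the special infection parameters of the hubs are not modelled here.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Node = Hashable
Edge = Tuple[Node, Node]


def households_to_stars(edges: Iterable[Edge], household_of: Dict[Node, Hashable]) -> Set[Edge]:
    """Replace every household clique by a star centered at a new hub node."""
    new_edges: Set[Edge] = set()
    for u, v in edges:
        if household_of[u] == household_of[v]:
            continue  # drop intra-household edges (they form the short cycles)
        new_edges.add((u, v))  # keep edges between different households
    for node, hh in household_of.items():
        new_edges.add((("hh", hh), node))  # connect each member to its household hub
    return new_edges
```

After this transformation, any two members of a household are connected only through their hub, so the short cycles inside households disappear while the inter-household edges are left untouched.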

Note that we derived the DMP equations for the DDE+HNM model; however, since (i) the compartments are the same, (ii) the equations support temporal networks, and (iii) we have separate infection probabilities ( ) and ( ) for asymptomatic and symptomatic nodes, our equations can also be applied to the DCS+TU model after discretizing (rounding) the time observations.

Finally, we touch upon the computational complexity of computing the DMP equations. In principle, we need to update ( ) equations (for each edge) over max timesteps, where max is the maximum time during which the marginals can still change, which can be as large as ( ). However, since we are only interested in computing the likelihood of the 5 earliest observations, max is typically quite low. Moreover, since we assume to be in an early stage of the epidemic, most of the equations remain unchanged. For better computational scalability, we only compute ( ) and → ( ) for nodes , that have → ( ) > 0.01, i.e., we only update nodes that are at least somewhat likely to have received the infection. Otherwise, we set → ( ) = → ( − 1), → ( ) = → ( − 1), and in the implementation we can perform these assignments implicitly using appropriate data structures. With these adjustments, the time-complexity of the algorithm becomes independent of , but remains dependent on the network parameters, the epidemic parameters and the number of sensors in a non-trivial way.

In this section we explain how we implemented the feasible source identification algorithm, which was suggested as a preprocessing step for a method very similar to the DMP equations by [16]. Let us define the directed graph 2 on (node, infection_time) pairs (we use "nodes" for the nodes of the original graph and "pairs" for the nodes of 2 ), and draw an edge between two pairs ( 1 , 1 ) → ( 2 , 2 ) if 1 and 2 are in contact at 2 , and 2 is in the interval [ 1 + , 1 + + ]. Observe that in the DDE model there is an edge ( 1 , 1 ) → ( 2 , 2 ) if and only if 1 becoming infected at time 1 can infect 2 at time 2 . The definition of 2 is applicable to the DCS model as well after discretization (rounding); however, since the infection times are not deterministic anymore, not all possible infections ( 1 , 1 ) → ( 2 , 2 ) have a corresponding edge in 2 . Then, we perform a breadth-first search backwards on the directed edges of 2 , starting from each pair ( , − − ), where is a symptomatic sensor node, and is the symptom onset time of (for the DCS model, we start from integer times in the − − ± ( + ) interval to account for the randomness of the transition times). To limit the time complexity of the algorithm, we only consider the 1 earliest observations, which means that we start 1 breadth-first searches. With this construction, each pair ( , ) discovered by a breadth-first search started from ( , − − ) could have caused the infection in ; we say that ( , ) is an explanation for observation . We perform the breadth-first searches until we find 2 pairs that explain all of the 1 earliest observations. See the pseudocode in Algorithm B.1.

Proof. By construction, a source node that becomes infectious at time can cause an observation ( , ) if and only if there is a directed path from ( , ) to ( , − − ). Therefore, the breadth-first search algorithm finds all of the closest feasible sources in time. □
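The backward search on (node, infection_time) pairs can be sketched as follows. This is a simplified illustration and not Algorithm B.1 itself: instead of stopping once a prescribed number of explaining pairs is found, it explores all pairs down to a time horizon `t_min` and intersects the explanation sets of the observations. The parameter names `t_e`, `t_i` (the two durations that bound the infection interval) and `onset_delay` (the time between infection and symptom onset) are assumptions, as is the `contacts[t][v]` data layout.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Pair = Tuple[int, int]  # (node, infection_time)


def explanations(contacts: Dict[int, Dict[int, Set[int]]], start: Pair,
                 t_e: int, t_i: int, t_min: int) -> Set[Pair]:
    """All pairs from which `start` is reachable, found by a backward BFS."""
    seen, queue = {start}, deque([start])
    while queue:
        v2, t2 = queue.popleft()
        # (v1, t1) -> (v2, t2) is an edge if v1, v2 are in contact at t2
        # and t1 + t_e <= t2 <= t1 + t_e + t_i.
        for v1 in contacts.get(t2, {}).get(v2, ()):
            for t1 in range(max(t_min, t2 - t_e - t_i), t2 - t_e + 1):
                if (v1, t1) not in seen:
                    seen.add((v1, t1))
                    queue.append((v1, t1))
    return seen


def feasible_sources(contacts: Dict[int, Dict[int, Set[int]]],
                     observations: List[Pair], onset_delay: int,
                     t_e: int, t_i: int, t_min: int) -> Set[Pair]:
    """Pairs that explain every symptomatic observation (node, onset_time)."""
    sets = [explanations(contacts, (v, t - onset_delay), t_e, t_i, t_min)
            for v, t in observations]
    return set.intersection(*sets) if sets else set()
```

For instance, on a path 0-1-2 with contacts on every day, two observations at the endpoints with the same onset time are jointly explained by the middle node being infected early enough, but not by either endpoint at its own inferred infection time.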

In this section we explain how to combine Algorithm B.1 with the DMP equations derived in Appendix B.2. See the pseudocode in Algorithm B.2. We start by computing the DMP equations (78)-(82) and (72) for the 2 tuples of node and time pairs that can explain the first 1 symptomatic observations returned by Algorithm B.1. Next, our goal is to use these DMP equations to compute the likelihood of each of the 2 tuples using the 1 observations. Similarly to [22] , we make the assumption that the first 1 observations are independent, and we can compute the likelihood by multiplying their respective marginals together. For symptomatic observed nodes , we know the time of symptom onset, which we denote by ( ). Then, the marginal probability of developing symptoms exactly at time can be computed by taking the difference of ( ( ) − − − 1) and ( ( ) − − ) and multiplying the difference by (1 − ). In Algorithm B.2 we drop the multiplicative factor (1 − ) because it is present for all of the tuples, and it does not change the final order of their scores. For asymptomatic (resp., negative) observations, we only know that at the time of testing, denoted by ( ) (resp., ( )), at least a time interval of length has passed (resp., has not passed) since the time of infection. Therefore, dropping the factor similarly to the symptomatic case, we compute the marginal of asymptomatic observations as 1 − ( ( ) − ), and we compute the marginal of negative observations as ( ( )− ). Finally, the contributions of the observations are multiplied together for each of the 2 tuples returned by Algorithm B.1, and the scores approximating the likelihoods are returned.
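The scoring step described above can be sketched as a product of per-observation marginals. In this sketch, `p_s(v, t)` stands for the marginal probability that node `v` is still susceptible at time `t` under a given candidate (source, time) pair, as it would be produced by the DMP computation; the function and parameter names (`score`, `infection_to_onset`, `t_e`) and the encoding of observations as `(node, kind, time)` triples are hypothetical. As in the text, constant factors that are shared by all candidates are dropped.

```python
from typing import Callable, Iterable, Tuple

Observation = Tuple[int, str, int]  # (node, kind, time)


def score(p_s: Callable[[int, int], float], observations: Iterable[Observation],
          infection_to_onset: int, t_e: int) -> float:
    """Approximate likelihood of the observations, assuming they are independent."""
    total = 1.0
    for v, kind, t in observations:
        if kind == "symptomatic":
            # Infected exactly `infection_to_onset` steps before symptom onset.
            total *= p_s(v, t - infection_to_onset - 1) - p_s(v, t - infection_to_onset)
        elif kind == "asymptomatic":
            # Infected at least t_e steps before the test.
            total *= 1.0 - p_s(v, t - t_e)
        elif kind == "negative":
            # Not yet infected t_e steps before the test.
            total *= p_s(v, t - t_e)
        else:
            raise ValueError(f"unknown observation kind: {kind}")
    return total
```

The candidate pairs returned by the feasible source search would then be ranked by this score, with the highest-scoring pair serving as the source estimate.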

The patientzero problem with noisy observations

Threshold behaviour and final outcome of an epidemic on a random network with household structure

Optimizing and implementing contact tracing through behavioral economics. NEJM Catalyst Innovations in Care Delivery

Recurrence of distributional limits of finite planar graphs

Bidirectional contact tracing could dramatically improve COVID-19 control

Automated and partly automated contact tracing: a systematic review to inform the control of COVID-19. The Lancet Digital Health

Diffusion Source Identification on Networks with Statistical Confidence

Random trees: an interplay between combinatorics and probability

Implication of backward contact tracing in the presence of overdispersed transmission in COVID-19 outbreaks

Identifying Propagation Source in Temporal Networks Based on Label Propagation

Profile of random exponential binary trees

Fault-tolerant metric dimension of graphs

Localization of diffusion sources in complex networks with sparse observations

Source locating of spreading dynamics in temporal networks

The growth and composition of branching populations

Rumor source identification in social networks with time-varying topology

Message passing approach for general epidemic models

The effectiveness of backward contact tracing in networks

Impact of delays on effectiveness of contact tracing strategies for COVID-19: a modelling study

The power of adaptivity in source location on the path

Locating the source of diffusion in complex networks via Gaussian-based localization and deduction

Inferring the origin of an epidemic with a dynamic message-passing algorithm

Quantifying the effects of contact tracing, testing, and containment measures in the presence of infection hotspots

Identification of source of rumors in social networks with incomplete information

Profile of Random Exponential Recursive Trees

On the robustness of the metric dimension to adding a single edge

Optimizing sensors placement in complex networks for localization of hidden signal source: A review

Fast and accurate detection of spread source in large complex networks

Locating the source of interacting signal in complex networks

Contact tracing during coronavirus disease outbreak

Locating the source of diffusion in large-scale networks

Detecting Sources of Computer Viruses in Networks: Theory and Experiment

Rumors in a network: Who's the culprit?

Source detection of rumor in social network-a review

Locating the source of diffusion in complex networks by time-reversal backward spreading

A General Framework for Sensor Placement in Source Localization

Back to the source: An online approach for sensor placement and source localization

The effect of transmission variance on observer placement for source-localization

How Many Sensors to Localize the Source? The Double Metric Dimension of Random Networks

Estimating infection sources in networks using partial timestamps

Models of random regular graphs

Identifying the diffusion source in complex networks with limited observers

Network observability and localization of the source of diffusion based on a subset of nodes

Sequential observer selection for source localization

Sequential source localization on graphs: A case study of cholera outbreak

Extending the metric dimension to graphs with missing edges

Locating the contagion source in networks with partial timestamps

Algorithm B.1: Feasible source identification (reverse dissemination [16])
Input:
• The mean exposed time , the mean pre-infectious time , the mean infectious time , the std of the exposed time and the std of the pre-infectious time
• ( ) and ( ) return the minimum and maximum times when could have been exposed, based on all of its (possibly asymptomatic or negative) test results
• ( ) returns the time of symptom onset for a node that tested positive and is symptomatic
return

Algorithm B.2: Source detection via DMP
Input:
• The mean exposed time , the mean pre-infectious time , the mean infectious time
• ( ) returns the time of symptom onset for a node that tested positive and is symptomatic
• ( ) and ( ) return the time of asymptomatic and negative test results, respectively
Output: A dictionary of 2 elements, which contains a score for each ( , ) pair that explains the first 1 observations. Higher scores signify higher confidence of being the source.