key: cord-0660742-rbuvpa7y authors: Hota, Ashish R.; Godbole, Jaydeep; Bhariya, Pradhuman; Par'e, Philip E title: A Closed-Loop Framework for Inference, Prediction and Control of SIR Epidemics on Networks date: 2020-06-23 journal: nan DOI: nan sha: aee5c41d60b7aa619f8ce86dcb2a6b5b284b0a6e doc_id: 660742 cord_uid: rbuvpa7y Motivated by the ongoing pandemic COVID-19, we propose a closed-loop framework that combines inference from testing data, learning the parameters of the dynamics and optimal resource allocation for controlling the spread of the susceptible-infected-recovered (SIR) epidemic on networks. Our framework incorporates several key factors present in testing data, such as high risk individuals are more likely to undergo testing and infected individuals potentially act as asymptomatic carriers of the disease. We then present two tractable optimization problems to evaluate the trade-off between controlling the growth-rate of the epidemic and the cost of non-pharmaceutical interventions (NPIs). Our results provide compelling insights for policy-makers, including the significance of early testing and the emergence of a second wave of infections if NPIs are prematurely withdrawn. Mathematical modeling of infectious diseases that spread through the human population has a long history. A plethora of models that capture the dynamic evolution of epidemics have been proposed (see [Hethcote, 2000 , Pastor-Satorras et al., 2015 for detailed surveys). However, there are two broad classes of models: one where a person who has recovered from the disease is immune from re-infection (at least for a few months or years) and second where there is a possibility of re-infection. One of the most fundamental mathematical models pertaining to the first class of epidemics is the susceptible-infected-recovered (SIR) epidemic where individuals can be in one of three possible compartments or states: susceptible, infected or recovered. Individuals who are susceptible get potentially infected by coming into contact with infected neighbors, while infected individuals recover at a certain rate. While several other models with additional compartments (such as death, quarantine, exposed, etc) have been proposed, the SIR epidemic remains one of the most fundamental models. Early work on (SIR) epidemics focused on the homogeneous population setting, while more recent works model the interactions between individuals via an underlying network. Detailed overviews of different epidemic models over networks are given in [Barrat et al., 2008 , Draief and Massouli, 2010 , Pastor-Satorras et al., 2015 , Nowzari et al., 2016 , Mei et al., 2017 . Since there is no strong evidence of re-infection in the ongoing COVID-19 pandemic, it may be assumed to fall under the first class of epidemics [Ota, 2020] . Consequently, we focus on estimation and control of SIR epidemics on networks in this work. Specifically, we • highlight the challenges associated with computing optimal NPIs to control the SIR epidemic on networks, • identify key characteristics inherent in testing data that have not been systematically captured by the existing work on inference and prediction of epidemics, • rigorously establish the behavior of the discrete-time SIR epidemic on networks, • propose a stochastic non-linear observer model that relates testing data with the underlying epidemic states, • propose a closed-loop framework that integrates inference, learning and optimal allocation of NPIs, and • uncover compelling insights on the behavior of the SIR epidemic under different NPIs via extensive empirical evaluations. Nevertheless, the proposed estimation and control approaches can naturally be extended to a large class of models with more compartments provided that there is no possibility of re-infection. In order to provide better clarity regarding the gaps in the existing literature and how our proposed framework fills those gaps, we first introduce the continuous-time SIR epidemic dynamics on networks [Mei et al., 2017] . Let G = (V, E) be a network or graph where V is the set of nodes with |V | = n and E ⊆ V × V is the set of edges. We consider a large-population regime where each node represents a sub-population (such as a city or a county/district or a state) as opposed to a single individual. We denote the size of sub-population i as N i . We denote by β ij ∈ R ≥0 the rate at which the infection can spread through the edge (v j , v i ) ∈ E, and by γ i ∈ R >0 the rate at which an individual in sub-population i recovers from the infection. If two nodes v i and v j are not neighbors, then β ij = 0. We assume that β ii = 0 for every node v i since the individuals inside sub-population i come in contact with each other. We define V i := {j ∈ V |β ij = 0} to be the set of neighbors of node v i . In the literature, the notation a ij is often used to denote the weight or contact pattern between nodes i and j and β i is used to denote the rate at which node i is infected by its infected neighbors. The formulations are equivalent if we define β ij = β i a ij . The proportions of the sub-population at node v i that are susceptible, infected and recovered at time t are denoted by s i (t), x i (t) and r i (t), respectively. Accordingly, we have s i (t), x i (t), r i (t) ∈ [0, 1] and s i (t) + x i (t) + r i (t) = 1 for all t and v i ∈ V . The deterministic continuous-time evolution of the SIR epidemic is given bẏ Note thatṡ i (t) +ẋ i (t) +ṙ i (t) = 0. The evolution of the proportion or fraction of infected nodes can be written in a compact form aṡ where x(t) ∈ R n , B is the matrix with (i, j)-th entry β ij and diag(γ) is a diagonal matrix with diagonal entries being the entries of the vector γ. As analyzed in [Mei et al., 2017] , the trajectory of the infected population depends critically on the largest eigenvalue of the matrix diag(s(t))B − diag(γ) , denoted by λ max (t). If λ max (0) > 0, then x(t) initially shows an exponential growth. Eventually, there exists at > 0 with λ max (t) < 0, and the fraction infected monotonically decreases for t ≥t. However, the proportion of the susceptible (recovered) population is monotonically decreasing (increasing) with t. Many infectious diseases, including COVID-19, show the above characteristic and such diseases, if unchecked, can potentially infect millions of people throughout the world in the span of several weeks. In the absence of appropriate medicine and vaccines at the early stages of an outbreak (as is the case for COVID-19) non-pharmaceutical intervention (NPI) strategies must be deployed. The primary NPIs to control the spread of such epidemics include • reducing the interaction between nodes (for example, by restricting travel and imposing lockdown measures), and • deploying more resources (in terms of healthcare personnel, dedicated hospitals and medical equipment) thereby increasing the rate at which infected individuals get cured. The former corresponds to reducing the β ij parameters and the latter corresponds to increasing the curing rates (γ i 's). Both of these interventions are costly; the former with a significant economic cost. Furthermore, different nodes have different degrees of epidemic outbreak, and require different degrees of interventions. Following early works in [Wan et al., 2007 , Preciado et al., 2014 , the existing literature on resource allocation for controlling the spread of epidemics has primarily focused the SIS epidemic on networks. In contrast with the SIR epidemic, under the SIS epidemic an individual after recovery can potentially be infected again if she/he comes in contact with other infected individuals. Nevertheless, for the SIS epidemic, there is a simple spectral condition which characterizes whether the disease persists in the population or the fraction of infected nodes/populations decays quickly to zero. It was shown in [Preciado et al., 2014] that the problem of optimal resource allocation to ensure that the disease is eradicated is an instance of a geometric program (GP) [Boyd et al., 2007] which can be solved efficiently. This approach has been extended in several directions, such as to account for dynamic networks Preciado, 2017, Ogura et al., 2019] , uncertainty in the network structure [Han et al., 2015] , distributed algorithms [Ramírez-Llanos and Martínez, 2018, Mai et al., 2018] , among others. However, the above optimization problems are solved off-line and do not use feedback to adapt the solution as the epidemic spreads. 1 In contrast, investigations of optimal resource allocation to contain the spread of the SIR epidemic have been few with [Ogura and Preciado, 2016] being an exception. It is shown in [Ogura and Preciado, 2016 ] that a GP can be formulated to minimize the expected cumulative number of people who get infected when λ max (0) < 0. However, this condition is restrictive since for most epidemics the fraction of infected individuals shows an initial exponential increase before declining. Optimal resource allocation for the SIR epidemic is particularly challenging because • there is no known tractable characterization of the eventual number of recovered individuals for the SIR epidemic which can be minimized in an off-line manner, 2 and • the instantaneous growth rate λ max (t) depends on the fraction of the susceptible population at each node which is time-varying and may not be accurately known. In this paper, we investigate a potential approach to minimize λ max (t) in an online manner. As mentioned above, this requires knowledge of the proportion of the susceptible sub-population at each node of the network. As a result, the current proportions of susceptible, infected, and recovered sub-populations at different nodes and the parameters that govern the dynamics of the epidemic (such as infection and curing rates) need to be learned from testing data (number of tests carried out, number of confirmed cases and number of recoveries) that is made available every day by different jurisdictions. Earlier work has primarily focused on learning the parameters of the epidemic dynamics. In particular, [Paré et al., 2020] presents a data-driven framework for learning the parameters of networked SIS epidemic dynamics. However, literature on SIR epidemic models have mostly focused on the scalar dynamics (without any network structure). In [Chen and Qiu, 2020] , a least squares based parameter identification was carried out assuming that the proportion of infected and recovered sub-population and the daily change in the above (i.e., the state information) is proportional to the respective fractions in the testing data. An analogous assumption was made in a recent work [Casella, 2020] on COVID-19 as well. In [Osthus et al., 2017 , Bayesian Markov Chain Monte Carlo (MCMC) techniques were used for estimating the states and parameters. An overview of the above approaches is discussed in [Abadie et al., 2020] . However, the above models do not consider the networked SIR epidemic dynamics, and do not capture the following characteristics inherent in testing data. • Real testing strategy is not uniform. Due to limited testing capacity, high-risk (symptomatic with travel history) individuals are more likely to get tested [Cohen and Kupferschmidt, 2020] . • For some diseases, such as COVID-19, individuals often show symptoms a few days after becoming infected (while they continue to infect others despite being asymptomatic). In light of the above research gaps, we propose a closed-loop framework that integrates inferring the state of the epidemic from testing data, learning the parameters of the epidemic dynamics and optimal resource allocation for the SIR epidemic on networks. The schematic of the proposed scheme is shown in Fig. 1 . In Section 2, we first introduce a discrete-time counterpart of the SIR epidemic dynamics stated in (1), and provide assumptions required for the model to be well defined. We then derive several properties of the state trajectories under the discrete-time dynamics, including expanding the idea of the reproduction number 3 to the networked SIR model and showing at least linear convergence to equilibria where no one is infected after some time. As discussed above, one of the key requirements towards controlling the evolution of the infected population is to infer the current fraction of susceptible population from testing data. We denote the testing data at node i for a given time k as Ω i (k) := (z i (k), c i (k), d i (k)), where z i (k) denotes Figure 1 : Schematic of the proposed framework. Here θ(k) = (s(k), x(k), r(k)) denotes the epidemic state at time k, Ω(k) denotes the testing data, θ(k) denotes the inferred epidemic states and τ denotes the delay factor as explained in Section 1.3. the number of tests carried out, c i (k) denotes the number of confirmed cases and d i (k) denotes the number of recoveries. While prior work has assumed that c i (k) z i (k) is proportional to the (change in) fraction of infected population (x i (k)) Qiu, 2020, Casella, 2020] , we incorporate the fact that individuals at a higher risk of being infected are more likely to undergo testing than healthy individuals. Furthermore, in order to capture the fact that for some diseases (such as COVID-19) an infected individual can remain asymptomatic for several days while being contagious, we assume that testing data is reflective of the epidemic state at a past time rather than the current time. We denote this lag or delay by τ . In practice, τ is different for each individual, and could be modeled as a random variable. However, as a first step, we assume τ to be deterministic and homogeneous across the population. In Section 3, we propose a stochastic nonlinear observer model that relates the testing data with the underlying states by incorporating the two factors discussed above. By leveraging the above model, we discuss how to infer the epidemic states The delay factor τ necessitates estimating the current state of the epidemic, θ(k) := ( s(k), x(k), r(k)), from the inferred values θ(k − τ ). If the parameters of the epidemic dynamics (β ij (κ)'s and γ i (κ)'s) are known for κ ∈ [k − τ, k], then the current state can be predicted from the dynamics. In the initial stages (before any interventions are deployed), these parameters may be learnt from the available data. Eventually, once the optimal contact and curing rates are deployed, they may be used for predicting the current state. We present a least-square formulation to identify the virus spread parameters in Section 4 in order to achieve this goal. In Section 5, we formulate two complementary GPs to evaluate the trade-off between minimizing the cost of NPIs and the growth-rate (analogous expression of λ max (t) for the discrete-time model). The optimal solutions are then deployed, and at the next iteration, new testing results are used to update the parameters and the predicted states, and the process repeats over time in a manner similar to that of receding horizon control. 4 Finally, in Section 6, we illustrate the performance of the proposed inference, prediction and NPI allocation in controlling the spread of the SIR epidemic on networks. Our results highlight the importance of early testing for accurate estimation and control performance, and the risk of a second wave of infection if NPIs are prematurely withdrawn. We now present the discrete-time dynamics, inference, parameter identification and optimization formulations in the following four sections, respectively. The continuous-time dynamics in (1) is often viewed as a mean-field approximation of a continuoustime Markov chain model of the evolution of the disease. However, since the (testing) data is often available in discrete points of time, we develop the proposed inference, learning and control tasks for a discrete-time version of the above dynamics motivated by a similar approach proposed in [Paré et al., 2020] for SIS epidemics. We now state the discrete-time SIR epidemic dynamics obtained via Euler discretization of (1). For a small enough sampling time h > 0, the deterministic discrete-time evolution of the SIR epidemic is approximately given by The initial conditions need to be specified such that In vector form the model becomes where 1 n is the vector of dimension n with all entries equal to 1. Note that we have defined A k := I n + hdiag(s k )B − hdiag(γ). We now assume the following on the parameters such that the dynamics is well behaved. Assumption 1. For all i ∈ [n], we have 0 < hγ i ≤ 1 and h n j=1 β ij < 1. Furthermore, the matrix B is irreducible. Note that Assumption 1 is satisfied when the sampling parameter h is chosen to be sufficiently small. If h is not sufficiently small (i.e., sampling is infrequent), then it is possible for the states to become negative or exceed 1, both of which are incompatible with their physical interpretation. The assumption also implies that A k is an irreducible non-negative matrix. Therefore, by the Perron-Frobenius Theorem for irreducible non-negative matrices [Varga, 2000, Theorem 2 .7 and Lemma 2.4], A k has a positive real eigenvalue equal to its spectral radius, which, we denote by λ A k max . We have the following result on the behavior of the discrete-time dynamics under the above assumption. These are similar to the behavior of the continuous-time dynamics [Mei et al., 2017] , but to the best of our knowledge, have not been formally proven in the literature. Our result also strengthens some of the observations in [Mei et al., 2017] . max is monotonically decreasing as a function of k, 5) there existsk such that λ A k max < 1 for all k ≥k, and 6) there existsk, such that x i (k) converges linearly to 0 for all k ≥k, i ∈ [n]. Further, if lim k→∞ s i (k) = 0 and hγ i = 1 for all i ∈ [n], there existsk, such that x i (k) converges superlinearly to 0 for all k ≥k. Proof. See Appendix A. Theorem 1 shows that the model in (4) is well-defined, the susceptible proportions and the growth rate decrease monotonically over time, the growth rate will eventually be less than 1, and the infected proportion of the population will go to 0, in at least linear time for large enough k. Remark 1. The largest eigenvalue λ A k max is a generalization of the reproduction number to the networked epidemic setting, that is, if λ A k max < 1 the virus quickly dies out. Our result rigorously proves the observation that for epidemics that follow an SIR-type dynamic, such as COVID-19, the reproduction number eventually falls below 1. We use these results as the framework for the control techniques presented in Section 5, which also require inferring the states from the testing data (Section 3) and estimating the spread parameters for forecasting the states (Section 4). As discussed earlier, part of the challenge in controlling the spread of infectious diseases such as COVID-19 is that the prevalence (i.e., the underlying state) of the disease in a given population is highly uncertain. Individuals need to be tested in order to determine if they are infected by the disease or pathogen under consideration. At early stages of the epidemic, many countries and regions do not have enough capacity to test a large number of people since their testing kits are limited. As a result, testing is conducted on individuals who show symptoms (such as fever and shortness of breath associated with COVID-19). However, drawing inference about the underlying spread of the epidemic from such testing data is not straightforward since (i) a large fraction of infected and contagious individuals never show any symptoms, and (ii) similar symptoms are also exhibited by patients who suffer from other related illnesses [Hu et al., 2020] . Furthermore, testing data on a given day reveals individuals who became infected a few days earlier (as opposed to information about the new infections on that day). In this section, we focus on inferring the epidemic state (s(k), x(k), r(k)) from testing data. Specifically, we develop a nonlinear observer model that relates the observations with the underlying epidemic states following a Bayesian approach. We then leverage the proposed model for state inference. We denote the inferred states with the symbol · . Recall that the testing data at time k at node v i is denoted by Ω i (k) and consists of the number of tests carried out (z i (k)), the number of confirmed (c i (k)) and removed (including both recoveries and deaths) (d i (k)) cases. We denote the cumulative number of confirmed and removed cases by C i (k) := k l=0 c i (l) and D i (k) := k l=0 d i (l), respectively. Accordingly, the number of known or active cases is given by A i (k) := C i (k) − D i (k) and the daily change in the number of known/active cases is given by a i (k) = c i (k) − d i (k). Note from the above discussion that testing enables us to observe the infection states of tested and confirmed individuals. In particular, the known active cases are analogous to the proportion of infected individuals. As a result, it is reasonable to treat the proportion of new infections at time k (i.e., c i (k) z i (k) ) as representative of the fraction of new infections at node v i . However, as discussed earlier, the confirmed cases at time k are often found to have caught the infection several days prior to being tested and with the delay denoted by τ ≥ 1. Therefore, we assume c i (k) z i (k) to be representative of the decrease in the proportion of susceptible individuals at time k − τ , i.e., the quantity −∆s i (k − τ ) = ∆x i (k − τ ) + ∆r i (k − τ ). In a departure from prior work which assumes c i (k) z i (k) to be proportional to x i (k) and/or −∆s i (k), we propose a Bayesian framework to model the fact that testing strategies are not uniform, and formally relate c i (k) z i (k) to ∆s i (k − τ ). We assume that a proportion h i (k) of the population at node v i belongs to a high risk category while l i (k) := 1 − h i (k) comes under a low risk category. Individuals may be deemed high risk if they exhibit symptoms associated with the disease, are known contacts of confirmed cases, have a travel history in affected regions, or any combinations thereof. Now, let H i (k) be a random variable with H i (k) = 1 (resp. H i (k) = 0) if a randomly chosen individual belongs to the high risk (resp. low risk) category. Accordingly, P(H i (k) = 1) = h i (k). We also define a {0, 1}-valued random variable DX i (k) with DX i (k) = 1 (resp. DX i (k) = 0) if a randomly chosen individual became infected at time k. We now introduce the following notation to denote certain conditional probabilities of interest. In particular, we define where p hx,i (k), p hh,i (k) ∈ [0, 1]. The parameter p hx,i (k) captures the proportion of high risk individuals (i.e., those who show symptoms) who became infected at k − τ and depends on the characteristics of the epidemic and the population (e.g., age, level of immunity against the disease, prevalence of comorbidity) at node v i . The parameter p hh,i (k) captures the proportion of healthy individuals who belong to the high risk category (i.e., they show similar symptoms or have travel history), but did not become infected τ time steps earlier. With the above notation in place, we now apply Bayes' law to compute the probability of a randomly chosen high risk individual being infected τ time steps earlier. Specifically, we compute Similarly, the probability of a randomly chosen low risk individual being infected τ time steps earlier is Recall that −∆s i (k) = s i (k − 1) − s i (k) = hs i (k − 1) n j=1 β ij x j (k − 1) ≥ 0. Furthermore, both p xh,i (k) and p xl,i (k) are monotonically increasing in −∆s i (k). We now relate the number of tests and confirmed positive cases with the underlying states. At time k, the authorities at node v i decide to carry out z h,i (k) ∈ [0, z i (k)] number of tests on high risk individuals and z i (k) − z h,i (k) number tests on low risk individuals. Assuming that the testing is accurate (with false positive and false negative rates being 0), 5 the confirmed cases among those tested is the sum of the number of confirmed cases among tested high risk individuals and among tested low risk individuals. Accordingly, we model where Bin(n, p) denotes the binomial distribution with parameters n and p. In control-theoretic terms, we model c i (k) as the output of a time-varying (due to its dependance on parameters p hx,i (k) and p hh,i (k)) stochastic nonlinear observer given by where ξ captures the randomness in the observation. We now relate the observed number of recoveries (d i (k)) with the underlying states in an analogous manner. Recall that under the SIR epidemic dynamics, the change in the proportion of recovered individuals, ∆r i (k) = r i (k) − r i (k − 1) = γ i x i (k − 1). In the observed data, the quantity analogous to the change in the fraction of recovered individuals is d i (k), which denotes the number of known removed cases among the known active cases A i (k − 1). Accordingly, we assume In other words, each known active case recovers with probability γ i . When the number of active cases is large, d i (k) is approximately equal to γ i A i (k − 1). The above analysis formally relates the observed quantities with the underlying epidemic states. If the parameters p hx,i (k) and p hh,i (k) and the number of confirmed cases among the tested high and low risk populations are known, then maximum likelihood inference of −∆s i (k−τ ) can be computed without much difficulty as both p xh,i (k) and p xl,i (k) are monotonically increasing in −∆s i (k − τ ). Nevertheless, in the rest of the section as well as in our numerical results, we focus on a practically motivated special case where only high risk individuals undergo testing, i.e., z h,i (k) = z i (k). Note that this has been the practice in many jurisdictions in the world during the ongoing COVID-19 pandemic [Cohen and Kupferschmidt, 2020] . Following (5), we have where α i (k) := p hx,i (k) p hh,i (k) . In other words, we only need to know ratio of the probability of an infected individual being high risk (or undergoing testing) and the probability of a healthy individual being high risk (or undergoing testing). The parameter α i (k) potentially depends on the number of tests carried out in a given day. Intuition suggests that as z i (k) increases, α i (k) should converge to 1, while for small values of z i (k), it should be very large. The parametric function α i (k) = 1+ F z i (k) with F being a large constant is a potential candidate that satisfies the above requirements. Another candidate is α i (k) = N i z i (k) where, recall that, N i is the size of sub-population i. Note that if α i (k) = 1, i.e., a healthy person is equally likely to undergo testing compared to an infected person, then we have − ∆s i (k − τ ) = c i (k) z i (k) . Given the testing data and assuming that we know α i (k), we now discuss how to infer the underlying states. In particular, for every node i ∈ [n], suppose that the daily testing data Ω i (k) is available to us over the time interval k ∈ [T 1 + τ, T 2 + τ ]. Accordingly, we define the inferred fraction of new infections at node v i as However, the above information is not enough to uniquely infer the states. Therefore, we first assume that for k < T 1 , x i (k) = r i (k) = 0 and s i (k) = 1. In addition, let ∆r i (k) = 0 for k ∈ [T 1 , T 1 + τ − 1]. Consequently, from (11), we have ∆x i (k) = − ∆s i (k) for k ∈ [T 1 , T 1 + τ − 1]. With the above information, we recursively define for k ∈ [T 1 , T 1 + τ − 1]. Now following (9), we infer the change in the fraction of recovered individuals as where x i (k − 1) is the inferred fraction of infected individuals that is computed recursively following (11) and (12). 6 In the pathological case with A i (k) = 0, we assume ∆r i (k) = 0. To summarize, employing (11), (12), and (13) together with the appropriate initialization, we can infer the underlying epidemic states if testing data is made available over an interval. Note that the inference can be inaccurate when the aforementioned assumptions on the initial conditions are not met. This is indeed the case when testing is carried out after the disease has already spread for a while (i.e., when T 1 is large). However, as illustrated in Section 6, simulations show that the inferred proportion of infected sub-populations eventually converge to the true values. Remark 2. The inference problem described above assume knowledge of the parameters α i (z)'s. If these parameters are not known, then one potential approach is to infer the states and solve the least square problem for different values of α i (z)'s; the one with the smaller cost is likely to be closer to the true underlying α i (z). In the following section, we describe how the inferred data can be used to learn the model parameters of the SIR epidemic and predict the current state of the epidemic. the current proportion of susceptible individuals using the inferred states up to k − τ . Prediction, or forecasting the future trajectory, of the epidemic is also of independent interest (beyond the optimal resource allocation problem). If the parameters of the epidemic dynamics are known, then those can be used to predict the current state of the epidemic from the inferred states up to k − τ . However, at the early stages of the epidemic, there is substantial uncertainty regarding the values of β ij and γ i parameters. In this section, we extend the least squares estimation technique proposed in [Chen and Qiu, 2020] for scalar SIR epidemics to learn the unknown β ij and γ i parameters in the case of networked SIR epidemics. Specifically, suppose at every node i ∈ [n], the daily testing data Ω i (k) is available to us over the time interval k ∈ [T 1 + τ, T 2 + τ ]. Following (11)-(13), we can infer the underlying states over the interval [T 1 , T 2 ] from testing data. Subsequently, we can learn the parameters {{γ i } i∈ [n] , {β ij } (i,j)∈E } ∈ R n+|E| of the epidemic dynamics by solving the least-square problem: If additional information such as bounds or the sparsity-pattern (specified by the network structure) of β ij parameters are available, those may be incorporated as constraints in (14). We denote the optimal solutions of (14) as γ i and β ij . The above formulation assumes that the parameters {{γ i } i∈[n] , {β ij } (i,j)∈E } ∈ R n+|E| do not significantly change over the interval [T 1 , T 2 ]. If the parameters change due to NPIs by the authorities, then the above formulation can be suitably modified to learn the epidemic parameters both before and after the imposition of NPIs by considering suitable sub-intervals during which different NPIs were in place. Other approaches such as [Chen and Qiu, 2020] define β ij 's to be parametric functions of NPIs. These settings can also be handled by suitably modifying the above formulation. Remark 3. The formulation in (14) admits a separable structure and as a result, each node v i can learn their respective β ij and γ i values using local information from their own testing data and obtaining the estimates x j (k) from their neighbors. The learned parameters of the epidemic dynamics from (14), γ i and β ij , can now be used to predict the current and future state of the epidemic using the inferred states as the initial conditions. In order to avoid introducing additional notation, we use · to also denote the predicted states. Specifically, for k ≥ T 2 , we compute the predicted states as where B is the matrix with (i, j)-th entry β ij and diag( γ) is a diagonal matrix with diagonal entries equal to γ i . In the following section, we formulate optimization problems to compute optimal NPIs to control the spread of the epidemic using the predicted state from (15). Recall from Theorem 1 that the growth rate of the infected fraction of the population at time k is given by λ A k max (the largest eigenvalue of the matrix I n + hdiag(s(k))B − hdiag(γ)). Since A k is a function of s(k), in the absence of perfect knowledge of s(k), we use the inferred/predicted value of s(k), denoted by s(k) (given in (15a)). The goal of the social planner is to control the spread of the epidemic by choosing the curing rates, i.e., the γ i parameters (which, for instance, correspond to deploying a larger number of healthcare personnel and/or medical equipment) and the contact rates, i.e., the β ij parameters (which correspond to imposing social distancing or lock-down measures). We present two geometric programming formulations to aid the social planner's decision-making with regards to NPIs in a rigorous manner. First, we consider the problem of minimizing the instantaneous growth rate λ A k max by optimally allocating the NPIs subject to budget constraints. Following analogous arguments in [Han et al., 2015 , Preciado et al., 2014 , this problem is equivalent to a GP given by: where the new variableγ i := 1 − hγ i is used so that the constraints in (16b) remain posynomials, the constraints in (16e) correspond to bounds on the contact and curing rates, the functions f and g are cost functions of NPIs, the constraints in (16c) and (16d) are budget constraints on NPIs with C 1 and C 2 being the budgets for contact rates and curing rates, respectively. The optimal value λ * k corresponds to the largest eigenvalue of A k with parameters γ * k and β * k ; the latter denote the optimal curing and contact rates, respectively. The optimal w * k is the eigenvector corresponding to λ * k . Furthermore, λ * k is the smallest possible value that can be achieved given the budget constraints. The above problem is solved repeatedly in an online manner. At time step k + 1, we again obtain s(k + 1) via feedback, and solve (16) with A k replaced by A k+1 . A problem complementary to (16) is to minimize the cost of NPIs subject to the constraint that the growth rate is bounded by λ k . This problem is given by: The above formulation corresponds to imposing NPIs in a cost-optimal manner while ensuring that the reproduction number stays below a certain threshold. Remark 4. Note that in Section 4 we estimated the virus spread parameters while in this section we allow a social planner to control the spread of the virus by setting these parameters. There are two ways to interpret this discrepancy: 1) before the social planner is able or decides to exert efforts to mitigate the spread of the virus, they must estimate the state of the system from testing data which, given the delay τ , requires estimating the spread parameters, and 2) even after the social planner is able and decides to implement preventative measures, the population may not follow the restrictions or they may be less effective (or more extreme) than needed; therefore estimating the actual parameters is necessary. We now evaluate the performance of the proposed framework in the following section via simulations. In this section, we evaluate the proposed approach on a relatively small network with 5 nodes with topology given in Fig. 2a . The nodes are roughly modeled after five countries in continental Europe: France (FR), Germany (DE), Italy (IT), Austria (AT) and Switzerland (CH). Two nodes are neighbors if they share a border. We choose the initially infected proportion to be 0.02 at node IT and 0 elsewhere. The contact and curing rates are chosen such that Assumption 1 is satisfied. The evolution of the infected sub-population at all the nodes is shown in Figure 2b . We now generate synthetic test data using the stochastic nonlinear observer proposed in Section 3. We assume that only the high risk (e.g., symptomatic) population undergoes testing, and daily number of tests z i (k) = 5000 is constant. We generate testing data starting from T 1 = 15 to T 2 = 200 with delay factor τ = 5 by sampling the binomial distribution as shown in (7) and (9). The daily and cumulative numbers of confirmed infections and recoveries as the daily active cases at node FR are shown in Figure 3 and Figure 4 for α FR (k) = 10 and α FR (k) = 1000, respectively. Recall that cFR(k) zFR(k) captures the proportion of new infections suitably adjusted by the risk factor α FR (k). Recall that α FR (k) is the ratio of the probability of an infected individuals being a high risk and the probability of a healthy individuals being a high risk. As a result, when we only test high risk individuals, the proportion of confirmed cases cFR(k) zFR(k) is much larger when α FR (k) is large, and vice versa. In addition, for k ≥ 180, the proportion of infected population begins to decline as shown in Figure 2b which is also reflected in the active cases in testing data. In Figure 5 we compare the true state trajectory with the inferred state trajectory at node IT for different parameter configurations described in the legend. The figures on the top row have T 1 = 25 (i.e., testing is relatively delayed) while all figures on the bottom row have T 1 = 10. The figures show that in the absence of early testing, we lack critical information about the initial epidemic states, and consequently, the inferred values deviate for quite some time from the true values. While the inferred infected proportion is eventually indistinguishable from the true proportion, the inferred recovered proportion remains below the true recovered proportion as the initial information was not available, and hence was not incorporated into the inference. The middle and right plots on the top row of Figure 5 differ in the value of α. The figures show that as long as α is correctly known and used, the inferred states closely track the true states. All the figures compare the inferred states and the delayed true states. Next the impact of the parameter estimation and state prediction will be highlighted in the context of optimal NPI computation. In order to isolate the performance of the online optimization approaches described in Section 5, we first assume that s(k) is exactly known, and solve the problems stated in (16) and (17). Following [Preciado et al., 2014] , we define the cost functions for NPIs as recall thatγ i = 1 − hγ i . We also set h = 0.1. Note that f ij is monotonically decreasing (i.e., it is costly reduce contact rates) and for β ij ∈ [β l,ij , β u,ij ], f ij (β ij ) ∈ [0, 1]. Thus, the range of the cost function is normalized to the interval [0, 1]. The function g i also has similar properties. The upper and lower limits on β ii and β ij are chosen differently to reflect the fact it is easier to reduce contact between individuals from different sub-populations compared to individuals within a sub-population. We first report the results obtained by solving (16) in an online manner for different budget combinations. We assume that the initial fraction of infected nodes is 10 −2 in DE, and 0 for all other nodes. Fig. 6 shows the evolution of the fraction infected and fraction recovered for node DE, and the instantaneous growth rate λ * k at the optimal NPIs. The figure shows that higher budgets lead to "flattening" of the curve of the infected population. The peak of the infected sub-population is smaller and occurs much later compared to the baseline setting without NPIs. Furthermore, the cumulative number of people who become infected shows significant reduction depending on the budget. When the budget is sufficiently high, not many people become infected in the first place. Consequently, the proportion of susceptible individuals remains high, which (perhaps counter-intuitively) results in a larger value of λ * k compared to the case with smaller budgets for NPIs. Nevertheless, the proportion of infected individuals decreases to 0. The evolution of the infected proportion at node FR and the optimal cost obtained by solving (17) are shown in Fig. 7 . We constrain λ A k max to be 1.02 for 0 ≤ k ≤ 100, followed by 0.95 for k > 100 until the withdrawal of NPIs at k = 200, 300 and 500 corresponding to early, intermediate and late withdrawals. When the constraint is 0.95, meeting the condition in Theorem 1 for linear convergence to zero, a significantly larger cost is borne by the social planner. Fig. 7 also shows that early and intermediate withdrawals lead to an exponential increase in the infected proportion, which occurs because, due to the initial interventions, the susceptible proportion is high and as a result, λ A k max is large, in the absence of NPIs. Therefore, the social planner incurs a heavy cost for maintaining the growth rate below 1. Unless the NPIs are maintained for a significantly long period to drive down the infected fraction to a negligible amount, there remains a serious risk of a subsequent flare-up. We are currently investigating the performance of NPIs when the true proportion of the susceptible sub-population is not known, but rather the inferred state information is used to solve the above optimization problems. Furthermore, inference and parameter estimation using the real testing data for the countries mentioned above are being carried out. Results pertaining to both the above factors will be added in a follow up version of this working paper. In this paper, we introduce a closed-loop framework to estimate and control the spread of the SIR epidemic on networks. We first rigorously establish the behavior of the discrete-time SIR epidemic dynamics; specifically that the dynamics is well-defined, the susceptible proportions and the growth rate decrease monotonically over time, the growth rate eventually falls below 1, and after some point the infected proportions of the population converge at least linearly to zero. We also generalize the reproduction number to the network-dependent setting. Furthermore, we incorporate several characteristics of real-world testing data in the state estimation task and examine the impacts of allocating NPIs (such as reducing contact rates and augmenting healthcare equipment and personnel) by solving suitable GPs in an online manner. Preliminary results reported here provide compelling insights on the behavior of the epidemic dynamics under NPIs. For instance, we show that (i) ignoring the fact that only individuals at a higher risk of being infected are being tested can have serious impact on inferring the true states and consequently on the optimal deployment of NPIs, and (ii) premature withdrawal of NPIs could lead to emergence of a second wave of infections. Nevertheless, this work presents a first attempt at using a closed-loop approach that integrates testing, through forecasting, all the way to control, as opposed to looking at control and forecasting separately. We now summarize several interesting open problems for the community to explore moving forward. • As mentioned before, we considered the networked SIR model given its strength as a fundamental and general model. For future work, we would like to expand these ideas to a richer class of epidemic models with potentially more compartments (such as death, quarantine, exposed, etc) such as the one proposed in [Giordano et al., 2020] . • Similarly, the classical SIR and SEIR models do not fully capture the fact that many individuals that become infected with COVID-19 never show any symptoms yet remain infectious. Therefore, a richer class of models with possibly more compartments and/or non-Markovian transitions should be developed that may better capture the behavior of the COVID-19 pandemic and other possible outbreaks in the future. • For the estimation part, in Section 4, we have focused on the least-square formulation due to its simplicity and universality. Other approaches such as Bayesian Markov Chain Monte Carlo [Osthus et al., 2017] may also be used in conjunction with the observation model given in (7) and (9). • Our numerical evaluations have relied on synthetic data. As was illustrated by the simulations, it is challenging to infer the states when the testing approach is not known and testing does not start at the beginning of the outbreak. Ongoing work is focused on inference and learning from real-data pertaining to COVID-19 pandemic using the proposed framework and providing insights into the NPIs imposed by various jurisdictions. • Given that the state inference is inaccurate when the deployment of testing is delayed, the impact of this inaccuracy on the optimal NPI techniques needs to be quantified explicitly in order to understand the importance of early testing. • Further, if we impose NPIs too soon with the hope of reducing the growth of infections, we may not have rich enough data to learn the epidemic parameters, which will hinder our forecasting abilities. Without understanding the spread of the disease, it is difficult to know when to withdraw NPIs. Therefore, studying the trade-off between exploration and exploitation is essential. • The success of NPIs depend on the effectiveness with which they are enforced. While the social planner computes the optimal contact rates leading to policy guidelines, individuals may or may not follow them based on their own perceived risk. Therefore, game-theoretic models of human decision-making 7 need to be integrated into the proposed closed-loop framework for improved prediction and control of the epidemic. Overall, the proposed framework establishes the mathematical foundations for computing optimal NPIs, highlights a number of interesting future research problems and will be a valuable tool for policy-makers. 3) Since the rate of change of s(k), −hdiag(s(k))Bx(k), is non-positive for all k ≥ 0 and s(k) is lower bounded by zero, by 1), we conclude that lim k→∞ s(k) exists. Therefore lim k→∞ −hdiag(s(k))Bx(k) = 0 n , where 0 n is the vector of dimension n with all entries equal to 0. Therefore, lim k→∞ x(k+1)−x(k) = lim k→∞ −hγx(k). Thus, by the assumption that hγ i > 0 for all i ∈ [n], lim k→∞ x i (k) = 0 for i ∈ [n]. 4) Recall that by Assumption 1, A k is an irreducible non-negative matrix and thus by the Perron-Frobenius Theorem for irreducible non-negative matrices [Varga, 2000, Theorem 2.7 and Lemma 2.4] , λ A k max = ρ(A k ), where ρ(·) indicates the spectral radius. By [Varga, 2000, Theorem 2 .7], ρ(A k ) increases when any entry of A k increases. Therefore, by 2) and since A k is defined as I n + hdiag(s k )B − hdiag(γ), we have that which contradicts that lim k→∞ x(k) = 0 n . Therefore, there exists ak such that λ A k max < 1 for all k ≥k. 6) Since, by 5), there exists ak such that λ A k max < 1 for all k ≥k, and we know that λ A k max = ρ(A k ) ≥ 0 by Assumption 1, we have lim k→∞ x(k + 1) x Therefore, for k ≥k, x(k) converges linearly to 0 n . Further, if lim k→∞ s i (k) = 0 and hγ i = 1 for all i ∈ [n], there existsk, such that λ A k max = 0. Therefore, by (22), x i (k) converges superlinearly to 0 for all k ≥k. Epidemic modeling and estimation, 2020. Working Paper, MIT Dynamical processes on complex networks Can the COVID-19 epidemic be managed on the basis of daily data? Scenario analysis of non-pharmaceutical interventions on global COVID-19 transmissions Countries test tactics in war against COVID-19 Epidemics and rumours in complex networks Optimal patching in clustered malware epidemics Modelling the COVID-19 epidemic and implementation of population-wide interventions in italy Data-driven network resource allocation for controlling spreading processes The mathematics of infectious diseases Game-theoretic vaccination against networked SIS epidemics and impacts of human decision-making Clinical characteristics of 24 asymptomatic infections with COVID-19 screened among close contacts in nanjing, china Distributed algorithm for suppressing epidemic spread in networks On the dynamics of deterministic epidemic propagation over networks A note on the derivation of epidemic final sizes Analysis and control of epidemics: A survey of spreading processes on complex networks Efficient containment of exact SIR Markovian processes on networks Optimal containment of epidemics in temporal and adaptive networks Optimal containment of epidemics over temporal activitydriven networks Forecasting seasonal influenza with a state-space SIR model Will we see protection or reinfection in COVID-19? Analysis, estimation, and validation of discrete-time epidemic processes Epidemic processes in complex networks Optimal resource allocation for network protection against spreading processes Distributed discrete-time optimization algorithms with applications to resource allocation in epidemics control An epidemiological forecast model and software assessing interventions on COVID-19 epidemic in china Matrix Iterative Analysis Network design problems for controlling virus spread Designing spatially heterogeneous strategies for control of virus spread Robust economic model predictive control of continuoustime epidemic processes Stability analysis and optimal vaccination of an SIR epidemic model The authors thank Prof. Shreyas Sundaram (Purdue University) and Damir Vrabac (Stanford University) for helpful discussions that contributed to this work. The authors thank Sanket Kumar Singh (IIT Kharagpur) for his contributions in data analysis and visualizations. A Appendix: Proof of Theorem 1.Proof. We present the proof for each part of the theorem, starting with 1). 1) We prove this result by induction. By assumption s i (0), x i (0), r i (0) ∈ [0, 1] and s i (0)+x i (0)+ r i (0) = 1 for all i ∈ [n]. Therefore, by Assumption 1 and (3a),, by (3b) and Assumption 1,By (3c) and the non-negativity of h and γ i , and since x i (0) ≥ 0, we have r i (1) ≥ r i (0) ≥ 0. By (3b) and Assumption 1, we have r i (1) ≤ r i (0) + x i (0) ≤ 1. Adding up (3a)-(3c), gives that s i (1) + x i (1) + r i (1) = s i (0) + x i (0) + r i (0), which by assumption equals 1. Now assume for an arbitrary k, s i (k), x i (k), r i (k) ∈ [0, 1] and s i (k)+x i (k)+r i (k) = 1. Following the exact same arguments as for the base case except replacing 0 with k and 1 with k + 1, it can be shown that s i (k + 1), x i (k + 1), r i (k + 1) ∈ [0, 1] and s i (k + 1) + x i (k + 1) + r i (k + 1) = 1. Therefore, by induction, s i (k), x i (k), r i (k) ∈ [0, 1] and s i (k) + x i (k) + r i (k) = 1 for all k ≥ 0 and i ∈ [n].2) By 1) and the non-negativity of the β ij 's, it follows that h − s i (k) n j=1 β ij x j (k) ≤ 0 for all k ≥ 0. Therefore, from (3a), we have s i (k + 1) ≤ s i (k). A k+1 max . 5) There are two possible types of equilibria for the SIR model: i) lim k→∞ s(k) = 0 n , or ii) lim k→∞ s(k) = s * = 0 n . We explore the two cases separately.i) If lim k→∞ s(k) = 0 n , then the rate of change of x(k) converges to −hγx(k). Therefore, by the definition of λ A k max , there exists ak such that λ A k max < 1 for all k ≥k. ii) If lim k→∞ s(k) = s * = 0 n , then, by 3), for any (s(0), x(0), r(0)) the system dynamics converge to some equilibria of the form (s * , 0 n , 1 n − s * ), for some nonzero s * . DefineBy 2) and 1), respectively, we know that s (k) ≥ 0 and x (k) ≥ 0 for all k ≥ 0. Furthermore, we know that s (k + 1) ≤ s (k) for all k ≥ 0, lim k→∞ s (k) = 0 n , and lim k→∞ x (k) = 0 n . Linearizing the dynamics of s (k) and x (k) around (s * , 0 n ) givesLet λ A * max be the maximum eigenvalue of I n +hdiag(s * )B−hdiag(γ) with corresponding normalized left eigenvector w * , that is,If λ A * max > 1, then the system in (20) is unstable. Therefore, by Lyapunov's Indirect Method, lim k→∞ ( s (k), x (k)) = (s * , 0 n ), which is a contradiction. Now consider the case where λ A * max = 1. Left multiplying the equation for x(k + 1) in (4b) by w * and using (19) and (21) gives w *x (k + 1) = w * [I n + hdiag( s (k) + s * )B − hdiag(γ)] x (k) = λ A * max w * x (k) + w * hdiag( s (k))B x (k) = w * x (k) + w * hdiag( s (k))B x (k),where the last equality holds since λ A * max = 1. Thus, w * ( x (k + 1) − x (k)) = w * hdiag( s (k))B x (k) ≥ 0,