key: cord-0218879-mosiakop
authors: Cianfanelli, Leonardo; Parise, Francesca; Acemoglu, Daron; Como, Giacomo; Ozdaglar, Asuman
title: Lockdown interventions in SIR model: Is the reproduction number the right control variable?
date: 2021-12-13
journal: nan
DOI: nan
sha: e1189998dcfd31e020fdb23aeab5045dd86699bb
doc_id: 218879
cord_uid: mosiakop

The recent COVID-19 pandemic highlighted the need for non-pharmaceutical interventions in the first stages of a pandemic. Among these, lockdown policies proved unavoidable yet extremely costly from an economic perspective. To better understand the tradeoffs between economic and epidemic costs of lockdown interventions, we here focus on a simple SIR epidemic model and study lockdowns as solutions to an optimal control problem. We first show numerically that the optimal lockdown policy exhibits a phase transition from suppression to mitigation as the time horizon grows, i.e., if the horizon is short the optimal strategy is to impose a severe lockdown to avoid diffusion of the infection, whereas if the horizon is long the optimal control steers the system to herd immunity to reduce economic loss. We then consider two alternative policies, motivated by government responses to the COVID-19 pandemic, where lockdown levels are selected to either stabilize the reproduction number (i.e., "flatten the curve") or the fraction of infected (i.e., containing the number of hospitalizations). We compute analytically the performance of these two feedback policies and compare them to the optimal control. Interestingly, we show that in the limit of infinite horizon stabilizing the number of infected is preferable to controlling the reproduction number, and in fact yields close to optimal performance.

The COVID-19 pandemic has revealed the need for non-pharmaceutical interventions such as lockdowns to mitigate the spread of novel diseases, when cures and vaccines are not available.
Unfortunately, while very effective, such interventions are very costly from an economic perspective. A wide variety of strategies were adopted all over the world, implementing different tradeoffs between economic and epidemic cost. In general, these epidemic control strategies fall into two main categories: suppression strategies, which "aim to reverse epidemic growth, reducing case numbers to low levels and maintaining that situation indefinitely", and mitigation strategies, which "focus on slowing but not necessarily stopping epidemic spread, reducing peak healthcare demand" while minimizing the negative effects on economic activities [1]. We here aim to better understand the trade-off between the economic and epidemic costs of lockdown strategies within a standard susceptible-infected-recovered (SIR) model [2]. We study the optimal lockdown from an optimal control perspective as first analyzed in [3], with an objective that takes into account the epidemic cost associated with deaths and the economic cost due to lockdowns, as suggested in [4], [5]. The cost takes into account hospital congestion, which appeared to be a key feature of the recent pandemic, by assuming that the lethality due to the disease is an increasing affine function of the fraction of infected i, leading to a mortality rate that is quadratic in i. To the best of our knowledge, no analytical solutions to optimal control problems with quadratic cost in i are provided in the literature [6]. While numerical solutions are provided in [4], [5], the authors of [7] show that if the cost is linear in the infected, i.e., there is no hospital congestion, and convex in the intervention, the optimal policy is quasi-convex. In [8] the authors consider the problem of minimizing the cost of the intervention subject to the constraint that the fraction of infected is below a certain threshold, i.e., i(t) ≤ ī for every time t, where the value of ī is based on hospital capacity.
For the case of infinite time horizon T, the authors prove analytically that in a first phase it is optimal to let the epidemic spread until i(t) = ī; then the optimal lockdown keeps the infected constant at the threshold, and is released when herd immunity is reached, i.e., when the fraction of people that contracted the disease is large enough to guarantee that, from there on, the number of infected people will decrease even without lockdown. While [8] handles hospital congestion by a hard constraint on the simultaneous fraction of infected, without any cost on the number of deaths, our formulation considers a quadratic cost in i without hard constraints. A feedback control with a hard constraint on the simultaneous fraction of infected is also proposed in [9]. Optimal control approaches to the containment of the COVID-19 pandemic using more detailed compartmental models are also studied numerically in [10], [11].

The contribution of our work is two-fold. We first show numerically that, as T increases and the trade-off between economic and epidemic cost varies, the optimal control captures the fundamental difference between suppression and mitigation, exhibiting a phase transition between solutions that avoid diffusion of the disease and solutions that steer the system to herd immunity while keeping the number of infected low to mitigate the negative effects of hospital congestion. Our second main contribution is to consider two simple feedback policies, where lockdown levels are selected to either stabilize the reproduction number R(t) (i.e., reduce the number of secondary infections to "flatten the curve") or the fraction of infected (i.e., containing the number of hospitalizations). We note that the policy stabilizing i(t) resembles the optimal control in [8], with the only difference that in [8] the threshold depends on hospital capacity, while in our formulation the threshold is a free parameter that can be optimized to reduce the cost.
We analytically compute the exact cost of these policies, and provide an upper bound on the performance gap between the two policies and the optimal control. In particular, we show that if T is infinite, controlling i(t) is better than controlling R(t) and is close to optimal. Together with its close-to-optimal performance, the policy that controls the number of infected is in feedback form and easy to interpret, in contrast with numerical solutions of optimal control. As a corollary, our result shows that the spectral control approach (see e.g., [6], [12]), where typically one aims at selecting the minimal intervention guaranteeing R(t) < 1, is not optimal in the long run when i) the cost of the intervention has to be paid repeatedly in time (e.g., lockdowns differ from vaccinations: if a lockdown is interrupted the reproduction number immediately goes back above 1, while after a vaccination campaign, assuming permanent immunization, the system remains stable from then on) and ii) the horizon is long or the economic cost associated with the intervention is comparable with the epidemic cost. We remark that our theoretical analysis is restricted to the case of infinite time horizon and does not consider pharmaceutical interventions that may be implemented when cures and vaccines become available. Furthermore, we assume that individuals who contracted the infection become immune from there on. We emphasize that our purpose is not to suggest specific control strategies for the COVID-19 pandemic, but to provide general insights and a mathematical contribution on the control of epidemics.

To capture the effect of lockdown interventions, we consider the following simple SIR model introduced in [4]:

$$\dot{s} = -\beta s i (1 - \theta L)^2, \qquad \dot{i} = \beta s i (1 - \theta L)^2 - \gamma i, \qquad \dot{r} = \gamma i, \tag{1}$$

where s/i/r is the fraction of susceptible/infected/recovered agents, with initial condition s_0/i_0/r_0, β is the infection rate, γ is the recovery rate, L ∈ [0, L̄] is the percentage of population in lockdown and θ is the effectiveness of the lockdown (i.e., even if L = 1 only θ percent of the population actually complies with the lockdown), and the dependence of s, i, r, L on t is omitted for convenience of notation. While the transmission mechanisms of SARS-CoV-2 are very complicated (see e.g. [13]), we assume that the lockdown L enters quadratically in the dynamics, because people get the disease through pairwise interactions (see [4], [5]).

When considering lockdown policies, one needs to consider not only the epidemic dynamics, but also the economic cost associated with such interventions. To this end, the cost of the lockdown policy can be divided into an economic cost (due to loss of workforce because of the lockdown), which we model as cost_eco(t) = L(t), and an epidemic cost (related to the number of fatalities), which following [4] and [5] we model as cost_epi(t) = κ γ_d(i(t)) i(t), where γ_d(i(t)) i(t) is the mortality rate and κ is the trade-off between economic and epidemic cost. We assume that the mortality γ_d(i) = γ_0 + γ_1 i is increasing in i to take into account hospital congestion, as proposed in [4], [5].

The behaviour of the SIR dynamics is characterized by the reproduction number R(t) := β s(t)/γ, which is the number of secondary infections that an infected person produces on average. If R(t) < 1, the epidemic is stable in the sense that the number of infected decreases indefinitely, while if R(t) > 1 the number of infected increases until the first time t^* such that R(t^*) = 1. In presence of control, i.e., when L(t) > 0, the infection rate and therefore the reproduction number are reduced. We let R_L(t) := β s(t) (1 − θ L(t))^2/γ denote the controlled reproduction number and R(t) denote the uncontrolled reproduction number. Let c(t) := cost_eco(t) + cost_epi(t).
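The model and cost just described can be explored numerically. The following is a minimal sketch, not the authors' code: it assumes the dynamics ṡ = −β s i (1 − θL)², i̇ = β s i (1 − θL)² − γ i, ṙ = γ i (the quadratic lockdown term described above), integrates them with a plain forward-Euler scheme, and accumulates the economic cost ∫L dt and the epidemic cost ∫κ(γ_0 + γ_1 i) i dt. The function name `simulate_sir_lockdown`, the `policy(t, s, i)` callback and the step size `dt` are hypothetical choices made for illustration.

```python
def simulate_sir_lockdown(beta, gamma, theta, policy, s0, i0, r0, horizon,
                          kappa, gamma0, gamma1, dt=0.01):
    """Forward-Euler integration of the SIR model with lockdown.

    `policy(t, s, i)` is a hypothetical callback returning the lockdown
    level L in [0, 1] at time t. Returns the final state (s, i, r) and the
    accumulated economic and epidemic costs.
    """
    s, i, r = s0, i0, r0
    eco_cost = 0.0   # integral of L(t)
    epi_cost = 0.0   # integral of kappa * (gamma0 + gamma1 * i) * i
    n_steps = int(round(horizon / dt))
    for k in range(n_steps):
        t = k * dt
        L = policy(t, s, i)
        # Assumed form of the controlled infection term: quadratic in (1 - theta * L).
        infection = beta * s * i * (1.0 - theta * L) ** 2
        recovery = gamma * i
        eco_cost += L * dt
        epi_cost += kappa * (gamma0 + gamma1 * i) * i * dt
        s -= infection * dt
        i += (infection - recovery) * dt
        r += recovery * dt
    return s, i, r, eco_cost, epi_cost
```

For instance, passing `policy=lambda t, s, i: 0.0` gives the uncontrolled epidemic, while a constant `1.0` with `theta=1.0` freezes s and lets i decay at rate γ. A higher-order integrator could be swapped in if more accuracy is needed.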
The objective of the planner is to find the optimal control L^*(t) solving

$$\min_{L : [0,T] \to [0,\bar{L}]} \int_0^T c(t)\, dt \quad \text{subject to (1)}, \tag{2}$$

where T is the time horizon. We start by noting that, using similar techniques as in [3], one can show that the optimal lockdown L^*(t) solving the optimal control problem (2) is an increasing function of the force of infection f := β s i. It is remarkable that, although the stability of the system does not depend on i, the lockdown policy does, through f. Thus, even if the system is highly unstable (i.e., R(t) = β s(t)/γ ≫ 1), if i(t) → 0 then L^*(t) → 0, which means that the optimal lockdown lets the infected grow when the number of infected is low. Besides this characterization, finding an analytic solution to (2) remains open. Numerical solutions to optimal control problems such as (2) have been investigated, but typically suffer from lack of interpretability and robustness guarantees, as they are not in feedback form, i.e., the policy does not depend explicitly on the state. As an alternative, in the next section we propose two simple feedback policies based on controlling the reproduction number (which is typically considered in spectral control [6]) or controlling the total number of infected (to reduce congestion of the health system). These strategies, or their combination, have informed the response to COVID-19 of governments and public/private entities (e.g., Cornell University).

We here describe the two feedback policies. Their performance is studied in Section IV-B. To simplify the exposition we set θ = 1 and L̄ = 1, but all results can be generalized.

Policy A aims at controlling the reproduction number, as typically considered in spectral control (see e.g. [6]). Given a target reproduction number ρ, if the uncontrolled reproduction number exceeds ρ, the lockdown is chosen so that the controlled reproduction number equals ρ; otherwise no lockdown is imposed, i.e.,

$$L_\rho(t) = \begin{cases} 1 - \sqrt{\dfrac{\gamma \rho}{\beta s(t)}} & \text{if } \beta s(t)/\gamma > \rho, \\ 0 & \text{otherwise.} \end{cases} \tag{3}$$

We can divide the dynamics in two phases:

1) Phase I: in [0, t^*), with t^* such that s(t^*) = γρ/β. In this phase β s/γ ≥ ρ, L_ρ is as in (3) and the dynamics are

$$\dot{s} = -\gamma \rho\, i, \qquad \dot{i} = \gamma (\rho - 1)\, i.$$

2) Phase II: in [t^*, T]. Since R(t) ≤ ρ, L_ρ = 0, the dynamics is uncontrolled and given by

$$\dot{s} = -\beta s i, \qquad \dot{i} = \beta s i - \gamma i.$$

Phase I may not exist if β s_0/γ ≤ ρ, and Phase II may not exist if β s(T)/γ > ρ (this scenario typically occurs when ρ < 1 because the infected decay exponentially and the disease does not spread). Policy A is parametric in ρ, thus a policymaker that adopts this policy should select ρ^* to minimize the total cost (see Fig. 1).

Fig. 1. Left: Phase I (L_ρ(t) > 0) lasts until β s/γ = ρ (red line); then, the trajectory becomes uncontrolled (Phase II). Infected grows until the reproduction number becomes less than 1. The unit of time is one day. Right: optimization over ρ. The parameters used in this simulation are β = 0.2, γ = 1/18, γ_0 = 5.6 · 10^−4, γ_1 = 5.6 · 10^−3 (based on the COVID-19 pandemic, see [4], [5]), s_0 = 0.98, i_0 = 0.01, r_0 = 0.01, κ = 40 · 365, T = 5 years. For the used parameters ρ^* > 1, meaning that the total cost is minimized with a mitigation policy that lets the infection spread in a controlled way.

As anticipated in Section II, optimality conditions suggest that the optimal quantity to be controlled is the force of infection [3]. Stabilizing this quantity at a constant level φ amounts to imposing a lockdown that asymptotically stabilizes i(t) at φ/γ. Based on this observation, instead of controlling the force of infection, we here propose a policy that stabilizes the number of infected i(t) (as the optimal policy in [8]), whose meaning is more intuitive. Given a target value ι, Policy B works as follows: if i_0 < ι, L_ι(t) = 0 until the first time τ_1 such that i(τ_1) = ι; if i_0 > ι, L_ι(t) = 1 until τ_1; then, impose L_ι(t) in such a way that i̇(t) = 0, hence i(t) = ι, until the time τ_2 at which herd immunity is reached, i.e., s(τ_2) = γ/β; finally, the lockdown is released, i.e., L_ι(t) = 0 (see Fig. 2).
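The two feedback laws can be written as functions of the current state. The sketch below assumes θ = 1 and L̄ = 1 as in the text, and assumes that the controlled infection term is β s i (1 − L)², so that Policy A uses L = 1 − sqrt(ρ γ/(β s)) and the holding phase of Policy B uses L = 1 − sqrt(γ/(β s)), which makes i̇ = 0. The names `policy_a`, `policy_b`, `run` and the tolerance `band` are hypothetical; `band` only keeps the discrete-time simulation from switching to full lockdown after a small numerical overshoot of ι.

```python
import math

def policy_a(s, beta, gamma, rho):
    """Policy A: keep the controlled reproduction number at the target rho."""
    r_unc = beta * s / gamma            # uncontrolled reproduction number
    if r_unc <= rho:
        return 0.0                      # Phase II: no lockdown needed
    return 1.0 - math.sqrt(rho / r_unc)

def policy_b(s, i, beta, gamma, iota, band=0.01):
    """Policy B: let i reach iota, hold it there, release at herd immunity."""
    r_unc = beta * s / gamma
    if r_unc <= 1.0 or i < iota:
        return 0.0                      # Phase I (i below target) or Phase III
    if i > (1.0 + band) * iota:
        return 1.0                      # well above target: full lockdown
    return 1.0 - math.sqrt(1.0 / r_unc)  # Phase II: makes di/dt = 0

def run(policy, beta, gamma, s0, i0, horizon, dt=0.01):
    """Forward-Euler closed-loop simulation; policy(s, i) returns L."""
    s, i = s0, i0
    peak_i, lockdown_integral = i0, 0.0
    for _ in range(int(round(horizon / dt))):
        L = policy(s, i)
        infection = beta * s * i * (1.0 - L) ** 2
        lockdown_integral += L * dt
        s, i = s - infection * dt, i + (infection - gamma * i) * dt
        peak_i = max(peak_i, i)
    return s, i, peak_i, lockdown_integral
```

Calling `run(lambda s, i: policy_b(s, i, 0.2, 1/18, 0.05), 0.2, 1/18, 0.98, 0.01, 600.0)` returns the final state, the peak of i and the integral of L, which is the economic cost of the policy under cost_eco(t) = L(t).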
Given ι, we divide the trajectory in three phases:

1) Phase I in [0, τ_1), with τ_1 being the smallest time such that i(τ_1) = ι. If i_0 < ι, L_ι(t) = 0 and the dynamics is

$$\dot{s} = -\beta s i, \qquad \dot{i} = \beta s i - \gamma i.$$

2) Phase II in [τ_1, τ_2), with τ_2 such that s(τ_2) = γ/β. In this phase i(t) = ι, and the dynamics is

$$\dot{s} = -\gamma \iota, \qquad \dot{i} = 0.$$

3) Phase III in [τ_2, T]: the dynamics is uncontrolled and given by

$$\dot{s} = -\beta s i, \qquad \dot{i} = \beta s i - \gamma i.$$

Notice that in Phase III i̇(t) < 0 even without control because herd immunity is reached. In contrast with [8], where the value of ι depends on the ICU capacity and the policymaker has to choose the optimal policy complying with the constraint i(t) ≤ ι, here for any value of ι the policy is given, and the policymaker should select ι^* to minimize the total cost (2) (see Fig. 2).

This section is divided in two parts. In the first part, some observations on numerical solutions are presented. In the second part, we provide an upper bound on the performance gap between the feedback policies presented in Section III and the optimal control (2), and show that, under some assumptions, controlling the fraction of infected is close to optimal and outperforms the policy based on controlling the reproduction number.

Our first result is that numerical solutions of the optimal control exhibit a phase transition from suppression to mitigation, as described in [1], as T increases and κ decreases. In the left panel of Fig. 3, the economic and epidemic costs are computed as their trade-off κ varies, while the other parameters (including T) are fixed. There is a clear phase transition: when κ is small, the optimal policy lets the epidemic spread (high number of deaths, small economic cost), whereas when κ is large the optimal policy maintains the lockdown until the horizon (high economic cost and small epidemic cost). Solutions in between these two strategies are never optimal. The right plot shows that a similar phase transition occurs with κ constant and varying T.
Despite the economic and epidemic cost exhibiting a phase transition, we emphasize that the total cost is continuous in the parameters.

Remark 1: We also observe that a similar phase transition occurs for Policies A and B, namely if κ is large and T is small, the optimal reproduction number for Policy A satisfies ρ^* < 1, implying that i(t) decreases, and under Policy B the optimal ι^* is very small, whereas, if κ is small and T is large, the optimal reproduction number satisfies ρ^* > 1, the optimal ι^* is much larger, and the epidemic cost is in general comparable with the economic cost. This confirms our intuition that solutions based on spectral control problems, that minimize statically the cost of intervention subject to the constraint R_0 < 1, are suboptimal from an optimal control perspective when T is large or κ is not too high, since stabilizing the system indefinitely leads to a prohibitively high economic cost.

As shown in Fig. 4, in the range of parameters that make the optimal control select suppression strategies, the optimal control and the feedback policies achieve similar performances, whereas when the optimal control selects mitigation, Policy B is close to optimal and outperforms Policy A. We next investigate these observations.

Fig. 4. Left: T = 1 yr, optimal control (orange) and feedback policies (Policy A in blue, Policy B in purple) select a suppression strategy. Performances of the three strategies are comparable, and the economic cost is much larger than the epidemic cost. Right: T = 3 yrs, optimal control and feedback policies select mitigation. The dynamics under Policy B is similar to the dynamics under the optimal control, and Policy B outperforms Policy A. In this simulation κ = 1.5, and the other parameters are as in Fig. 1.
To compare the cost achieved by the feedback policies with the optimal cost, we work under the assumption that T = +∞, which has two main implications: 1) there exists t^* such that s(t^*) = γ/β under optimal control, that is, herd immunity is reached; indeed, if it were not the case, the lockdown could never be totally released and the economic cost would explode; 2) the fraction of infected i(T) vanishes (see e.g. [17]). In other words, with infinite T there is no phase transition between suppression and mitigation (unless κ = 0), since avoiding the spread of the infection leads to an infinite economic cost and is therefore infeasible.

Remark 2: We observed from simulations that above a certain value of T, the optimal policy does not depend on T. Indeed, if the epidemic reaches herd immunity, the infected naturally asymptotically decrease to 0 and both epidemic and economic cost tend to vanish. Hence our results provide insight not only for the infinite horizon case, but for any large T.

where s(τ_1) and r(∞) are the unique feasible solutions of

Before establishing our main results, we introduce a technical lemma on SIR dynamics which is needed for the proofs and may be of separate interest.

Lemma 1 (Uncontrolled dynamics): Let a ≤ b, and let γ .

Proof: It can be shown that By (6) The statement follows by plugging (8) and (9) in (7).

Let C_A(ρ) and C_B(ι) denote the cost of Policies A and B for a given value of ρ and ι, respectively. In the next two propositions we derive analytical expressions for C_A(ρ) and C_B(ι). For the sake of readability, for Policies A and B we work under the assumptions ρ > 1 and ι ≥ i_0, respectively, but similar expressions hold in case ρ ≤ 1 and ι < i_0.

Proposition 1: Let T = +∞. Then, C_A(ρ) is as in (4).
Proof: See the Appendix.

Proposition 2: Let T = +∞. Then, C_B(ι) is as in (5).
Proof: See the Appendix.
The idea for the proofs is to split the cost in three terms: 1) economic cost: ∫_0^T L dt; 2) infected cost: ∫_0^T i dt; 3) infected squared cost: ∫_0^T i^2 dt. The terms may be computed exactly by using Lemma 1 and the fact that in presence of control the dynamics is linear.

In the next proposition we give a lower bound on the optimal cost, which will be used to compare Policies A and B with the optimal control. For technical reasons we restrict our analysis to policies that apply no control after herd immunity. This is a natural assumption, and numerical solutions show that this is indeed the case for most values of the parameters.

Proposition 3: Let T = +∞, and let t^* be the smallest time such that s(t^*) ≤ γ/β (herd immunity). The cost of any control such that L(t) = 0 for every t ∈ [t^*, +∞) is lower bounded by the minimum of U(i^*) over the admissible values of i^* = i(t^*), where U(i^*) = 2U^I_12(i^*) + U^II_22(i^*) + U_3(i^*).

Remark 3: Notice that U(i^*) depends on i^* both explicitly and through s(∞) and r(∞).

Proof: See the Appendix.

Apart from constants, U_3 is the infected cost, and U^II_22 is the infected squared cost after herd immunity, computed starting from the state at time t^* under the assumption that there is no control after herd immunity (so that the economic cost is zero). These are computed using (1) with L(t) = 0 and Lemma 1, respectively. The term U^I_12 is a lower bound for the sum of the economic and infected squared costs before herd immunity. Let C_opt denote the cost of the optimal control among all controls that impose no lockdown after herd immunity. We can now derive bounds between the performance of our two policies and C_opt.

Theorem 1: Let T = +∞. Then,
Proof: The proof follows immediately from Propositions 1 and 3.

Theorem 2: Let T = +∞. Then,
Proof: The proof follows immediately from Propositions 2 and 3.

In Fig. 5 the exact cost of the two policies, the cost under the optimal control computed numerically, and the lower bound on the optimal cost are plotted as a function of κ.
Note that controlling i(t) is close to the optimal control computed numerically. The theoretical analysis allows us to conclude that, for the tested parameters, controlling i(t) is better than controlling R(t), as the distance between the costs of Policies A and B is comparable with the distance between Policy B and the lower bound C^*. From a computational viewpoint, note that controlling i(t), instead of solving the optimal control problem, reduces the complexity of the problem from an optimization over a set of functions to a minimization over the parameter ι.

Fig. 5. Exact cost of Policies A and B with optimal ρ^* and ι^* (analytical), optimal cost (numerical) and lower bound to the optimal cost (analytical) as functions of κ. We noticed that T = 5 years is large enough to approximate the infinite time horizon. Indeed, after a large enough time, both economic and epidemic cost are negligible (see Fig. 4). Other parameters are as in Fig. 1.

In this work we consider an SIR model with lockdown, and we study the optimal lockdown as the solution to an optimal control problem. We model the cost as the sum of an epidemic cost and an economic cost due to lockdowns. We first show numerically that, as the time horizon and the cost of life vary, the solutions of optimal control exhibit a phase transition from suppression strategies to mitigation strategies, as described in [1]. Then, we introduce two simple feedback policies, one stabilizing the reproduction number, and one stabilizing the fraction of infected. We compute their exact cost, and compare analytically their performance with the optimal cost. We show that in case of infinite time horizon controlling the fraction of infected is close to optimal. Future directions include but are not limited to: investigating where the phase transition occurs; extending the analysis to the case of finite time horizon; considering different cost functions, such as sigmoidal functions; and extending the analysis to network SIR models, as in the numerical analysis of [5].

Before giving the proofs of the Propositions, we compute the asymptotic state of an uncontrolled SIR. Using (6) with b = +∞, and using the well known fact that lim_{t→∞} i(t) = 0 [17], which implies r(∞) = 1 − s(∞), we get

Proof of Proposition 1
Solving explicitly Phase I of the dynamics, one gets: Also, which allows concluding (13). The cost is thus composed of three parts: the economic cost, the infected squared cost, and the infected cost (see Section IV-B). The infected cost is where r(∞) follows by plugging a = t^* in (10) and using (11). For the infected squared cost we separate the two phases. For Phase I: For Phase II, L(t) = 0, hence, by Lemma 1, where ν = ρ/(ρ − 1), ν̄ = s_0 + ν i_0, and we used (12). Overall, the economic cost is where t^* follows from (11). Plugging the economic cost (17), the infected cost (14) and the infected squared cost (15) and (16) in (13), we get (4).
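The asymptotic state of the uncontrolled SIR used at the start of this appendix can be computed numerically. The sketch below relies on the standard first integral of the uncontrolled SIR model, s + i − (γ/β) ln s = constant, which with i(∞) = 0 gives an implicit equation for s(∞); treating this as the content of the conservation relation referenced in the text is an assumption, since the corresponding display is not available here. The function name `final_susceptible` and the bisection tolerance are hypothetical.

```python
import math

def final_susceptible(s_a, i_a, beta, gamma, tol=1e-13):
    """Solve s - (gamma/beta) ln s = s_a + i_a - (gamma/beta) ln s_a
    for the root s(inf) in (0, gamma/beta], by bisection.

    Assumes no control is applied from the state (s_a, i_a) onwards,
    with s_a > 0 and i_a >= 0.
    """
    x = gamma / beta
    c = s_a + i_a - x * math.log(s_a)   # value of the first integral

    def g(s):
        return s - x * math.log(s) - c

    # g is decreasing on (0, x], with g -> +inf as s -> 0 and g(x) <= 0,
    # so the feasible root lies below the herd-immunity level x.
    hi = x
    lo = x
    while g(lo) <= 0.0:
        lo *= 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The asymptotic recovered fraction then follows as r(∞) = 1 − s(∞). The second root of the same equation, above γ/β, is discarded because s is non-increasing along trajectories.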
Proof of Proposition 2
s(τ_1) may be derived by (6) with a = 0, b = τ_1, i.e., it is the solution of

$$s(\tau_1) + \iota - \frac{\gamma}{\beta} \ln s(\tau_1) = s_0 + i_0 - \frac{\gamma}{\beta} \ln s_0. \tag{18}$$

The control time ∆τ := τ_2 − τ_1 may be computed by using the fact that i(t) = ι for every t ∈ [τ_1, τ_2], i.e.,

$$\Delta\tau = \frac{s(\tau_1) - \gamma/\beta}{\gamma \iota}. \tag{19}$$

From [8, Theorem 1] and from our form of economic cost The infected cost is where r(∞) follows by plugging a = τ_2 in (10) with i(τ_2) = ι, s(τ_2) = γ/β. For the infected squared cost, for Phases I and III we use the result in Lemma 1, while Phase II is simply (τ_2 − τ_1) ι^2. Hence, where s(τ_1) comes from (18), τ_2 − τ_1 from (19), and s(∞) = 1 − r(∞). Summing the infected squared cost (22), the infected cost (21) and the economic cost (20) we obtain (5).

This research was carried out within the framework of the MIUR-funded Progetto di Eccellenza of the Dipartimento di Scienze Matematiche G.L. Lagrange, Politecnico di Torino, CUP: E11G18000350001. It received partial support from the Compagnia di San Paolo through a Joint Research Project. It was also supported by a C3.ai Digital Transformation Institute award.

References
Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand
Containing papers of a mathematical and physical character
Optimal control of deterministic epidemics
A simple planning problem for COVID-19 lockdown, testing, and tracing
A multi-risk SIR model with optimally targeted lockdown
Analysis and control of epidemics: A survey of spreading processes on complex networks
Optimal control of an epidemic through social distancing
Optimal epidemic suppression under an ICU constraint
COVID-19 and flattening the curve: A feedback control perspective
Optimal COVID-19 epidemic control until vaccine deployment
Beyond just "flattening the curve": Optimal control of epidemics with purely non-pharmaceutical interventions
Controlling epidemic spread: reducing economic losses with targeted closures
Transmission heterogeneities, kinetics, and controllability of SARS-CoV-2
The mathematics of infectious diseases
Proof of Proposition 3
Let z := β s i (1 − L)^2, so that the dynamics is

$$\dot{s} = -z, \qquad \dot{i} = z - \gamma i.$$

Hence,

$$L = 1 - \sqrt{\frac{z}{\beta s i}} \ge \varepsilon \left(1 - \frac{z}{\beta s i}\right)^2 \tag{23}$$

with ε = 27/32, thus leading to a lower bound on the economic cost. Indeed, by defining y := z/(β s i) ∈ [0, 1], the largest ε satisfying (23) is

$$\varepsilon = \min_{y \in [0,1)} \frac{1 - \sqrt{y}}{(1 - y)^2} = \frac{27}{32},$$

attained at y = 1/9. Hence, By assumption, there exists a time t^* such that s(t^*) = γ/β. Hence, we can split the cost in two parts as follows: We start from the first part, i.e., [0, t^*], using the fact that For the second part, i.e., [t^*, ∞), L(t) = 0 by assumption (however, even if L(t) > 0, this would be a lower bound), i.e., For convenience, we do not split c_3 in two parts. Putting all together, the total cost is where U^I_12 := ∫_0^{t^*} c_1(t) c_2(t) dt, U^II_22 := ∫_{t^*}^∞ c_2^2(t) dt, and U_3 := ∫_0^∞ c_3(t) dt. We now lower bound all of these terms as functions of i(t^*), and then minimize over i(t^*) (which is a function of the control itself) to get a lower bound. To this end, we establish the following equivalences. Let A = ∫_0^{t^*} i(t) dt, and We can now lower bound all the terms appearing in the cost. We recall that i_0 and s_0 are given, and s(t^*) = γ/β. Thus, using 1) and 2), U^II_22 can be computed exactly under the assumption L(t) = 0 for t ≥ t^*. Hence, by Lemma 1, with a = t^* and b = ∞, s(t^*) = γ/β, i(t^*) parametric, and s(∞) = 1 − r(∞) being the asymptotic equilibrium computed by (10), we obtain Finally, for U_3: Plugging (25), (26) and (27) in (24), we obtain the lower bound of the cost as a function of i(t^*). To obtain a lower bound on the cost we should then minimize over i(t^*). The statement follows from noticing that i(t^*) is upper bounded by the peak of the infection in case of no control, i.e., i_max = i_0 + s_0 − (γ/β)(1 − ln(γ/(β s_0))) [17].
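Two of the constants used in this proof can be checked numerically. The sketch below evaluates the peak formula i_max = i_0 + s_0 − (γ/β)(1 − ln(γ/(β s_0))) and recovers ε = 27/32 by a grid search. The second part rests on an assumption: it reads the bound (23) as 1 − √y ≥ ε (1 − y)² for y ∈ [0, 1], which is consistent with the stated value of ε but is a reading of the text, since the display is not available here. The function names and the grid size are hypothetical.

```python
import math

def peak_infected(s0, i0, beta, gamma):
    """Peak of i(t) for the uncontrolled SIR model started at (s0, i0)."""
    x = gamma / beta                      # herd-immunity level of s
    if s0 <= x:
        return i0                         # i is already non-increasing
    return i0 + s0 - x * (1.0 - math.log(x / s0))

def largest_epsilon(n=90000):
    """Grid estimate of min over y in [0, 1) of (1 - sqrt(y)) / (1 - y)^2.

    Assumed reading of (23): L = 1 - sqrt(y) >= eps * (1 - y)^2.
    """
    return min((1.0 - math.sqrt(k / n)) / (1.0 - k / n) ** 2
               for k in range(n))
```

With the parameters of Fig. 1 (β = 0.2, γ = 1/18, s_0 = 0.98, i_0 = 0.01), `peak_infected` can be compared against the maximum of i(t) along a simulated uncontrolled trajectory, and `largest_epsilon()` against 27/32, whose minimizer is y = 1/9.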