key: cord-0469400-jjma1l74
authors: Das, Arghya; Dhar, Abhishek; Goyal, Srashti; Kundu, Anupam
title: Covid-19: an analysis of an extended SEIR model and a comparison of different intervention strategies
date: 2020-05-23
journal: nan
DOI: nan
sha: 91016dd39a86a7bc5e4b257e69dae6ad063a37d4
doc_id: 469400
cord_uid: jjma1l74

Accurately modeling the evolution of, and intervention strategies for, the Covid-19 pandemic, which has now affected almost every country in the world, is a challenging problem. We present here an analysis of an extended Susceptible-Exposed-Infected-Recovered (SEIR) model that takes into account the presence of asymptomatic carriers, and explore the effects of different intervention strategies such as social distancing (SD) and testing-quarantining (TQ). The two intervention strategies (SD and TQ) try to reduce the disease reproductive number ($R_0>1$) to a target value ($R_0^{\rm target}<1$), but in distinct ways, which we implement in our model equations. We find that for the same target value, $R_0^{\rm target}<1$, TQ is more efficient in controlling the pandemic than lockdowns that only implement SD. However, for TQ to be effective, it has to be based on contact tracing, and the ratio of tests per day to the number of new cases per day has to be scaled with the mean number of contacts that an infectious person has, which would be high in regions with high population density and low levels of social distancing. We point out that, apart from $R_0$, an important quantity is the largest eigenvalue of the linearized dynamics, which provides a more complete understanding of the disease progression both pre- and post-intervention and explains observed data. Weak extended intervention strategies (that reduce $R_0$ but not to a value less than $1$) can reduce the peak values of infections and the asymptotic affected population. We provide simple expressions for these in terms of the disease parameters and apply them in the Indian context to obtain heuristic projections for the course of the pandemic. Looking at real data, we find that for many countries, several broad qualitative features are captured well by the model.


[ * The authors in this work are listed alphabetically]

The Covid-19 pandemic, which started in Wuhan (China) around December 2019, has now affected almost every country in the world. The total number of confirmed cases on May 13 was close to 4.5 million, with close to 300,000 deaths. One of the serious concerns at present is that there is as yet no clear picture or consensus on the future evolution of the pandemic. It is also not clear what the ideal intervention strategy for a government to implement is, once economic and social factors are also taken into account. The role of mathematical models has been to provide guidance for policy makers [1] [2] [3] [4] [5] [6] [7] [8] [9] [10].

One of the standard epidemiological models is the SEIR model, which has four classes of individuals: susceptible ($S$), exposed ($E$), infected ($I$) and recovered ($R$), with $S + E + I + R = N$ being the total population of a region (the model can be applied at the level of a country, a state or a city, and is expected to work better in a well-mixed population). The SEIR model is parameterized by the rates $\beta$ (infectivity), $\sigma$ (specifying $E \to I$ transitions) and $\gamma$ (specifying $I \to R$ transitions). In terms of the data that are typically measured and reported, $R$ corresponds to the total number of cases up to the present date, while $\gamma I$ is the number of new cases per day. The number of deaths would be some fraction ($\approx 5\%$) of $R$, while the number of hospital beds required at any time would be $\approx$ (new cases per day $\times$ typical days to recovery). An important parameter characterizing the disease growth is the reproductive number $R_0$: when this has a value $> 1$, the disease grows exponentially. Typical values reported in the literature for Covid-19 are in the range $R_0 = 2$ to $7$ [11]. For the basic SEIR model one has $R_0 = \beta/\gamma$.

The two main intervention schemes for controlling the pandemic are social distancing (SD) and testing-quarantining (TQ). Lockdowns (LD) impose social distancing and effectively reduce contacts between the susceptible and infected populations, while testing-quarantining provides an extra channel for removing people from the infectious population. These two intervention schemes have to be incorporated in the model in distinct ways [3, 4]: SD effectively changes the infectivity parameter $\beta$, while TQ changes the parameter $\gamma$. Intervention schemes attempt to reduce $R_0$ to a value less than 1. In the context of the SEIR model with $R_0 = \beta/\gamma$, it is clear that we can reduce $R_0$ either by decreasing $\beta$ or by increasing $\gamma$. In this work we point out that, for the same reduction in the value of $R_0$, the effect on disease progression can be quite different for the two intervention strategies.

Here we analyze intervention strategies in an extended version of the SEIR model, which incorporates the fact that asymptomatic or mildly symptomatic individuals [3] [4] [5] [12] are believed to play a significant role in the transmission of Covid-19. The extended model considers eight compartments: Susceptible ($S$), Exposed ($E$), asymptomatic Infected ($I_a$), presymptomatic Infected ($I_p$), and a further four compartments ($U_a$, $D_a$, $U_p$, $D_p$), two for each of the two infectious compartments. These last four classes consist of individuals who have either recovered (at home or in a hospital), are still under treatment, or have died; they do not contribute to spreading the infection. We do not include separate compartments for the number of hospitalized and dead, since these extra details would not affect our main conclusions. For this extended SEIR model we discuss the performance of two different intervention strategies (namely SD and TQ) in the disease dynamics and control. We specifically point out that strong interventions ($R_0^{\rm target} < 1$, aimed at disease suppression) can be understood using the linearized dynamics, while for weak interventions ($R_0^{\rm target} \gtrsim 1$, aimed at disease mitigation), one has to go beyond the linear theory.

Main conclusions: Apart from the reproductive number, $R_0$, an important parameter is the largest eigenvalue of the linear dynamics, which we denote by $\mu$. For $R_0^{\rm target} > 1$, we have $\mu > 0$, and this gives us the exponential growth rate (doubling time $\approx 0.7/\mu$). On the other hand, for $R_0^{\rm target} < 1$, the corresponding $\mu$ is less than $0$, and this tells us that infections will decrease exponentially. For the case where only mitigation is achieved ($R_0^{\rm target} \gtrsim 1$), we present analytic expressions for peak infection numbers, time to reach peak values, and asymptotic values of total affected populations, for the extended SEIR model. These provide useful guidance on disease progression, and we apply them in the Indian context. We also show that, for the same reduction of $R_0^{\rm target}$ to a value less than 1, the magnitude of the corresponding $\mu$ can be very different for different intervention schemes. A larger magnitude of $\mu$, corresponding to a faster suppression of the pandemic, is obtained from TQ than from SD. We give conditions for TQ to be successful: (a) it has to be based on contact tracing and (b) testing numbers have to be scaled up according to the number of new detected cases. We show that the above picture gives us a comprehensive understanding of data from several countries which have achieved either disease suppression or mitigation.

A note on the Indian situation: The number of daily new cases in India continues to rise, and it is clear that only mitigation has been achieved, unlike in Europe and the US, which have succeeded in suppression ($R_0 < 1$). The current disease doubling time is around 14 days. For two different choices of the fraction of asymptomatics (and typical values of disease parameters), our estimates suggest a current value of $R_0^{\rm eff} \approx 1.3$ and that the disease would peak between July and September. The predicted numbers of hospitalizations and deaths per day (assuming 1% deaths for symptomatic cases) have a large uncertainty but could be quite large (see Tables (I-III)), and there is an urgent need to prepare for this. However, the lockdown in India is now being eased. Given the huge economic and social costs of implementing SD, it is clear that a combination of weaker SD but intense TQ might be the only practical way of controlling the pandemic in India. A sustained and targeted testing and quarantining strategy (assuming community spreading is still limited), combined with some level of social distancing, has to be implemented at the earliest and to the fullest extent. Community transmission is unlikely to have taken place in all states and cities in India. Increasing the testing-to-detected ratio to a value around 100 from the current value of $\approx 25$ could result in a lowering of $R_0$ and of the value of $\mu$, or at least in not letting them increase further.

The paper is structured as follows. In Sec. (II) we state our main results. In Sec. (III) we make qualitative comparisons of the predictions of the SEIR model with real data on confirmed number of cases and make some heuristic predictions in the Indian context. In Sec. (IV) we define and then present the analysis of the extended SEIR model with and without interventions. We present some useful analytic results and closed form expressions as well as numerical results on different intervention protocols. We summarize our results in Sec. (V). Technical details and various analytical results are presented in two appendices (Sec. (A) and Sec. (B)).


The extended SEIR model studied here is schematically described in Fig. (1) .

It has eight variables ($S, E, I_a, I_p, U_a, D_a, U_p, D_p$) and ten parameters ($\beta_a, \beta_p, \sigma, \gamma_a, \gamma_p, \alpha, \nu_a, \nu_p, r, u$). The details of the model are defined and explained in Sec. (IV). We note that at any given time the total infectious population size is $I = I_a + I_p$, the cumulative affected population (recovered, in hospital or dead) is $R = U_a + D_a + U_p + D_p$, the reported total number of confirmed cases is $C = D_a + D_p + U_p$, and the reported number of new daily cases is $D = dC/dt = r\nu_a I_a + (\gamma_p + r\nu_p) I_p$.
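The bookkeeping above can be sketched in a few lines of Python. This is only an illustration of the definitions of $I$, $R$, $C$ and $D$ quoted in the text; the `State` container and the function name `observables` are hypothetical names introduced here, and the numbers in the usage note are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class State:
    """Populations in the eight compartments at one instant (hypothetical container)."""
    S: float
    E: float
    Ia: float
    Ip: float
    Ua: float
    Da: float
    Up: float
    Dp: float


def observables(x, gamma_p, nu_a, nu_p, r):
    """Return (I, R, C, D) following the definitions in the text."""
    I = x.Ia + x.Ip                      # total infectious population
    R = x.Ua + x.Da + x.Up + x.Dp        # cumulative affected population
    C = x.Da + x.Dp + x.Up               # reported confirmed cases
    D = r * nu_a * x.Ia + (gamma_p + r * nu_p) * x.Ip  # new reported cases per day
    return I, R, C, D
```

For example, `observables(State(9000, 100, 200, 100, 300, 50, 200, 50), gamma_p=1/12, nu_a=1/3, nu_p=1/2, r=0.0)` returns the four observables for an arbitrary snapshot without testing-quarantining; with $r = 0$ only the presymptomatic class contributes to $D$.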

The parameter $u$ quantifies the degree of social distancing, while $r$ is related to the rate at which testing-quarantining is done. These are in general time-dependent, $u$ changing from the free (without interventions) value $u = 1$ to a target value $u_l < 1$, while $r$ is a rate that changes from $0$ to a value $r_l > 0$. The time scale for the change depends on how efficiently the control measures are implemented.

How is $r$ related to testing rates? It is easy to see that with random testing of the population, intervention can be helpful only if a finite fraction of the population is tested, which is typically impossible to implement. Thus testing has to be based on contact tracing of the newly detected cases. Suppose that the number of tests per day is $T$, the number of new cases per day is $D$, and $A$ is the typical number of contacts made by a person over the period when the person is infectious and before detection.

Main result (I): We argue that TQ intervention can be successful only if one achieves

where $r(t)$, our control rate function, changes from the value $0$ to a value $r_l$, which should be at least of the same order as $\gamma_p$, over a time scale of a week or so. This means that we need $T(t) \approx A\,D(t)$; that is, the number of tests per day has to be proportional to the number of new detections per day, and in fact the ratio $T/D$ has to be larger than the average number of contacts, $A$, that each infectious person makes. The number $A$ is expected to depend on the population density and also on how well SD is being implemented. Hence, while $T(t)/D(t) \approx 25$ for India appears to be large, it may not be sufficient, given that the population densities are much larger than in many other countries and the implementation of SD may be less effective. If we assume 20 contacts a day and take the number of days before isolation of the individual to be 5, we get the rough estimate $A \approx 100$, and the ratio $T/D$ thus has to be at least $\approx 100$. This is the minimum value of the testing-to-detected ratio that has to be targeted at the hot zones. The argument presented here is largely independent of the details of the particular SEIR model that we study.

A useful quantity to characterize the system with interventions is the time-dependent effective reproductive number given by

$$R_{\rm eff}(t) = u(t)\left[\frac{\alpha\,\beta_a}{\gamma_a + r(t)\,\nu_a} + \frac{(1-\alpha)\,\beta_p}{\gamma_p + r(t)\,\nu_p}\right].$$

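At long times this goes to the targeted reproduction number

$$R_0^{\rm target} = u_l\left[\frac{\alpha\,\beta_a}{\gamma_a + r_l\,\nu_a} + \frac{(1-\alpha)\,\beta_p}{\gamma_p + r_l\,\nu_p}\right].$$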

We classify intervention strategies by the value of the targeted $R_0^{\rm target}$. A strong intervention is one where $R_0^{\rm target} < 1$ and will achieve suppression of the disease, while a weak intervention is one with $R_0^{\rm target} \gtrsim 1$ and will only mitigate the effects of the disease.
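As an illustration of this classification, the following sketch evaluates the effective reproductive number under the assumption that it has the form $u\,[\alpha\beta_a/(\gamma_a + r\nu_a) + (1-\alpha)\beta_p/(\gamma_p + r\nu_p)]$, i.e. the model's reproductive number with the infectivities scaled by $u$ and the removal rates increased by $r\nu_{a,p}$. The function names are hypothetical, and the parameter values in the usage note are the illustrative ones quoted elsewhere in the text.

```python
def r_eff(u, r, alpha, beta_a, beta_p, gamma_a, gamma_p, nu_a, nu_p):
    """Effective reproductive number for SD factor u and TQ rate r (assumed form)."""
    return u * (alpha * beta_a / (gamma_a + r * nu_a)
                + (1 - alpha) * beta_p / (gamma_p + r * nu_p))


def classify(r_target):
    """Strong interventions suppress (target < 1); weak ones only mitigate."""
    return "strong (suppression)" if r_target < 1 else "weak (mitigation)"
```

With $\alpha = 0.67$, $\beta_a = 1/3$, $\beta_p = 0.5$, $\gamma_a = 1/8$, $\gamma_p = 1/12$, $\nu_a = 1/3$, $\nu_p = 1/2$, calling `r_eff(1.0, 0.0, ...)` gives the free value of $R_0$, while `r_eff(0.431, 0.4, ...)` corresponds to the mixed SD and TQ strategy discussed in the text and falls below 1.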

In our numerical study we choose, as an illustrative example, the parameter set $\alpha = 0.67$, with the rates $\beta_a = 0.333$, $\beta_p = 0.5$, $\sigma = 1/3$, $\gamma_a = 1/8$, $\gamma_p = 1/12$, all in units of day$^{-1}$. For this choice of parameter values (free case with $u = 1.0$, $r = 0.0$) we get $\mu = 0.158$, which is close to the value observed in the early-time data for confirmed cases in India. The corresponding free value of $R_0$ is $3.7665$. Note that $\mu$ is not uniquely fixed by $R_0$: different choices of parameters can give the same observed $\mu$ but different values of $R_0$.
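The growth rate $\mu$ quoted above can be checked with a short sketch. It assumes the linearized dynamics $\dot E = u(\beta_a I_a + \beta_p I_p) - \sigma E$, $\dot I_a = \alpha\sigma E - (\gamma_a + r\nu_a) I_a$, $\dot I_p = (1-\alpha)\sigma E - (\gamma_p + r\nu_p) I_p$, whose largest eigenvalue is the unique solution of $g(\lambda) = 1$ with $\lambda$ above minus the smallest rate. Bisection is used here only to stay within the standard library; the function names are hypothetical.

```python
import math


def growth_rate(alpha, beta_a, beta_p, sigma, gamma_a, gamma_p,
                u=1.0, r=0.0, nu_a=1/3, nu_p=1/2):
    """Largest eigenvalue mu of the assumed linearized (E, I_a, I_p) dynamics."""
    ba, bp = u * beta_a, u * beta_p                    # infectivities under SD
    ga, gp = gamma_a + r * nu_a, gamma_p + r * nu_p    # removal rates under TQ

    def g(lam):
        # g decreases monotonically for lam > -min(sigma, ga, gp); mu solves g(mu) = 1
        return sigma / (lam + sigma) * (alpha * ba / (lam + ga)
                                        + (1 - alpha) * bp / (lam + gp))

    lo = -min(sigma, ga, gp) + 1e-12   # g(lo) is very large
    hi = ba + bp + 1.0                 # g(hi) < 1
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def doubling_time(mu):
    """Doubling time ln(2)/mu, roughly 0.7/mu, for a growing epidemic (mu > 0)."""
    return math.log(2) / mu
```

Calling `growth_rate(0.67, 1/3, 0.5, 1/3, 1/8, 1/12)` reproduces a value close to the quoted $\mu = 0.158$, and passing a small `u` (strong social distancing) yields a negative $\mu$.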

Choosing these typical parameter values for Covid-19, we now compare the efficacy of strong and weak interventions implemented in four different ways: (1) 6WLD-NTQ: six-week lockdown (strong value of the SD parameter) and no testing-quarantining; (2) ELD-NTQ: extended lockdown and no testing-quarantining; (3) NSD-ETQ: no social distancing and extended testing-quarantining; (4) ESD-ETQ: extended social distancing and extended testing-quarantining. The case with no social distancing and no testing-quarantining is indicated as NSD-NTQ.

For the strong intervention case, the exponential growth stops around the time $t^{\rm (int)}$ when $R_{\rm eff}(t)$ crosses the value 1, provided the intervention is imposed early enough. After this time, the infection numbers will start decaying exponentially. Since the infection numbers are still small compared to the total population, one can work with the linearized theory, and the magnitude of the largest eigenvalue $\mu$ (now negative) determines the exponential decay rate.

We work with a population $N = 10^7$ and initial con-

In all cases, we will assume that intervention strategies are switched on when the confirmed number of cases reaches 50 and after that interventions are attained over a time scale of 5 days.

(right) Total number of confirmed cases $C = U_p + D_a + D_p$. The dashed lines indicate the total affected population $R = C + U_a$ at the end of one year, for the different strategies. In the absence of interventions this is close to 96% and is given by Eq. (7). The total population was taken as $N = 10^7$.

Observations: A six-week (or eight-week) lockdown is insufficient to end the pandemic and will lead to a second wave. If the interventions are carried on indefinitely, the pandemic is suppressed and only affects a very small fraction of the population (less than 0.1%). We can understand all features of the dynamics from the linear theory. In Fig. (2), intervention is switched on after $\approx 2$ weeks and the peak in infections appears roughly 5 days later. Thereafter, however, the decay in the number of infections occurs slowly, the decay rate being given by the largest eigenvalue $\mu$ (now negative and smaller in magnitude than $\mu$ in the growth phase).

Main result (II): We find that for the same target $R_0^{\rm target} < 1$, different intervention schemes (ELD-NTQ, NSD-ETQ, or ESD-ETQ) can give very different values of the decay rate $\mu$ and, in general, we find that TQ is more effective than SD. For the SD-only (ELD-NTQ) and TQ-only (NSD-ETQ) strategies, the corresponding $\mu$ values (post-intervention) are $\mu = -0.027$ and $\mu = -0.077$ respectively, i.e., they differ by a factor of about 3. With a mixed strategy that allows almost three times more social contacts ($u_l = 0.431$) than the LD case and requires a third of the testing ($r_l = 0.4$) of the TQ case, we see that the disease is controlled in about 5 months. Hence this appears to be the most practical and effective strategy.

3. The expected time for the pandemic to die out would be roughly given by

and so it is important that intervention schemes are implemented early and as strongly as possible.

1. In the weak intervention case, a finite fraction of the population is affected. The intervention succeeds in reducing the fraction that is eventually affected and the peak number of infections, and in considerably delaying the date at which the peak occurs.

Main result (III): We point out that these modified values can be obtained from the following simple expressions in terms of basic disease parameters. One can use these formulas with either the pre-intervention or the post-intervention values of $R_0$ and $\mu$. Assuming that we start with a small seed infected or exposed population and almost the entire population susceptible, i.e., $S(0) \approx N$, the peak value of infections, $I^{(m)}$ (which is proportional to the number of hospitalizations required), and the number of days, $t^{(m)}$, to reach this peak value are given by the simple general relations:

where $\gamma_e$ is an effective recovery rate [see Eq. (23) in Sec. (IV B)] and $c$ is a constant that depends on the initial infected population and other disease parameters. The fraction of the population, $\bar{x} = R(t \to \infty)/N$, that is eventually affected is given by the solution of the equation

$$1 - \bar{x} = e^{-R_0 \bar{x}}, \tag{7}$$

this result being valid for very general SEIR-type models with multiple compartments (see Appendix (A)). These relations are useful: for example, they give good estimates for the typical numbers for peak infections and when they happen (see below). We also provide relations for estimating the number of asymptomatic infected and recovered individuals.

2. We find that the peak infection numbers are smallest for the case with ELD-NTQ and occur at a later stage. Again these results can be understood mathematically from the expressions in Eq. (5) and Eq. (6) using the post-intervention values of γ and µ (from the linear theory).

3. We note that while weak interventions can slow down and reduce the impact of the pandemic, they do not lead to the development of herd immunity in the population (assuming that all the recovered people develop immunity). It is well known that herd immunity is attained when a fraction $1 - 1/R_0$ of the population has developed immunity. Thus herd immunity in the above example would require that $1 - R_0^{-1} \approx 0.74$, i.e., 74% of the population be affected, while Eq. (7) with $R_0^{\rm target} = 1.205$ predicts that only about 31% of the population is affected.
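The two numbers compared in the last item can be reproduced with a small sketch. It assumes that Eq. (7) is the standard final-size relation $1 - \bar{x} = e^{-R_0 \bar{x}}$, and solves it by bisection between the maximum of $f(x) = 1 - x - e^{-R_0 x}$ and $x = 1$; the function names are hypothetical.

```python
import math


def final_fraction(R0, iterations=100):
    """Non-zero solution x of 1 - x = exp(-R0 * x); returns 0.0 when R0 <= 1."""
    if R0 <= 1.0:
        return 0.0
    f = lambda x: 1.0 - x - math.exp(-R0 * x)
    lo = math.log(R0) / R0   # location of the maximum of f, where f > 0
    hi = 1.0                 # f(1) = -exp(-R0) < 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def herd_immunity_threshold(R0):
    """Fraction 1 - 1/R0 that must be immune for herd immunity."""
    return 1.0 - 1.0 / R0
```

Here `herd_immunity_threshold(3.7665)` is close to the 74% quoted above, while `final_fraction(1.205)` comes out at roughly 0.3, in line with the statement that the weakly suppressed epidemic ends far below the herd immunity level of the free dynamics.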

Main result (IV): As already observed the linear theory is very useful to understand the growth and also decay time scales of the pandemic following strong interventions. Another observation that we make is that, independent of initial conditions, the vector describing all the system variables quickly points along the direction of the eigenvector corresponding to the largest eigenvalue. Hence (at such longish times) if we know one variable (or a linear combination), then the full vector is completely specified. This leads to an accurate way of specifying initial conditions for the numerics (from insufficient data). This implies that different initial conditions (such as different seed infections) will only cause a temporal shift of the observed evolution. This means that if we plot data for different countries, starting from the same initial value of say the confirmed number of cases (normalized by the population), we should see a collapse of the data. We test this idea and find that indeed an approximate collapse of data is obtained for a number of countries (see next section).

We do not attempt a detailed comparison of the model predictions with real data, since there are too many poorly known parameters and possibly quite inaccurate knowledge of the initial conditions of the variables themselves. We make some overall qualitative observations relating real data to the predictions from SEIR-type models and find that, in many cases, several broad qualitative features are remarkably well captured by the model (compare with Figs. (2,3,4)). In particular, we see the fast exponential growth phase and then a much slower decay phase for the first six countries, which have succeeded in controlling the disease to varying degrees. On the other hand, we see that India, Brazil and Russia continue to show a positive $\mu$, and it is clear that intervention schemes need to be strengthened.

One issue is that different countries start with different initial conditions (for example, the seed exposed population could be very different between countries). As discussed in Sec. (A 2), as long as the number of confirmed cases is much smaller than the population size, a description in terms of the linearized dynamics is accurate. This would predict an initial exponential growth and then, as intervention schemes begin to operate, the reproductive number and the corresponding growth exponent would decrease until eventually one is able to achieve $R_0^{\rm target} < 1$ and correspondingly $\mu < 0$. In Fig. (7) we show data for the reported number of new cases in 12 different countries and approximately see these features. Most countries have succeeded in disease suppression ($R_0^{\rm target} < 1$), but show a slow exponential decay of the disease. A few Asian countries (India, Pakistan, Indonesia) have not yet entered the decaying phase, which means that intervention has been weak and only disease mitigation has been achieved. This means that, with the same level of intervention strategy, a finite fraction of the population will eventually be affected in these countries. We discuss the Indian situation in some more detail below.

Comparing data across different countries: The linearized SEIR dynamics also predicts (see Sec. (A 2)) that, if one uses similar disease parameters and intervention parameters, then all countries should follow the same trajectory provided they start with the same value for the normalized fraction of confirmed new cases ($D_0/N$). We illustrate this idea, for the extended SEIR dynamics, in Fig. (6), where we show plots of $I(t) = I_a(t) + I_p(t)$ and $C(t) = D_a(t) + U_p(t) + D_p(t)$ for 5 different initial conditions. The right panel shows a collapse of all the trajectories after an appropriate time translation of the different trajectories. Can we see a similar collapse of the real data for different countries (after normalizing by the respective populations and with appropriate time translation of the data)? In the right panel of Fig. (7) we plot the data with this normalization and initial condition and see a rough collapse for several countries. We notice in particular that three of the Asian countries (India, Pakistan, Indonesia) follow a distinctly different trajectory: this could indicate that the disease parameters are different, that the intervention strategies have been different, or that the reporting of cases is inaccurate.

FIG. 5. Number of new cases per day for nine different countries. We note that the first six data sets exhibit the same broad features that we see for the model predictions in Fig. (2,3). In particular we see the fast exponential growth and slow exponential decrease in new cases (following strong interventions). The two countries UK and US show a very slow decay rate, indicating that disease suppression has barely been achieved. The data for India, Brazil and Russia show the behavior corresponding to model predictions in Fig. (4) and have only been able to achieve mitigation so far ($R_0^{\rm target} > 1$, $\mu > 0$). Data from [15].

Predictions for India from extended SEIR model: In the following we make some heuristic predictions, based on the analytic results in Eqs. (5-7) and the present observed data, for daily new cases in India ($N \approx 1.3 \times 10^9$), in the state of Delhi ($N \approx 1.9 \times 10^7$) and in the city of Mumbai ($N \approx 1.3 \times 10^7$). We consider the following choice of parameter values, which appears to be quite typical: $\sigma = 1/2$, $\beta_p = \beta$, $\beta_a = 2\beta/3$, $\gamma_p = \gamma$, $\gamma_a = 3\gamma/2$ (i.e., we assume that asymptomatics are less infectious and recover faster). This gives us [using Eq. (23)] $\gamma_e = \gamma/(1 - \alpha/3)$ and the effective reproductive number $R_0^{\rm eff} = (1 - 5\alpha/9)\beta/\gamma$. From this last relation we can write $\beta = \gamma R_0^{\rm eff}/(1 - 5\alpha/9)$. Plugging this into the equation for the eigenvalues, Eq. (17), and replacing $\lambda$ by the observed mean exponential growth rate $\mu = 0.05$ (the value observed for India since around April 10 [14]), we see that we basically get an equation for $R_0^{\rm eff}$ in terms of $\sigma$, $\gamma$ and $\mu$.
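The inversion described above can be sketched as follows. The sketch assumes that the eigenvalue equation is the characteristic equation of the linearized $(E, I_a, I_p)$ dynamics without interventions, $(\lambda+\sigma)(\lambda+\gamma_a)(\lambda+\gamma_p) = \sigma[\alpha\beta_a(\lambda+\gamma_p) + (1-\alpha)\beta_p(\lambda+\gamma_a)]$, which is linear in $\beta$ once $\beta_a = 2\beta/3$ and $\beta_p = \beta$ are inserted. The function name is hypothetical, and the value $\gamma = 0.2$ used in the usage note is an arbitrary illustrative choice, not a value taken from the paper.

```python
def r_eff_from_growth(mu, sigma, gamma, alpha):
    """Effective reproductive number implied by an observed growth rate mu.

    Uses the parameter choice of the text: gamma_p = gamma, gamma_a = 3*gamma/2,
    beta_p = beta, beta_a = 2*beta/3 (characteristic equation is an assumption).
    """
    gamma_p = gamma
    gamma_a = 1.5 * gamma
    # Solve the characteristic equation at lambda = mu for beta.
    numerator = (mu + sigma) * (mu + gamma_a) * (mu + gamma_p)
    denominator = sigma * (alpha * (2.0 / 3.0) * (mu + gamma_p)
                           + (1.0 - alpha) * (mu + gamma_a))
    beta = numerator / denominator
    return (1.0 - 5.0 * alpha / 9.0) * beta / gamma
```

As a consistency check, `r_eff_from_growth(0.0, sigma, gamma, alpha)` returns 1 for any parameters, since zero growth corresponds to the threshold. With $\mu = 0.05$, $\sigma = 1/2$, $\alpha = 0.67$ and the hypothetical $\gamma = 0.2$, the result is moderately above 1.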

For our analysis we need to know the total infections I(0) on some day and we estimate it on the date April 10 in the following way. Suppose that the daily observed cases on this day was D p (0) (assuming that only the symptomatics are detected). Then we have I p (0) = D p (0)/γ p . From Eq. (24) we have I Tables (I), (II) and (III) for India, Delhi and Mumbai. Note that while the peak numbers and total affected population and deaths simply scale with population size, the time to peak depends on the daily detected numbers on April 10, and this leads to the observed differences in the time to the peak for the three cases.

We point out that the mixed-population assumption of the SEIR model is expected to be more accurate for a smaller population, and so the estimates for Delhi and Mumbai would be more reliable than the one for India. For a big and highly inhomogeneous country like India, smaller regions (states or cities) would have different values of $\mu$ and $R_0$ and also different initial conditions; hence the global values would not capture the local dynamics correctly. It is likely that the numbers in Table (I) are an overestimate of the numbers for the true future trajectory. For the state of Delhi and the city of Mumbai, these should be more accurate. However, we see that the uncertainty in the parameter values, for example the true value of $\alpha$, leads to a huge uncertainty in the predictions.

Definition of the extended SEIR model: We consider a population of size N that is divided into eight compartments:

1. S = Susceptible individuals.

2. E = Exposed but not yet contagious individuals.

3. $I_a$ = Asymptomatic, either develop no symptoms or mild symptoms.

4. $I_p$ = Presymptomatic, those who would eventually develop strong symptoms.

5. $U_a$ = Undetected asymptomatic individuals who have recovered.

6. $D_a$ = Asymptomatic individuals who are detected because of directed testing-quarantining, may have mild symptoms, and would have been placed under home isolation (few in India).

7. $U_p$ = Presymptomatic individuals who are detected at a late stage after they develop serious symptoms and report to hospitals.

8. $D_p$ = Presymptomatic individuals who are detected because of directed testing-quarantining.

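We have the constraint that $N = S + E + I_a + I_p + U_a + D_a + U_p + D_p$. A standard dynamics for the population classes is given by the following set of equations:

$$
\begin{aligned}
\frac{dS}{dt} &= -\frac{u\,(\beta_a I_a + \beta_p I_p)\,S}{N},\\
\frac{dE}{dt} &= \frac{u\,(\beta_a I_a + \beta_p I_p)\,S}{N} - \sigma E,\\
\frac{dI_a}{dt} &= \alpha\sigma E - \gamma_a I_a - r\nu_a I_a,\\
\frac{dI_p}{dt} &= (1-\alpha)\sigma E - \gamma_p I_p - r\nu_p I_p,\\
\frac{dU_a}{dt} &= \gamma_a I_a,\\
\frac{dD_a}{dt} &= r\nu_a I_a,\\
\frac{dU_p}{dt} &= \gamma_p I_p,\\
\frac{dD_p}{dt} &= r\nu_p I_p.
\end{aligned}
$$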

The parameters in the above equation correspond to

• $\alpha$: fraction of asymptomatic carriers.

• $\beta_a$: infectivity of asymptomatic carriers.

• $\beta_p$: infectivity of presymptomatic carriers.

• $\sigma$: transition rate from exposed to infectious.

• $\gamma_a$: transition rate of asymptomatic carriers to recovery or hospitalization.

• $\gamma_p$: transition rate of presymptomatics to recovery or hospitalization.

• $\nu_a$, $\nu_p$: detection probabilities of asymptomatic carriers and symptomatic carriers. Here we choose $\nu_a = 1/3$, $\nu_p = 1/2$.

• $u$: intervention factor due to social distancing (time dependence specified below).

• $r$: intervention factor due to testing-quarantining (time dependence specified below). This is a rate and depends on testing-quarantining rates.

With our definitions, the total number of confirmed cases, $C$, and the number of daily recorded new cases, $D$, would be

$$C = D_a + D_p + U_p, \qquad D = \frac{dC}{dt} = r\nu_a I_a + (\gamma_p + r\nu_p)\, I_p.$$

Note that we include $U_p$ because these are people who are not detected through directed tests but eventually get detected (after $1/\gamma_p$ days) when they get very sick and go to hospitals. On the other hand, the class $D_p$ gets detected because of directed testing, even before its members get very sick.
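A minimal numerical sketch of the eight-compartment dynamics is given below. It assumes the transitions described in the text (infection at rate $u(\beta_a I_a + \beta_p I_p)S/N$, $E \to I_{a,p}$ at rate $\sigma$ split by $\alpha$, removal to $U_{a,p}$ at rates $\gamma_{a,p}$ and to $D_{a,p}$ at rates $r\nu_{a,p}$), constant $u$ and $r$, and a hand-written fourth-order Runge-Kutta step. The names `PARAMS`, `derivs`, `rk4_step` and `simulate`, the step size and the seed of 100 exposed individuals are all illustrative choices.

```python
# Illustrative parameter set quoted in the text (rates in 1/day).
PARAMS = dict(alpha=0.67, beta_a=1/3, beta_p=0.5, sigma=1/3,
              gamma_a=1/8, gamma_p=1/12, nu_a=1/3, nu_p=1/2)


def derivs(x, p, u=1.0, r=0.0):
    """Time derivatives of (S, E, Ia, Ip, Ua, Da, Up, Dp) for constant u and r."""
    S, E, Ia, Ip, Ua, Da, Up, Dp = x
    N = sum(x)
    force = u * (p["beta_a"] * Ia + p["beta_p"] * Ip) * S / N  # new exposures/day
    return [
        -force,
        force - p["sigma"] * E,
        p["alpha"] * p["sigma"] * E - (p["gamma_a"] + r * p["nu_a"]) * Ia,
        (1 - p["alpha"]) * p["sigma"] * E - (p["gamma_p"] + r * p["nu_p"]) * Ip,
        p["gamma_a"] * Ia,       # undetected asymptomatic removals
        r * p["nu_a"] * Ia,      # asymptomatics detected by testing
        p["gamma_p"] * Ip,       # presymptomatics detected late (hospital)
        r * p["nu_p"] * Ip,      # presymptomatics detected by testing
    ]


def rk4_step(x, dt, p, u, r):
    """One classical Runge-Kutta step of size dt."""
    shift = lambda k, h: [xi + h * ki for xi, ki in zip(x, k)]
    k1 = derivs(x, p, u, r)
    k2 = derivs(shift(k1, 0.5 * dt), p, u, r)
    k3 = derivs(shift(k2, 0.5 * dt), p, u, r)
    k4 = derivs(shift(k3, dt), p, u, r)
    return [xi + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def simulate(x0, p, days, dt=0.05, u=1.0, r=0.0):
    """Integrate from the initial state x0 for the given number of days."""
    x = list(x0)
    for _ in range(int(round(days / dt))):
        x = rk4_step(x, dt, p, u, r)
    return x
```

For example, `simulate([1e7 - 100, 100, 0, 0, 0, 0, 0, 0], PARAMS, days=365)` runs the free dynamics for a year; the last four entries of the result sum to the affected population $R$, and passing `u=0.16` shows how a strong, permanent reduction of contacts keeps $R$ to a tiny fraction of $N$.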

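Since at early times $S \approx N$ and all the other populations $E, I_a, I_p, D_a, D_p, U_a, U_p \ll N$, one can linearize the above equations. This tells us about the early-time growth of the pandemic, in particular the exponential growth rate. For the present let us ignore the time dependence of the SD factor $u$ and the TQ factor $r$. As shown in App. (A 2), the system has three non-zero eigenvalues given by the roots of the cubic equation:

$$\lambda^3 + (\tilde\gamma_a + \tilde\gamma_p + \sigma)\,\lambda^2 + \left[\tilde\gamma_a\tilde\gamma_p + (\tilde\gamma_a + \tilde\gamma_p)\,\sigma\,(1 - Q)\right]\lambda + \sigma\,\tilde\gamma_a\tilde\gamma_p\,(1 - R_0) = 0, \tag{17}$$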

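where $\tilde\beta_a = u\beta_a$, $\tilde\beta_p = u\beta_p$, $\tilde\gamma_a = \gamma_a + r\nu_a$, $\tilde\gamma_p = \gamma_p + r\nu_p$, $Q = \alpha\tilde\beta_a/(\tilde\gamma_a + \tilde\gamma_p) + (1-\alpha)\tilde\beta_p/(\tilde\gamma_a + \tilde\gamma_p)$, and

$$R_0 = \frac{\alpha\,\tilde\beta_a}{\tilde\gamma_a} + \frac{(1-\alpha)\,\tilde\beta_p}{\tilde\gamma_p} \tag{18}$$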

is the expected form for the reproductive number for the disease. Noting the fact that $Q < R_0$, it follows that the condition for at least one positive eigenvalue is

$$R_0 > 1.$$

We denote the largest eigenvalue by $\mu$. At early times the number of cases detected would grow as $\sim e^{\mu t}$.

Initial conditions: In Appendix (A 2) we explain the fact that all initial conditions (which satisfy the condition $S(0) \approx N$) will quickly move along the direction of the dominant eigenvector, and so all the trajectories for different initial conditions are identical up to a time translation. We also discuss in Appendix (A 2) how one can choose the correct initial conditions for all the dynamical variables given knowledge of just one of the observed variables (e.g., the number of daily new reported infections).

Let us define the asymptotic populations (i.e., the populations at very long times) in the different compartments as $\bar{U}_a, \bar{D}_a, \bar{U}_p, \bar{D}_p$, and let $\bar{R}_a = \bar{U}_a + \bar{D}_a$, $\bar{R}_p = \bar{U}_p + \bar{D}_p$, $\bar{R} = \bar{R}_a + \bar{R}_p$. The total population that would eventually be affected by the disease (and either recovered or died) is given by $\bar{R}$ and would have developed immunity. A fraction $\bar{U}_a$ (see below) would be undetected and uncounted.

Here for the moment let us assume that $u$ and $r$ do not have any time dependence. As shown in App. (A), the asymptotic fraction $\bar{x} = \bar{R}/N$ is simply given by the solution of the equation

$$1 - \bar{x} = e^{-R_0 \bar{x}}, \tag{20}$$

with R 0 being the reproductive number given by Eq. (18) . We note that Eq. (20) has a non-zero solution only when R 0 > 1. For the simple SIR model this result is well known [13] , here we show that this is valid quite generally. The asymptotic population of the individual populations are then given by
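The final-size condition, taken here in the standard form $1 - \bar x = e^{-R_0 \bar x}$, has no closed-form solution, but it is easy to solve numerically. The following is a small sketch using bisection; the function name `final_size` is an assumption made for illustration. The bracket uses the fact that $f(x) = 1 - x - e^{-R_0 x}$ is concave, vanishes at $x = 0$, and is positive at its maximum $x = \ln R_0 / R_0$ when $R_0 > 1$.

```python
import math


def final_size(r0):
    """Non-zero solution x of 1 - x = exp(-r0 * x); returns 0.0 when r0 <= 1."""
    if r0 <= 1.0:
        return 0.0

    def f(x):
        return 1.0 - x - math.exp(-r0 * x)

    lo = math.log(r0) / r0   # location of the maximum of f, where f > 0
    hi = 1.0                 # f(1) = -exp(-r0) < 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative values of R_0 only.
for r0 in (0.9, 1.5, 2.0, 3.0):
    print(f"R0 = {r0:.1f}: asymptotic affected fraction = {final_size(r0):.4f}")
```

The affected fraction rises steeply once $R_0$ exceeds 1, which is why even weak interventions that lower $R_0$ without bringing it below 1 reduce the asymptotic affected population.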

As shown in App. (B 1) for the SEIR model, the peak value of the infection number ($I = I_a + I_p$) can be found from a heuristic argument and is very accurately given by the formula

$$I^{(m)} = \frac{\sigma}{\sigma + \gamma}\left[1 - \frac{1 + \ln R_0}{R_0}\right] N.$$

We find that this also describes accurately the peak value for the extended SEIR dynamics with γ now replaced by

In Fig. (8) 

An estimate of the time to reach this peak value can be obtained by noting that we can use the linearized dynamics (see previous section) until the time $I(t)$ reaches its peak $I^{(m)}$. Hence we write $I^{(m)} = I(t^{(m)}) = I(0)\,e^{\mu t^{(m)}}$. Since $I^{(m)}$ is proportional to $N$, this gives

$$t^{(m)} = \frac{\ln N}{\mu} + c,$$

where $c$ is a constant that depends on the initial infected population and the parameters. A verification of this result, obtained by solving the basic SEIR equations numerically, is provided in Fig. (8).
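A numerical check of this kind can be sketched as follows for the basic SEIR model, written in population fractions ($N = 1$) and integrated with a fixed-step fourth-order Runge-Kutta scheme. The closed form used for comparison, $I^{(m)}/N = \frac{\sigma}{\sigma+\gamma}\left[1 - (1 + \ln R_0)/R_0\right]$, follows from combining $dI/dt = 0$ with the assumption of App. (B) that $E$ and $I$ peak at about the same time. The function names, step size and parameter values are assumptions chosen for illustration.

```python
import math


def seir_peak(beta, sigma, gamma, i0=1e-6, dt=0.05, t_max=600.0):
    """Integrate SEIR fractions with RK4; return (peak of I, time of the peak)."""
    def rhs(s, e, i):
        new_inf = beta * s * i
        return (-new_inf, new_inf - sigma * e, sigma * e - gamma * i)

    s, e, i = 1.0 - i0, 0.0, i0
    best_i, best_t = i, 0.0
    for n in range(int(t_max / dt)):
        k1 = rhs(s, e, i)
        k2 = rhs(s + 0.5 * dt * k1[0], e + 0.5 * dt * k1[1], i + 0.5 * dt * k1[2])
        k3 = rhs(s + 0.5 * dt * k2[0], e + 0.5 * dt * k2[1], i + 0.5 * dt * k2[2])
        k4 = rhs(s + dt * k3[0], e + dt * k3[1], i + dt * k3[2])
        s += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        e += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        i += dt * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2]) / 6.0
        if i > best_i:
            best_i, best_t = i, (n + 1) * dt
    return best_i, best_t


def heuristic_peak(beta, sigma, gamma):
    """Heuristic peak fraction sigma/(sigma+gamma) * [1 - (1 + ln R0)/R0]."""
    r0 = beta / gamma
    return sigma / (sigma + gamma) * (1.0 - (1.0 + math.log(r0)) / r0)


def growth_rate(beta, sigma, gamma):
    """Largest eigenvalue of the linearized SEIR dynamics."""
    return 0.5 * (-(sigma + gamma) + math.sqrt((sigma - gamma) ** 2 + 4 * beta * sigma))


# Hypothetical parameters (per day): R0 = 3, three-day latency.
beta, sigma, gamma = 0.3, 1 / 3, 0.1
peak, t_peak = seir_peak(beta, sigma, gamma)
print(f"simulated peak fraction: {peak:.4f}, heuristic: {heuristic_peak(beta, sigma, gamma):.4f}")
print(f"time to peak: {t_peak:.1f} days, growth rate mu: {growth_rate(beta, sigma, gamma):.4f}")
```

Because the early growth is exponential with rate $\mu$, increasing the initial infected fraction `i0` by a factor of 10 should bring the peak forward by about $\ln(10)/\mu$ days, which is the time-translation property behind the $\ln N/\mu$ scaling.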

We discuss here the choices of the intervention functions u and r introduced in the dynamical equations in Sec. (IV). Note that u is a dimensionless number quantifying the level of social contacts, while r is a rate which, as we will see, is closely related to the testing rate. 

Social distancing (SD): We multiply the constant factors $\beta_{a,p}$ by the time-dependent function $u(t)$, the "lockdown" function that incorporates the effect of social distancing, i.e. reducing contacts between people. A reasonable form is one where $u(t)$ has the constant value ($= 1$) before the beginning of any interventions, and then from time $t_{\rm on}$ it changes to a value $0 < u_l < 1$ over a characteristic time scale $\sim t_w$. Thus we take a form

The number u l indicates the lowering of social contacts.

Testing-quarantining (TQ): We expect that testing and quarantining will take out individuals from the infectious population and this is captured by the terms rν a I a and rν p I p in the dynamical equations. A reasonable choice for the TQ function is perhaps to take

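where one needs a final rate $r_l > 0$. In general, the time $t'_{\rm on}$ at which TQ begins to be implemented and the time $t'_w$ required for it to be effective could be different from those used for SD. A useful quantity to characterize the system with interventions is the time-dependent effective reproductive number given by

$$R_0^{\rm eff}(t) = \frac{\alpha u(t)\beta_a}{\gamma_a + r(t)\nu_a} + \frac{(1 - \alpha) u(t)\beta_p}{\gamma_p + r(t)\nu_p}.$$

At long times this goes to the targeted reproduction number

$$R_0^{\rm target} = \frac{\alpha u_l\beta_a}{\gamma_a + r_l\nu_a} + \frac{(1 - \alpha) u_l\beta_p}{\gamma_p + r_l\nu_p}.$$

The time scale for the intervention target to be achieved is given by $t_w$ and $t'_w$.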
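The behaviour of the two intervention functions and of the effective reproductive number can be illustrated with a short sketch. The exponential relaxation used below for $u(t)$ and $r(t)$ is an assumption made for this example: it only respects the properties stated in the text ($u = 1$ and $r = 0$ before the interventions start, approaching $u_l$ and $r_l$ over the time scales $t_w$ and $t'_w$). The function names and parameter values are likewise hypothetical.

```python
import math


def u_sd(t, t_on, t_w, u_l):
    """Assumed SD function: 1 before t_on, then relaxing exponentially to u_l."""
    if t < t_on:
        return 1.0
    return u_l + (1.0 - u_l) * math.exp(-(t - t_on) / t_w)


def r_tq(t, t_on, t_w, r_l):
    """Assumed TQ function: 0 before t_on, then relaxing exponentially to r_l."""
    if t < t_on:
        return 0.0
    return r_l * (1.0 - math.exp(-(t - t_on) / t_w))


def r0_eff(t, alpha, beta_a, beta_p, gamma_a, gamma_p, nu_a, nu_p, sd, tq):
    """Effective reproductive number with time-dependent u(t) and r(t)."""
    u = u_sd(t, *sd)
    r = r_tq(t, *tq)
    return (alpha * u * beta_a / (gamma_a + r * nu_a)
            + (1.0 - alpha) * u * beta_p / (gamma_p + r * nu_p))


# Hypothetical values: both interventions start on day 20 and act over 5 days.
disease = dict(alpha=0.67, beta_a=0.3, beta_p=0.3, gamma_a=0.1, gamma_p=0.1,
               nu_a=1.0, nu_p=1.0)
sd = (20.0, 5.0, 0.5)    # (t_on, t_w, u_l)
tq = (20.0, 5.0, 0.1)    # (t'_on, t'_w, r_l)
for day in (0, 20, 25, 40, 100):
    print(day, round(r0_eff(day, sd=sd, tq=tq, **disease), 3))
```

With these assumed numbers the effective reproductive number falls from 3 to a target of 0.75, with SD halving the numerator and TQ doubling the removal rates.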

Relation of the TQ function r(t) to the number of tests done per day:

Let us suppose that the number of tests per person per day is given by T r . We show in Fig. (9) the data for the number of tests per 1000 people per day across a set of countries and see that this is around 0.05 for India which means that T r = 0.00005. If tests are done completely randomly, then the number of detected people (assuming that the tests are perfect) would be T r × I and so it is clear that we can identify r(t) = T r (t). It is then clear that this would have no effect on the pandemic control.

To have any effect we would need $r \sim \gamma_p \approx 0.1$ per day, which means around 100 tests per 1000 people per day; this is clearly not practical.
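The arithmetic behind this estimate can be spelled out in a few lines. The numbers are the ones quoted in the text (about 0.05 tests per 1000 people per day for India, and $\gamma_p \approx 0.1$ per day); the variable names are chosen here for illustration.

```python
# Random testing: the removal rate r equals the per-person daily testing rate T_r.
tests_per_1000_per_day = 0.05              # value quoted in the text for India
T_r = tests_per_1000_per_day / 1000.0      # tests per person per day

gamma_p = 0.1                              # per day; r must be comparable to this
required_per_1000_per_day = gamma_p * 1000.0

shortfall_factor = required_per_1000_per_day / tests_per_1000_per_day
print(f"T_r = {T_r:.5f} per person per day")
print(f"required: about {required_per_1000_per_day:.0f} tests per 1000 people per day")
print(f"random testing falls short by a factor of about {shortfall_factor:.0f}")
```

The gap of roughly three orders of magnitude is what motivates testing focused on contacts rather than random testing.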

However, a better strategy is to do focused tests on the contacts of all those who have been detected on a given day. In our extended SEIR model the number of detected cases per day is given by D(t) = rν a I a + (γ p + rν p )I p . Then, the number of contacts of these individuals would be AD(t) where A is the number of contacts one infected person made. A good assumption is to say that the infected people are from this pool. Hence, if we conduct a total of T = N T r tests per day on only this set of people, then the number of detected cases per day (through contact tracing) would be

Hence we see that TQ intervention can be successfully implemented if we can achieve

where we identify $r(t)$ as our control rate function that changes from the value 0 to a value $r_l \approx \gamma_p$ over a time scale of a week or so. This means that we need $T(t) \sim A D(t)$. The implications of this are discussed after Eq. (1). In Fig. (9) we show data for daily new tests for a set of countries. A noteworthy case is the data for South Korea, where we see a large testing rate in the early days of the pandemic. Perhaps this explains the quick control of the pandemic in that country. The table in Fig. (10) shows data for the ratio $T(t)/D(t)$ for a set of countries and also how this ratio has evolved over time.

As discussed in Sec. (II) the number A is expected to depend on the population density and also how well SD is being implemented and hence, for a country like India T (t)/D(t) ≈ 25 may not be sufficient.

A modified version of the SEIR model, incorporating asymptomatic individuals, was used for analyzing the effectiveness of different intervention protocols in controlling the growth of the Covid-19 pandemic. Non-clinical interventions can be either through social distancing or through testing-quarantining. Our results indicate that a combination of both, implemented over an extended period, may be the most effective and practical strategy. We point out that short-term lockdowns cannot stop a recurrence of the pandemic if interventions are completely relaxed, and developing herd immunity is not a practical solution either, since this would affect a very large fraction of the population.

We have provided numerical examples to illustrate the basic ideas and in addition have stated a number of analytical results which can be useful in making empirical estimates of various important quantities that provide information on the disease progression. Looking at real data for new Covid-19 cases in several countries, we find that the SEIR model captures some important qualitative features and hence could provide guidance in policymaking. We use our analytic formulas to make predictions for disease peak numbers and expected time to peak for India, the state of Delhi and the city of Mumbai, but point out that these predictions could be incorrect for India (due to big inhomogeneity in disease progression across the country) and perhaps more reliable for the cases of Delhi and Mumbai. In general we believe that our formulas are easy to use and give quick heuristic estimates on disease progression, which would be reliable when applied to local populations (in towns, cities and perhaps smaller countries). Lack of precise knowledge of the disease parameters (e.g the fraction of asymptomatic carriers) of course leads to rather large uncertainties in the predictions.

We thank Jitendra Kethepalli and Kanaya Malakar for very helpful discussions and Ranjini Bandyopadhyay, Siddhartha Chatterjee, Joel Lebowitz and Sriram Shastry for a careful reading of the draft and for making useful suggestions. We acknowledge the support of the Department of Atomic Energy, Government of India, under project no.12-R&D-TFR-5.10-1100.

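where $\sigma = \sum_{i=1}^n \sigma_i$. Let us assume that $R_i(0) = 0$ for all $i$, and $S(0) \approx N$. Then solving Eq. (A1), we get

$$S(t) = N \exp\left[-\frac{1}{N}\sum_{i=1}^n \beta_i \int_0^t I_i(t')\,dt'\right].$$

Multiplying Eqs. (A4) by $\beta_i/\gamma_i$, summing over $i$ and integrating over time from $0$ to $\infty$, we get

$$\sum_{i=1}^n \frac{\beta_i}{\gamma_i}\bar R_i = \sum_{i=1}^n \beta_i \int_0^\infty I_i(t)\,dt = -N \ln\frac{\bar S}{N},$$

where $\bar S$ and $\bar R_i$ denote the asymptotic values of $S$ and $R_i$.

Next we note that $(d/dt)(I_i + R_i) = \sigma_i E$. Hence for the initial condition $I_i = R_i = 0$ we find that the ratio $[I_i(t) + R_i(t)]/[I_j(t) + R_j(t)] = \sigma_i/\sigma_j$ at all times. Since at large times $I_i \to 0$, this means that the asymptotic values of the $R_i$ are given by

$$\bar R_i = \frac{\sigma_i}{\sigma}\bar R, \qquad \bar R = \sum_{i=1}^n \bar R_i.$$

Using this in Eq. (A6), noting that $\bar S + \bar R = N$ and defining $x = \bar R/N$, we then get the following simple equation that determines the asymptotic total affected population:

$$1 - x = e^{-R_0 x},$$

where

$$R_0 = \sum_{i=1}^n \frac{\sigma_i \beta_i}{\sigma \gamma_i}$$

is the reproductive number.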

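We now again focus on the special case with the $n = 8$ variable dynamics described by Eqs. (8) to (15). Let us denote the variables by $x_1 = S - N$, $x_2 = E$, $x_3 = I_a$, $x_4 = I_p$, $x_5 = U_a$, $x_6 = D_a$, $x_7 = U_p$, $x_8 = D_p$. At early times, when $x_i \ll N$, the dynamics is captured by the linear equations

$$\dot x_1 = -(\tilde\beta_a x_3 + \tilde\beta_p x_4), \quad \dot x_2 = \tilde\beta_a x_3 + \tilde\beta_p x_4 - \sigma x_2, \quad \dot x_3 = \alpha\sigma x_2 - \tilde\gamma_a x_3, \quad \dot x_4 = (1 - \alpha)\sigma x_2 - \tilde\gamma_p x_4,$$

$$\dot x_5 = \gamma_a x_3, \quad \dot x_6 = r\nu_a x_3, \quad \dot x_7 = \gamma_p x_4, \quad \dot x_8 = r\nu_p x_4,$$

where $\tilde\beta_a = u\beta_a$, $\tilde\beta_p = u\beta_p$, $\tilde\gamma_a = \gamma_a + r\nu_a$, $\tilde\gamma_p = \gamma_p + r\nu_p$. This system has 5 zero eigenvalues, while the remaining 3 are given by the roots of the cubic equation for $\lambda$:

$$(\lambda + \sigma)(\lambda + \tilde\gamma_a)(\lambda + \tilde\gamma_p) - \sigma\left[\alpha\tilde\beta_a(\lambda + \tilde\gamma_p) + (1 - \alpha)\tilde\beta_p(\lambda + \tilde\gamma_a)\right] = 0.$$

This can be written in the form

$$\lambda^3 + (\tilde\gamma_a + \tilde\gamma_p + \sigma)\lambda^2 + \left[\tilde\gamma_a\tilde\gamma_p + \sigma(\tilde\gamma_a + \tilde\gamma_p)(1 - Q)\right]\lambda + \sigma\tilde\gamma_a\tilde\gamma_p(1 - R_0) = 0, \tag{A15}$$

where

$$Q = \frac{\alpha\tilde\beta_a + (1 - \alpha)\tilde\beta_p}{\tilde\gamma_a + \tilde\gamma_p}$$

and

$$R_0 = \frac{\alpha\tilde\beta_a}{\tilde\gamma_a} + \frac{(1 - \alpha)\tilde\beta_p}{\tilde\gamma_p}.$$

We identify $R_0$ with the reproductive number of the disease. Noting the fact that $Q < R_0$, it is easy to prove that the necessary condition for at least one positive eigenvalue is

$$R_0 > 1.$$

Let us denote the largest eigenvalue by $\mu$. For $R_0 \approx 1$, we expect that the largest eigenvalue is close to zero, and from Eq. (A15) we can read off the value as

$$\mu \approx \frac{\sigma\tilde\gamma_a\tilde\gamma_p\,(R_0 - 1)}{\tilde\gamma_a\tilde\gamma_p + \sigma(\tilde\gamma_a + \tilde\gamma_p)(1 - Q)}.$$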

A way to choose correct initial conditions from knowledge of one variable (e.g. confirmed cases) at an early time: We denote the right and left eigenvectors corresponding to an eigenvalue $\lambda_q$ by $\phi_q(i)$ and $\chi_q(i)$ respectively, and those corresponding to the largest eigenvalue $\mu$ by $\phi_m(i)$ and $\chi_m(i)$. The time evolution of the vector $X = (x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8)$ is given by

$$\begin{aligned} X(t) &= \sum_q e^{\lambda_q t}\,\phi_q\,[\chi_q \cdot X(0)] \\ &\approx e^{\mu t}\,\phi_m\,[\chi_m \cdot X(0)], \end{aligned}$$

where the last line is true at sufficiently large times, when only the eigenvalue $\mu$ dominates. Let us consider the initial condition $X = (-\epsilon, 0, 0, \epsilon, 0, 0, 0, 0)$ so that (noting that $\chi_m(1) = 0$)

$$x_i(t) \approx \epsilon\,\phi_m(i)\chi_m(4)\,e^{\mu t} = \epsilon\,a_i\,e^{\mu t},$$

where $a_i = \phi_m(i)\chi_m(4)$. At a sufficiently large time $t_l$ (but still in the very early phase of the pandemic) we equate the observed confirmed number $C_0$ on some day to $x_6(t_l) + x_7(t_l) + x_8(t_l)$, which therefore gives us the relation

$$\epsilon\,e^{\mu t_l} = \frac{C_0}{a_6 + a_7 + a_8}.$$

This then tells us that we should start with the following initial conditions, counting now time from $t = 0$:

$$x_i(0) = \frac{a_i}{a_6 + a_7 + a_8}\,C_0.$$

The crucial point is that the leading eigenvector fixes the direction of the growth, and then knowledge of one linear combination fixes all the other coordinates.
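The procedure can be sketched in code as follows. Instead of diagonalizing the full $8 \times 8$ matrix, the sketch first finds $\mu$ by bisection on the characteristic equation of the $(E, I_a, I_p)$ block and then builds the dominant right eigenvector component by component from the linearized equations; the overall scale is fixed by the observed confirmed count $C_0 = D_a + U_p + D_p$. The linearized equations used are the ones implied by the definitions in the text, and the function names and parameter values are assumptions for illustration.

```python
def growth_rate(alpha, sigma, ba, bp, ga, gp):
    """Largest real root of (lam+sigma) = sigma*[alpha*ba/(lam+ga) + (1-alpha)*bp/(lam+gp)]."""
    def g(lam):
        return (lam + sigma) - sigma * (alpha * ba / (lam + ga)
                                        + (1.0 - alpha) * bp / (lam + gp))
    lo, hi = -min(ga, gp) * (1.0 - 1e-12), 1.0
    while g(hi) < 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) < 0.0 else (lo, mid)
    return 0.5 * (lo + hi)


def initial_conditions(confirmed, alpha, sigma, beta_a, beta_p, gamma_a, gamma_p,
                       u=1.0, r=0.0, nu_a=0.0, nu_p=0.0):
    """Scale the dominant eigenvector so that D_a + U_p + D_p equals `confirmed`."""
    ba, bp = u * beta_a, u * beta_p
    ga, gp = gamma_a + r * nu_a, gamma_p + r * nu_p
    mu = growth_rate(alpha, sigma, ba, bp, ga, gp)
    if mu <= 0.0:
        raise ValueError("the construction assumes a growing epidemic (mu > 0)")
    e = 1.0                                    # arbitrary normalization, rescaled below
    i_a = alpha * sigma * e / (mu + ga)        # from mu*I_a = alpha*sigma*E - ga*I_a
    i_p = (1.0 - alpha) * sigma * e / (mu + gp)
    vec = {
        "S-N": -(ba * i_a + bp * i_p) / mu,
        "E": e, "I_a": i_a, "I_p": i_p,
        "U_a": gamma_a * i_a / mu, "D_a": r * nu_a * i_a / mu,
        "U_p": gamma_p * i_p / mu, "D_p": r * nu_p * i_p / mu,
    }
    scale = confirmed / (vec["D_a"] + vec["U_p"] + vec["D_p"])
    return mu, {name: value * scale for name, value in vec.items()}


# Hypothetical parameters; 100 confirmed cases observed on the chosen day.
mu, x0 = initial_conditions(100.0, alpha=0.67, sigma=1 / 3, beta_a=0.3, beta_p=0.3,
                            gamma_a=0.1, gamma_p=0.1, r=0.05, nu_a=1.0, nu_p=1.0)
for name, value in x0.items():
    print(f"{name:>4s}: {value:10.2f}")
```

Starting a simulation from `x0` places the system directly on the dominant growth direction, so there is no transient before the exponential phase.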

2. E = Number of Exposed but not yet contagious individuals.

3. I = Number of Infected contagious individuals.

4. R = Number of Recovered, hospitalized or dead individuals.

The dynamics of this model can be described as follows:

• The infected individuals, I, come in contact with the susceptible population, S, and cause transitions S → E.

• People who are Exposed carry the virus, do not yet show symptoms and cannot infect others.

• After a latency period $T_L$ the Exposed people become Infected and can now infect others, so E → I happens at a rate $\sigma = 1/T_L$. These people could be either symptomatic or asymptomatic, and their infections are as yet undetected.

• We assume that infected people typically either recover or are detected after $T_R$ days, so I → R happens at a rate $\gamma = 1/T_R$.

We then have the following equations for the dynamics of the system:

$$\frac{dS}{dt} = -\frac{\beta S I}{N}, \tag{B1}$$

$$\frac{dE}{dt} = \frac{\beta S I}{N} - \sigma E, \tag{B2}$$

$$\frac{dI}{dt} = \sigma E - \gamma I, \tag{B3}$$

$$\frac{dR}{dt} = \gamma I. \tag{B4}$$

In this case the reproductive number is simply given by $R_0 = \beta/\gamma$.

To determine the fraction of the population that would be affected finally if there was no intervention, we first note from Eqs. (B1, B4) that

$$\frac{dS}{dR} = -\frac{R_0}{N}\,S, \quad \text{so that} \quad S(t) = N e^{-R_0 R(t)/N}.$$

At long times $E, I \to 0$, so that $S + R = N$ and the asymptotic fraction $\bar x = \bar R/N$ satisfies $1 - \bar x = e^{-R_0 \bar x}$.

We now evaluate the peak value $I_{\max}$ of the infected population in the course of the outbreak. We first note that equation (B6) allows us to express the susceptible population at any time $t$ as a function of $R(t)$. In fact, one can express all the other populations in terms of $R(t)$ or its time derivatives, such as

$$S = N e^{-R_0 x}, \quad I = \frac{N}{\gamma}\frac{dx}{dt}, \quad E = \frac{N}{\gamma\sigma}\left(\frac{d^2x}{dt^2} + \gamma\frac{dx}{dt}\right),$$

where $x = R/N$. Defining $v = dx/dt = \gamma I/N$, we see that the four-dimensional SEIR dynamics is equivalent to a two-dimensional dynamical system specified by the equations

$$\frac{dx}{dt} = v, \qquad \frac{dv}{dt} = -(\gamma + \sigma)\,v + F(x).$$

The above equation resembles a damped oscillator constrained to move on the positive half line in a potential $U(x) = \gamma\sigma(x^2/2 - x) - (\gamma^2\sigma/\beta)\,e^{-R_0 x}$, so that $F(x) = -U'(x) = -\gamma\sigma(x + e^{-R_0 x} - 1)$. The nontrivial fixed point, which is the steady state, is given by the zero of $F(x)$, as already obtained in the earlier section. On the other hand, the peak of the infected population is given by setting $dI/dt = 0$ or $dv/dt = 0$, which implies $v^{(m)} = -\frac{\gamma\sigma}{\gamma + \sigma}\left(x^{(m)} + e^{-R_0 x^{(m)}} - 1\right)$, where $x^{(m)}, v^{(m)}$ denote the values of $x, v$ at the time when $I$ peaks. To determine $(x^{(m)}, v^{(m)})$ we need another equation.

Early dynamics of transmission and control of COVID-19: a mathematical modelling study

Impact of nonpharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand

An updated estimation of the risk of transmission of the novel coronavirus (2019-nCov)

Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy

Spread and dynamics of the COVID-19 epidemic in Italy: Effects of emergency containment measures, Proceedings of the National Academy of Sciences

May 9 (black bar) and (right) the change over time of this ratio. Data from [16]

fects of a historic national lockdown in India's response to the COVID-19 pandemic: data science call to arms

Modeling and forecasting of the COVID-19 pandemic in India

INDSCI-SIM: A state-level epidemiological model for India

COVID-19 Epidemic: Unlocking the lockdown in India (working paper)

Alternating quarantine for sustainable mitigation of COVID-19

The reproductive number of COVID-19 is higher compared to SARS coronavirus

Who is the infector? Epidemic models with symptomatic and asymptomatic cases

Covid-19 data analysis

neering Coronavirus COVID-19 Global Cases

COVID-19) Testing

Appendix A: Extended SEIR model

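Let us consider a more general form of the SEIR equations, with $n$ compartments $I_1, I_2, \ldots, I_n$ for the infectious population, $n$ compartments $R_1, R_2, \ldots, R_n$ for the recovered population, and the two remaining compartments $S$ and $E$, with the following dynamics:

$$\frac{dS}{dt} = -\sum_{i=1}^n \frac{\beta_i I_i S}{N}, \tag{A1}$$

$$\frac{dE}{dt} = \sum_{i=1}^n \frac{\beta_i I_i S}{N} - \sigma E, \tag{A2}$$

$$\frac{dI_i}{dt} = \sigma_i E - \gamma_i I_i, \tag{A3}$$

$$\frac{dR_i}{dt} = \gamma_i I_i. \tag{A4}$$

Appendix B: Analysis of the basic SEIR model

In the standard SEIR model one divides a population of size N into four compartments:

1. S = Number of Susceptible individuals.

A second equation relating $x^{(m)}$ and $v^{(m)}$ could be obtained, for example, from a solution of the equation for $dv/dx$. This is difficult to calculate exactly. However, we can obtain a second equation if we make the reasonable assumption that $I$ and $E$ peak at around the same time, which simply gives

$$\frac{S^{(m)}}{N} = e^{-R_0 x^{(m)}} = \frac{1}{R_0}, \quad \text{i.e.} \quad x^{(m)} = \frac{\ln R_0}{R_0}.$$

Then using the overall constraint $N = S + E + I + R$ we finally obtain

$$I^{(m)} = \frac{\sigma}{\sigma + \gamma}\left[1 - \frac{1 + \ln R_0}{R_0}\right] N.$$

An estimate of the time to reach this peak value can be obtained by noting that we can use the linearized dynamics (see previous section) until the time $I(t)$ reaches its peak $I^{(m)}$. Hence we write

$$I^{(m)} = I(t^{(m)}) = I(0)\,e^{\mu t^{(m)}}.$$

Hence

$$t^{(m)} = \frac{\ln N}{\mu} + c,$$

where $c$ is a constant that depends on initial infection numbers and parameter values.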

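To get the growth in the early-time regime, let us define the variables $S = N + s$, $E = e$, $I = i$, $R = r$. Inserting these in Eqs. (B1, B2, B3) and (B4), and then expanding the right-hand side of each equation to linear order, we get

$$\frac{d}{dt}\begin{pmatrix} s \\ e \\ i \\ r \end{pmatrix} = M \begin{pmatrix} s \\ e \\ i \\ r \end{pmatrix}, \qquad M = \begin{pmatrix} 0 & 0 & -\beta & 0 \\ 0 & -\sigma & \beta & 0 \\ 0 & \sigma & -\gamma & 0 \\ 0 & 0 & \gamma & 0 \end{pmatrix}.$$

This set of linear equations can be solved by diagonalizing the matrix $M$. It has eigenvalues

$$\lambda_1 = \lambda_2 = 0, \qquad \lambda_{3,4} = \frac{-(\sigma + \gamma) \mp \sqrt{(\sigma - \gamma)^2 + 4\beta\sigma}}{2}.$$

Let us denote the right and left eigenvectors corresponding to the eigenvalue $\lambda_q$ by $\phi_q(i)$ and $\chi_q(i)$ respectively. The right eigenvectors are given by $\phi_1 = (0, 0, 0, 1)$, $\phi_2 = (1, 0, 0, 0)$, $\phi_3 = (-\beta/\gamma, \lambda_3(\lambda_3 + \gamma)/(\sigma\gamma), \lambda_3/\gamma, 1)$, $\phi_4 = (-\beta/\gamma, \lambda_4(\lambda_4 + \gamma)/(\sigma\gamma), \lambda_4/\gamma, 1)$.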

We denote the largest eigenvalue by $\lambda_4 \equiv \mu = [-(\sigma + \gamma) + \sqrt{(\sigma - \gamma)^2 + 4\beta\sigma}]/2$, and it is easy to see that this is positive for $\beta/\gamma = R_0 > 1$.

It is instructive to examine the structure of $\mu$ near $R_0 = 1$. For this we rewrite it in the form

$$\mu = \frac{\sigma + \gamma}{2}\left[\sqrt{1 + \frac{4\sigma\gamma\,(R_0 - 1)}{(\sigma + \gamma)^2}} - 1\right] \approx \frac{\sigma\gamma}{\sigma + \gamma}\,(R_0 - 1) \quad \text{for } R_0 \approx 1.$$

One qualitative aspect that this equation tells us is the following. Suppose we start with free parameters $\beta, \gamma$ such that $R_0 = 1.8$ and want to change (through interventions) the reproductive number to a target value $R_0^{\rm target} = R_0/2 = 0.9$. We can do this either (a) by decreasing $\beta$ to $\beta' = \beta/2$ or (b) by increasing $\gamma$ to a value $\gamma' = 2\gamma$. It is clear from the above expression that (b) would lead to a negative eigenvalue of larger magnitude and so a faster decay of the disease.
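The comparison between the two routes to the same target can be checked directly from the closed form for $\mu$. In the sketch below, the value $\sigma = 1/3$ per day and the specific $\beta$ and $\gamma$ are illustrative assumptions, chosen only so that $R_0 = 1.8$ before intervention.

```python
import math


def seir_growth_rate(beta, sigma, gamma):
    """Largest eigenvalue of the linearized basic SEIR dynamics."""
    return 0.5 * (-(sigma + gamma) + math.sqrt((sigma - gamma) ** 2 + 4.0 * beta * sigma))


# Hypothetical parameters (per day) giving R0 = beta/gamma = 1.8.
beta, sigma, gamma = 0.18, 1.0 / 3.0, 0.1

mu_before = seir_growth_rate(beta, sigma, gamma)
mu_halved_beta = seir_growth_rate(beta / 2.0, sigma, gamma)      # route (a): SD-like
mu_doubled_gamma = seir_growth_rate(beta, sigma, 2.0 * gamma)    # route (b): TQ-like

print(f"no intervention : mu = {mu_before:+.4f}")
print(f"(a) beta -> beta/2  : mu = {mu_halved_beta:+.4f}")
print(f"(b) gamma -> 2*gamma: mu = {mu_doubled_gamma:+.4f}")
```

Both routes give $R_0 = 0.9$ and a negative $\mu$, but for these assumed values the decay rate in case (b) is larger in magnitude, consistent with the statement that TQ-like interventions lead to a faster decay for the same target.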