key: cord-0448556-4la6bpsf authors: Mondal, Dipankar; Chakrabarty, Siddhartha P. title: Did the lockdown curb the spread of COVID-19 infection rate in India: A data-driven analysis date: 2020-06-22 journal: nan DOI: nan sha: 8ed6c15a22ae9a5f19ac8d947c9098390a222e99 doc_id: 448556 cord_uid: 4la6bpsf

In order to analyze the effectiveness of the three successive nationwide lockdowns enforced in India, we present a data-driven analysis of four key objectives: reducing the transmission rate, restraining the growth rate, flattening the epidemic curve and improving the health care system. These were quantified through four different metrics, namely, reproduction rate, growth rate, doubling time and death to recovery ratio. The incidence data of the COVID-19 outbreak in India (for the period of 2nd March 2020 to 31st May 2020) was analyzed for the best fit to the epidemic curve, making use of the exponential growth, the maximum likelihood estimation, the sequential Bayesian method and the estimation of time-dependent reproduction. The best fit (based on the data considered) was for the time-dependent approach. Accordingly, this approach was used to assess the impact on the effective reproduction rate. The period from pre-lockdown to the end of lockdown 3 saw a 45% reduction in the effective reproduction rate. During the same period, the growth rate reduced from 393% during the pre-lockdown to 33% after lockdown 3, accompanied by the average doubling time increasing from 4-6 days to 12-14 days. Finally, the death-to-recovery ratio dropped from 0.28 (pre-lockdown) to 0.08 after lockdown 3. In conclusion, all four metrics considered to assess the effectiveness of the lockdown exhibited significant favourable changes from the pre-lockdown period to the end of lockdown 3. Analysis of the data in the post-lockdown period with these metrics will provide greater clarity with regard to the extent of the success of the lockdown.
As of 5th June 2020, the coronavirus disease 2019 (COVID-19), with its epicenter in Wuhan, China [1], has resulted in more than 6.5 million confirmed cases and 387,155 casualties [2]. The global pandemic resulting from COVID-19 was preceded by two other outbreaks of human coronavirus in the 21st century itself, namely, severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV) infections [3]. The possible sources of transmission of the COVID-19 outbreak include (but are not limited to) animals, human-to-human contact and intermediate animal vectors [3]. The index case for the COVID-19 outbreak in India was reported on 30th January 2020, in an individual with a travel history from Wuhan, China [4]. The data available on [4] suggests that, during the early stages, the COVID-19 positive cases in India were limited to individuals with a travel history involving the global hotspots of the outbreak. However, cases were subsequently detected in individuals who neither had a travel history involving the global hotspots, nor had any contact with individuals who were already infected, which indicated the possibility of community outbreak. This resulted in the Government of India announcing a lockdown across the country, driven by the necessity of ensuring that the social distancing norms are strictly observed. While the lockdown was not the only response to the pandemic, it was a crucial step towards curbing the growth of COVID-19 in densely populated countries like India. Given the concurrent economic cost of the lockdown, it is even more critical, from the epidemiological as well as the economic perspective, to assess its effectiveness. This paper presents a data-driven analysis to examine the effectiveness of the lockdown, with an emphasis on the question of whether the lockdown succeeded in curbing the intensity of the COVID-19 spread rate in India.
In order to answer this, we empirically analyze four different metrics, namely, reproduction number, growth rate, doubling time and death to recovery ratio, which quantify the transmission rate, the growth rate, the curvature of the epidemic curve and the improvement of health care capacity, respectively.

We now give a brief summary of some of the available literature on quantitative approaches to the modeling of transmission of the COVID-19 outbreak. A model driven by a system of ordinary differential equations (ODEs) for the phasic transmission of COVID-19 was analyzed in [5] to calculate the transmissibility of the virus. Kucharski et al. [1] considered a stochastic transmission model on the data for cases in Wuhan, China (including cases that originated there) to estimate the likelihood of the outbreak taking place in other geographical locations. A literature survey by Liu et al. [6] summarized that the reproductive number (and hence the infectivity) of COVID-19 exceeded that of SARS. A Monte-Carlo simulation approach to assess the impact of the COVID-19 pandemic in India was carried out in [7]. In carrying out the mathematical and statistical modeling of COVID-19, it is helpful to refer to the quantitative models analyzed for the two preceding outbreaks of human coronavirus, namely SARS and MERS. In [8], a network model was analyzed to identify localized hotbeds, as well as super-spreaders, for SARS. Constrained by somewhat limited availability of data, a simple compartment model was used in [9] for in-silico predictive analysis of the SARS outbreak in Beijing, China. Yan and Zou [10] determined the optimal and sub-optimal strategies for quarantine and isolation in the case of SARS. A predictive model in [11], on imported cases of MERS, was used to ascertain the likelihood of a MERS diagnosis during the time window between immigration and onset of the disease.
The trajectory of the MERS outbreak was calibrated to a dynamic model in [12], with the goal of studying the role of time in implementing the control measures.

A key identifier for the transmissibility of epidemiological diseases such as COVID-19 is the basic reproduction number $R_0$, which is defined as the average number of secondary infections resulting from an infected case in a population in which all members are susceptible. Accordingly, we seek to estimate the data-driven value of $R_0$ for the outbreak of COVID-19 in India. Further, we also seek to determine the time-dependent reproduction number $R_t$, for better clarity on the time-variability of the reproduction number, particularly its dynamics during the phases of the nationwide lockdown in India. In addition, we also estimate and analyze the statistical performance of growth rate, doubling time and death to recovery ratio.

The paper is organized as follows. In Section 2, we detail the source of the data as well as the statistical approaches used for the estimation of $R_0$ and $R_t$. This is followed by the discussion of the results for the outbreak in India, in Section 3. In Section 4, we present the data-driven analysis of the impact of the lockdown. Finally, in the concluding remarks in Section 5, we highlight the main takeaways of this analysis.

The incidence data used for the analysis reported in this paper was obtained from the website of India COVID 19 Tracker [4], and used for the estimation of $R_0$. This estimation was carried out making use of the R0 package [13] of the statistical package R. The standardized approach included in the R0 package comprises the implementation of the Exponential Growth (EG), Maximum Likelihood (ML) estimation, Sequential Bayesian (SB) method and estimation of time-dependent reproduction (TD) numbers, used during the H1N1 pandemic of 2009.
The package is designed for the estimation of both the "initial" reproduction number and the "time-dependent" reproduction number. Accordingly, we present a brief summary of the four approaches used in the paper.

1. Exponential Growth (EG): As observed in [14], the reproduction number can be indirectly estimated from the rate of exponential growth. The authors of [14] attribute the disparity between the estimates obtained from different differential equation models to the assumptions made about the shape of the generation interval distribution. Accordingly, the choice of the model used for the estimation of the reproduction number is driven by the shape of the generation interval distribution. Under the assumption that every generation interval is equal to the mean, the authors obtain an upper bound on the possible range of values of the reproduction number for an observed rate of exponential growth, which corresponds to the worst case scenario for the reproduction number. Let the function $g(a)$ represent the generation interval distribution. If the moment generating function $M(z)$ of $g(a)$ is given by $M(z) = \int_0^{\infty} e^{za} g(a)\,da$, then the reproduction number is given by $R = \frac{1}{M(-r)}$, where $r$ is the exponential growth rate, provided that $M(-r)$ exists. In particular, the Poisson distribution can be used in the analysis of the integer valued incidence data [15, 16], for a (discretized) generation time distribution. An important caveat is that this approach is applicable only to the time window in which the incidence data is observed to be exponential [13].

2. Maximum Likelihood (ML) estimation: The maximum likelihood model proposed in [17] is based on the availability of incidence data $N_0, N_1, \dots, N_T$, with $N_t$, $t = 0, 1, 2, \dots, T$, denoting the count of new cases at time $t$. In practice, we take the index $t$ in days, while noting that this indexing is applicable to other lengths of time intervals.
This approach is driven by the assumption that the Poisson distribution models the number of secondary infections from an index case, with the average providing the estimate for the basic reproduction number. If we denote the number of observed incidences for consecutive time intervals by $n_1, n_2, \dots, n_T$ and let $p_i$ denote the probability of the serial interval of a case being $i$ days (which can be estimated a priori), then the likelihood function is the thinned Poisson:

$$L(R, p) = \prod_{t=1}^{T} \frac{e^{-\mu_t} \mu_t^{n_t}}{n_t!},$$

where $\mu_t = R \sum_{i=1}^{\min(k,t)} n_{t-i}\, p_i$ and $p = (p_1, p_2, \dots, p_k)$. The absence of data from the index case can lead to an overestimation of the initial reproduction number, and accordingly a correction needs to be implemented [13].

3. Sequential Bayesian (SB) method: A SIR model driven sequential estimation of the initial reproduction number was carried out by the sequential Bayesian method in [18]. It is based on the Poisson distribution driven estimate of the incidence $n_{t+1}$ at time $t + 1$, with mean $n_t e^{\gamma(R-1)}$. In particular, the probability distribution for the reproduction number $R$, based on the observed temporal data, is given by

$$P[R \mid n_1, \dots, n_{t+1}] = \frac{P[n_{t+1} \mid R, n_1, \dots, n_t]\; P[R \mid n_1, \dots, n_t]}{P[n_1, \dots, n_{t+1}]},$$

where $P[R \mid n_1, \dots, n_t]$ is the prior distribution of $R$ and $P[n_1, n_2, \dots, n_{t+1}]$ is independent of $R$.

4. Time-Dependent (TD) method: The TD method is amenable to the computation of the reproduction numbers through the averaging over all networks of transmission, based on the observed data [19]. Let $i$ and $j$ be two cases, with the respective times of onset of symptoms being $t_i$ and $t_j$. Further, let $p_{ij}$ denote the probability of $i$ being infected by $j$. If $g(a)$ denotes the distribution of the generation interval, then $p_{ij} = \frac{g(t_i - t_j)}{\sum_{k \neq i} g(t_i - t_k)}$. Accordingly, the effective reproduction number is given by $R_j = \sum_i p_{ij}$, whose average is then given by $R_t = \frac{1}{n_t} \sum_{\{j \,:\, t_j = t\}} R_j$.
In the absence of observed secondary cases, a correction can be made to the time-dependent estimation [20].

3 ESTIMATING THE REPRODUCTION NUMBERS AND FITTING THE EPIDEMIC CURVE

In this section, we undertake the fitting of the epidemic curve and the estimation of the reproduction numbers using the approaches enumerated in Section 2. We have obtained the daily incidence data for the period of 2nd March 2020 to 31st May 2020 [4]. The epidemic curve based on the data for this period is depicted in Figure 1, which indicates that the number of COVID-19 positive cases was growing in an almost exponential manner. The initial reproduction number $R_0$ according to the EG is [...]

For the estimation of time-varying reproduction numbers, or the effective reproduction numbers $R_t$, the generation time distribution is required. Accordingly, we use a gamma distribution with a mean of 5.2 days and a standard deviation of 2.8 days, as reported from China [21]. The average $R_t$ values using the SB and TD methods are 1.591 and 1.68, respectively. The $R_0$ values using EG and ML, and the $R_t$ values using SB and TD, along with the corresponding 95% confidence intervals, are tabulated in Table 1. Further, the seven-day rolling $R_t$ values obtained for the cases of SB and TD are plotted in Figure 2 and Figure 3, respectively.

Besides estimating the reproduction rate, we fit the epidemic curve, making use of the four models, namely EG, ML, SB and TD. The predicted incidence (based on the fitted model parameters in each case) and the observed incidence for each method are illustrated in Figure 4. The predictions provided by EG, ML and TD are reasonably close to the actual cases. However, the SB model is clearly the most poorly fitted. It overestimates the epidemic curve, and thus its predictions are much higher than the actual incidences.
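To illustrate the EG relation $R = 1/M(-r)$ from Section 2 with the gamma generation time adopted above (mean 5.2 days, standard deviation 2.8 days), a minimal Python sketch is given below. It is not the R0 package implementation. The function names are hypothetical, and the growth rate used in the example is made up for illustration, since the fitted growth rate is not stated in the text. For a gamma distribution with shape $k$ and scale $\theta$, $M(z) = (1 - \theta z)^{-k}$, so that $1/M(-r) = (1 + r\theta)^{k}$.

```python
import math

def gamma_shape_scale(mean, sd):
    """Convert a gamma mean and standard deviation to (shape, scale)."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return shape, scale

def reproduction_number_from_growth(r, mean, sd):
    """Return R = 1 / M(-r) for a gamma generation interval.

    For a gamma distribution, M(z) = (1 - scale * z) ** (-shape),
    hence 1 / M(-r) = (1 + r * scale) ** shape.
    """
    shape, scale = gamma_shape_scale(mean, sd)
    if 1.0 + r * scale <= 0.0:
        raise ValueError("M(-r) does not exist for this growth rate")
    return (1.0 + r * scale) ** shape

# Hypothetical growth rate (not from the paper): cases doubling every
# 7 days, i.e. r = ln(2) / 7 per day.
r_example = math.log(2) / 7
R_example = reproduction_number_from_growth(r_example, mean=5.2, sd=2.8)

# Upper bound when every generation interval equals the mean: exp(r * mean).
R_upper = math.exp(r_example * 5.2)
print(round(R_example, 3), round(R_upper, 3))
```

The second printed value is the upper bound discussed in Section 2, obtained when all generation intervals equal the mean; the gamma-based value lies below it for any positive growth rate.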
Therefore, in order to find the best-fitted model, the root mean squared error, $\mathrm{RMSE} := \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n}}$, was calculated for all the models. As expected, the RMSE for the SB model is the highest. On the other hand, the TD model has the lowest RMSE. The RMSE values for all four models are tabulated in Table 2, from which we can conclude (based on the data set considered) that the best model for the estimation of the COVID-19 epidemic in India is the TD model.

The nationwide lockdown was imposed on 25th March, 2020, with the goal of arresting the spread of infection through strict restrictions on mass movement and the encouragement of social distancing. It was expected that the spread rate would come down, along with the possibility of community transmission. This in turn would keep the number of cases from rising dramatically, thereby giving the healthcare system more time to make the necessary arrangements for better preparedness of the medical infrastructure. Thus, the first phase of lockdown, which lasted until 14th April, 2020, was followed by two further phases with slightly relaxed restrictions, enforced from 15th April to 3rd May, 2020 and from 4th May to 31st May, 2020. This section discusses the impact of the entire lockdown on the spread of COVID-19, by analyzing various metrics, namely, the effective reproduction rate, the growth rate, the doubling time and the death to recovery ratio.

One of the key mathematical indicators relied upon in the context of the spread of the COVID-19 pandemic, and the consequent policy decisions, is the effective reproduction rate (ERR), or the time-varying reproduction number. As the ERR provides information on the time-varying transmission rate, it is a natural choice for measuring the impact of the entire lockdown, as well as of the different phases of the lockdown.
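Since the ERR is the time-varying reproduction number, the TD averaging described in Section 2 can be illustrated with a minimal Python sketch. It is not the R0 package implementation, and several choices are assumptions of the sketch: it works on daily counts rather than individual cases, it discretizes the gamma generation time (mean 5.2 days, standard deviation 2.8 days) by evaluating the density at whole days and renormalizing, it omits the correction of [20], and all function names and the toy incidence series are hypothetical.

```python
import math

def discretized_gamma(mean, sd, max_days):
    """Weights for generation times of 1..max_days days.

    Assumption of this sketch: the gamma density is evaluated at whole
    days and renormalized to sum to one.
    """
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    log_norm = math.lgamma(shape) + shape * math.log(scale)
    weights = [
        math.exp((shape - 1) * math.log(a) - a / scale - log_norm)
        for a in range(1, max_days + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]

def time_dependent_r(incidence, gen):
    """Average number of secondary cases per case, by day of onset.

    incidence[t] is the number of new cases on day t and gen[k - 1] is
    the probability of a generation time of k days. Days without cases
    get None.
    """
    days = len(incidence)
    # Total infection pressure on day t from all earlier days.
    pressure = [
        sum(incidence[s] * gen[t - s - 1]
            for s in range(max(0, t - len(gen)), t))
        for t in range(days)
    ]
    r_t = []
    for s in range(days):
        if incidence[s] == 0:
            r_t.append(None)
            continue
        last = min(days - 1, s + len(gen))
        # Expected share of each later case attributed to one case on day s.
        r_t.append(sum(
            incidence[t] * gen[t - s - 1] / pressure[t]
            for t in range(s + 1, last + 1)
            if pressure[t] > 0
        ))
    return r_t

gen = discretized_gamma(mean=5.2, sd=2.8, max_days=14)
toy_incidence = [1, 2, 4, 8, 16, 32, 64]  # made-up counts, not the Indian data
estimates = time_dependent_r(toy_incidence, gen)
```

Each case after the first day is attributed in full to earlier cases, so the case-weighted sum of the estimates equals the number of such cases. The estimates for the most recent days are biased downward because their secondary cases have not yet been observed, which is the situation the correction in [20] addresses.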
In the preceding Section 3, we have shown that, amongst all the models, the TD model is the best fit for the Indian epidemic curve. Hence, we discuss the impact of the lockdown in the context of the TD-based $R_t$.

Figure 5: Impact of lockdown on ERR (panel (b): average seven-day $R_t$)

Figure 5a depicts the seven-day rolling ERR. It is clearly observed that, before the lockdown, $R_t$ was unsteady, but it started dipping downward after the commencement of the lockdown. In the pre-lockdown period, the average seven-day ERR was 2.23. Therefore, before the lockdown, if 100 individuals had COVID-19, they would have infected 223 people on average. In the first lockdown period, the average ERR came down to 1.73, a 22% drop. Thus, at this rate, 100 carriers would infect 173 others on average. In the second and third lockdown periods, the ERR further dipped to 1.31 and 1.22, respectively. Therefore, from the pre-lockdown period to the end of lockdown 3, the overall reduction of the ERR was nearly 45%. Figure 5b displays the phase-wise average $R_t$. The descriptive statistics of $R_t$ and the corresponding confidence intervals are given in Table 3. From these results, we can clearly infer that, so far, the lockdown has by and large succeeded in reducing the ERR. However, this observation comes with the caveat that the three successive lockdowns did not drive $R_t$ below 1, which suggests that the epidemic may exhibit a surge once all the restrictions are lifted.

4.2 IMPACT ON GROWTH RATE

The reduction of the ERR should further reduce the growth rate of daily incidences. In order to obtain the growth rate in a particular time period, we calculate the seven-day rolling growth rate in that period, and then take the average. Suppose that we have daily incidence numbers, $D(t)$, $t = 1, 2, 3, \dots, 20$, for a period of 20 days. We first compute the seven-day rolling growth rates [...], where $i = 1, 2, 3, \dots, 13$, and we get a dataset of 13 points.
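The rolling growth-rate procedure can be sketched in Python as follows. The exact formula is not recoverable from the text, so the sketch assumes one reading that is consistent with the 13 data points obtained from 20 days: the growth of the daily incidence over a seven-day gap, $(D(i+7) - D(i))/D(i)$. The function names and the toy series are hypothetical.

```python
def rolling_growth_rates(daily, window=7):
    """Growth of daily incidence over a gap of `window` days.

    Assumed reading of the procedure: (D(i + window) - D(i)) / D(i).
    Days with zero incidence are skipped to avoid division by zero.
    """
    rates = []
    for i in range(len(daily) - window):
        base = daily[i]
        if base > 0:
            rates.append((daily[i + window] - base) / base)
    return rates

def average_growth_rate(daily, window=7):
    """Simple mean of the rolling growth rates, or None if there are none."""
    rates = rolling_growth_rates(daily, window)
    return sum(rates) / len(rates) if rates else None

# Made-up series of 20 daily counts that doubles every day.
toy_daily = [2 ** t for t in range(20)]
toy_rates = rolling_growth_rates(toy_daily)
toy_average = average_growth_rate(toy_daily)
```

With 20 daily values and a seven-day gap, the loop produces 13 rates, matching the count stated above. Multiplying the average by 100 expresses it as a percentage.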
Finally, the simple mean of the dataset is calculated. If the seven-day average growth is 30% in a month, then the average weekly number of positive cases would have increased from 100 to 130 in that month.

Figure 6: Weekly growth rate of positive cases (vertical axis: growth rate in %)

Figure 6 illustrates the average weekly growth rate in different time periods. In the pre-lockdown period (L0), the growth rate was 393%. This means that the weekly number of positive cases increased drastically, from 100 to 493, in the pre-lockdown period. The growth rate decreased to 191% in lockdown 1 (L1). It further reduced to 47% and 32% in lockdown 2 (L2) and lockdown 3 (L3), respectively. Therefore, we can conclude that the implementation of the nationwide lockdown has resulted in slowing down the growth rate of COVID-19 positive cases.

One of the key indicators of the spread of any pandemic is the doubling time. It is the time (usually counted in number of days) that it takes for the total number of cases to double. A doubling time of $n$ days means that if there were 100 cases at day 0, then, on day $n$, the number of cases would be 200. The longer the doubling time, the greater the possibility of achieving a flattened epidemic curve. [...] Figure 7b. The increment in doubling time is clearly visible from this figure. Therefore, from these results, we infer that the doubling time has improved significantly after the enforcement of the nationwide lockdown.

In a pandemic, the performance of any nation's health care system is measured ultimately in terms of deaths and recoveries. This segment discusses the effect of the lockdown on the death to recovery ratio (DTR). The DTR is defined as the ratio between the total number of deaths and the total number of recoveries:

$$\mathrm{DTR}_t = \frac{\text{Total number of deaths up to time } t}{\text{Total number of recoveries up to time } t}.$$

The DTR reflects the clinical management ability, or the efficiency, of the health system. It is highly important to keep the value of the DTR as low as possible.
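The two definitions above, the doubling time and $\mathrm{DTR}_t$, can be sketched in Python as follows. This is an illustration of the definitions only, not the computation used to produce the figures; the function names and the toy numbers are hypothetical, and the doubling time is read off a cumulative case series as the first day on which the count has at least doubled.

```python
def doubling_time(cumulative, day):
    """Smallest n such that cumulative[day + n] >= 2 * cumulative[day].

    Returns None if the count is not positive on `day` or if it has not
    doubled by the end of the series.
    """
    if cumulative[day] <= 0:
        return None
    target = 2 * cumulative[day]
    for n in range(1, len(cumulative) - day):
        if cumulative[day + n] >= target:
            return n
    return None

def death_to_recovery_ratio(daily_deaths, daily_recoveries):
    """Cumulative deaths divided by cumulative recoveries, for each day.

    Days before the first recovery get None, since the ratio is undefined.
    """
    ratios = []
    deaths = 0
    recoveries = 0
    for d, r in zip(daily_deaths, daily_recoveries):
        deaths += d
        recoveries += r
        ratios.append(deaths / recoveries if recoveries > 0 else None)
    return ratios

# Made-up numbers for illustration only.
toy_cumulative_cases = [100, 130, 170, 200, 260]
toy_doubling = doubling_time(toy_cumulative_cases, day=0)
toy_dtr = death_to_recovery_ratio([0, 1, 1], [0, 2, 2])
```

A seven-day rolling version, as plotted in the paper, could be obtained by averaging the daily values over a sliding window of seven days.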
Mathematically, the closer this value is to zero, the better the efficiency of the healthcare system in dealing with the pandemic. For example, $\mathrm{DTR}_t = 0.5$ implies that, for every 100 recoveries, 50 infected patients would have died. The seven-day rolling DTR is plotted in Figure 8a. It is clearly seen that the DTR has declined significantly as time has progressed. The phase-wise bar chart also depicts the reduction of the DTR over the period of three months. In the pre-lockdown (L0) and lockdown 1 (L1) periods together, the average DTR was 0.28. It reduced to 0.14 in lockdown 2 (L2) and further declined to 0.08 in lockdown 3 (L3), which shows that, in this short period, the Indian health care system has improved significantly in tackling the COVID-19 pandemic.

Figure 8: Impact of lockdown on death to recovery ratio

5 CONCLUSION

In this paper, we have discussed the impact of the lockdown on the COVID-19 infection rate in India. The aim was to see whether the lockdown has really curbed the intensity of the spread. In order to do that, we empirically analyzed different metrics that mainly measure the spread of an infectious disease like COVID-19. The metrics are effective reproduction rate (ERR), growth rate, doubling time and death to recovery ratio (DTR). In the case of the ERR, it is seen that the lockdown has reduced the reproduction rate by more than 40%. The growth rate has also substantially decreased from the initial period to the end of the lockdown. On the other hand, the doubling time has largely improved over the three month period. The rate of increment from pre-lockdown to lockdown 3 is nearly 183%. Finally, we described the impact on the DTR, which quantifies the number of deaths against the number of recoveries. We observed a significant decline in the DTR from the month of April. On average, the initial DTR of 0.28 dipped to 0.08 in the third phase of the lockdown. Therefore, despite rising cases of COVID-19 infection in India, the lockdown has managed to curb the spread to some extent.
However, the caveat is that, despite the encouraging results, the pandemic will persist unless the ERR is driven below 1. It remains to be seen whether there is an adverse movement of the metrics after the relaxation of the restrictions. The behaviour of these metrics in the post-lockdown period will provide more accurate and complete information regarding the success or failure of the lockdown.

[1] Early dynamics of transmission and control of COVID-19: a mathematical modelling study. The Lancet Infectious Diseases
[2] (COVID-19) Situation Report - 137 of World Health Organization
[3] Return of the Coronavirus: 2019-nCoV
[5] A mathematical model for simulating the phase-based transmissibility of a novel coronavirus
[6] The reproductive number of COVID-19 is higher compared to SARS coronavirus
[7] Healthcare impact of COVID-19 epidemic in India: A stochastic mathematical model
[8] Super-spreaders and the rate of transmission of the SARS virus
[9] Simulating the SARS outbreak in Beijing with limited data
[10] Optimal and sub-optimal quarantine and isolation control in SARS epidemics
[11] Probabilistic differential diagnosis of Middle East respiratory syndrome (MERS) using the time from immigration to illness onset among imported cases
[12] A dynamic compartmental model for the Middle East respiratory syndrome outbreak in the Republic of Korea: a retrospective analysis on control interventions and superspreading events
[13] The R0 package: a toolbox to estimate reproduction numbers for epidemic outbreaks
[14] How generation intervals shape the relationship between growth rates and reproductive numbers
[15] A preliminary estimation of the reproduction ratio for new influenza A (H1N1) from the outbreak in Mexico
[16] Estimating the effective reproduction number for pandemic influenza from notification data made publicly available in real time: a multi-country analysis for influenza A/H1N1v
[17] A likelihood-based method for real-time estimation of the serial interval and reproductive number of an epidemic
[18] Real time bayesian estimation of the epidemic potential of emerging infectious diseases
[19] Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures
[20] Real-time estimates in early detection of SARS
[21] Estimating the generation interval for COVID-19 based on symptom onset data

This work was carried out under approved Grant No. MSC/2020/000049 from the Science and Engineering Research Board, Government of India.