key: cord-0789328-ola35o7h authors: Tiwari, Alok title: Modelling and analysis of COVID-19 epidemic in India date: 2020-11-28 journal: nan DOI: 10.1016/j.jnlssr.2020.11.005 sha: 2c9910e1dd51211b1d6f514c09e09d2fc7f0f571 doc_id: 789328 cord_uid: ola35o7h

COVID-19 was declared a public health emergency of international concern by the World Health Organisation on 30 January 2020. The disease, which originated in China in December 2019, has caused havoc around the world, including India. The first case in India was reported on 30 January 2020, and cases had crossed 4 million at the time of writing. The pandemic has caused more than 80,000 fatalities, with 3 million recoveries. A strict nationwide lockdown for two months, immediate isolation of infected cases and app-based tracing of the infected are some of the proactive steps taken by the authorities. A better understanding of the evolution of COVID-19 worldwide requires a study of the evolution and growth of cases in India. For this purpose, a compartmental model, Susceptible-Infectious-Quarantined-Recovered (SIQR), is used. The recovery rate and doubling rate of the total reported positive cases in the country have crossed 75% and 25 days, respectively. It is also estimated that there is a strong positive correlation between the testing rate and the detection of new cases up to 6 lakh (0.6 million) tests per day. Using SIQR modelling, the effective reproduction number, epidemic doubling rate and infected-to-quarantined ratio are determined to track the temporal evolution of the pandemic in the country. The effective reproduction number, which peaked during the first half of April, is gradually converging to 1. The model also indicates that for each detected case in India there could be 10-50 undetected cases. Like every mathematical model, this model also rests on some assumptions. 
To make the model more robust, a technique with weighted parameters could be developed so that a person with a strong immune system is not treated as equally vulnerable to infection. Machine learning algorithms can also be used to train the model on data from other countries to make the analysis and prediction more precise and accurate.

Coronavirus disease 2019 (COVID-19) originated in Wuhan city in Hubei province (China) in December 2019. The outbreak has caused a global pandemic with more than 28.5 million positive cases and 0.92 million deaths. India, with a population of more than 1.3 billion, witnessed its first positive case on 30 January 2020. At present, more than 4 million individuals have been reported positive, with 77,461 fatalities and 3 million recoveries [1]. The Government of India imposed a strict nationwide lockdown in four phases lasting more than two months, from 25 March to 31 May. The country then initiated a gradual unlocking in phases, including the opening of industries, railways, metros, shopping malls, schools and colleges. In the tally of positive cases, India ranks second, just behind the USA, and is reporting more than 90,000 cases each day [1]. Despite the large number of positive cases per day, deaths per million of population are among the lowest (56 per million) [1]. A detailed mathematical analysis of the evolution of the pandemic in the country has the potential to assist medical researchers, policymakers and other stakeholders. In this work, the COVID-19 cases in India up to 22 August 2020 are analysed with the help of mathematical modelling.

Mathematical equations are widely used to model the nature and impact of pandemics on society. The SIR model [2] is the classical mathematical model used to analyse and predict the evolution of a disease. 
One variant of SIR, the SIQR model [3], is considered to be the best modelling technique for COVID-19, where isolation of infectious individuals plays an important role. The SIQR technique can be used to determine, from the past trend, the parameters that quantify the growth and evolution of cases in a region. This modelling approach has been applied to a few affected nations, including Brazil [4] and Italy [5]. For India, studies of power-law growth [6] and of the effect of lockdown on the spread of the disease [7] have been reported. However, none of these papers has quantified the evolution of the disease in terms of epidemiological parameters. The present work analyses the evolution of COVID-19 in India using the SIQR model and calculates the parameters and indicators that quantify the temporal evolution and growth of the disease.

Susceptible-Infected-Quarantined-Recovered (SIQR), a variant of the classical SIR model, has proved particularly convenient for modelling COVID-19 [5]. The model splits infected individuals into two categories: those who are quarantined and those who are not (because they are asymptomatic or through negligence). Susceptible individuals are those at risk of getting infected. Infected individuals are susceptibles who have contracted the virus; they may be asymptomatic or symptomatic. Infected individuals who develop signs and are isolated are counted as quarantined. Recovered individuals are infectious or quarantined individuals who have recovered or died from the disease.

The model equations (Eqs. 1-4) involve four rate constants: the rate of infection; the rate at which new cases are detected from the infected population; the rate at which quarantined individuals are removed (recovered or died); and the rate of removal of infectious individuals who are asymptomatic and did not get quarantined (for any reason). N is the total size of the population. 
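The model equations themselves are not reproduced in this text. As a sketch only, a standard SIQR system consistent with the four rate constants described above can be written as follows. The symbol names are assumptions taken from common SIQR notation and are not necessarily the author's: \beta is the infection rate, \eta the detection rate, \delta the removal rate of quarantined individuals, and \alpha the removal rate of undetected infectious individuals. The numbering (1)-(4) is chosen to agree with the later references to Eq. (3) and Eq. (4).

```latex
\begin{align}
\frac{dS}{dt} &= -\beta \frac{S I}{N} \tag{1}\\
\frac{dI}{dt} &= \beta \frac{S I}{N} - (\alpha + \eta)\, I \tag{2}\\
\frac{dQ}{dt} &= \eta I - \delta Q \tag{3}\\
\frac{dR}{dt} &= \delta Q + \alpha I \tag{4}
\end{align}
```

With this ordering, adding (3) and (4) gives d(Q+R)/dt = (\alpha + \eta) I, and the four right-hand sides sum to zero, so S + I + Q + R = N stays constant.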
'Flattening the curve', a term widely used during this pandemic, is essentially about optimising these four rate constants. By imposing lockdowns, authorities try to minimise the rate of infection [Fig. 2]. The rate of detection of new cases from the infected population depends directly on the number of tests per million of population. In India, authorities worked to minimise the rate of infection during the lockdown, while during the gradual reopening the focus is on increasing the number of tests per million to maximise the detection rate. Such models are frequently used to study the impact of large-scale pandemics and have proved beneficial for medical researchers and policymakers. However, like every mathematical modelling tool, this one rests on a few assumptions. The SIQR model assumes that everyone in the population has an equal probability of getting infected. In addition, model reliability depends on the quality of the data; it is assumed that the data available on open platforms are correct. The model also assumes that a person who is infected and removed from the population cannot be reinfected, although in reality reinfection is possible [8]. Future government steps to control the pandemic, such as weekly lockdowns or partial reopening, are uncertain. Because of this uncertainty, it is challenging to predict the future course of the pandemic with a mathematical tool that depends heavily on past data. For this reason, this work concentrates on analysing the evolution of the infection.

This section of the paper is broadly divided into two parts: analysis of the available data, and modelling of the pandemic to obtain pandemic parameters. Active cases (Q) in India crossed 0.6 million, and active cases added per day touched 20,000 at the start of the last week of July [Fig. 3]. 
It is interesting to note that after the last week of July, active cases added per day are mostly below 10,000 (and seem to be going down). If this trend continues, it may correspond to a peak or an interim plateau. The doubling rate of total positive cases on the n-th day is defined as the number of days required for the case count on the n-th day to double. The doubling rate has improved significantly, from 4 days during the first week of March to 26 days in the last week of August [Fig. 4]. The recovery rate is defined as the ratio of the number of cases recovered on the n-th day to the total active cases on the same day. It has increased from around 10% in the first week of March to more than 75% in the last week of August (Fig. 4). During the initial days of the pandemic, India was criticised for testing too little of its population. The correlation coefficient between reported positive cases (Q+R) and tests per million [9] is determined [Fig. 3(c)]. A high positive correlation between positive cases on a day and tests on the same day is as expected. However, it is interesting to note that with more than 6 lakh (0.6 million) tests per day this coefficient comes down to 0.72. This suggests that about 6 lakh tests per day would be sufficient to detect most of the infections.

The available data are modelled with the SIQR technique to obtain the parameters that quantify the growth of the pandemic. Integrating Eq. (5) gives Eq. (6), which involves the number of infected individuals at the start of the disease in the country. For modelling purposes, the initial number of cases is fixed at a starting point before which cases were almost stagnant. Adding Eq. (3) and Eq. (4) gives Eq. (7), the rate of change of the total confirmed positive cases [Q+R] in the country with time t (in days). Substituting the infected population from Eq. (6) into Eq. (7) gives Eq. (8), and integrating Eq. (8) over time gives Eq. (9).

Eq. (9) expresses the number of confirmed positive cases on day t as an exponential function of t with two fitting parameters. This form is fitted to the total reported positive cases at every eighth day (starting from 15 March) using least-squares fitting (Fig. 4), which yields the two fitting parameters with their errors. Twenty-one sets of fitted parameters, at regular intervals of 8 days from 15 March to 22 August, are calculated and used for the parameters defined in the later sections.

Determining the rate of detection of new cases from infectious individuals is not straightforward because data on asymptomatic cases in the infectious Indian population are insufficient. However, a range of values for this rate, taken from studies based on the incubation period together with an assumption about the detected fraction of the infectious population, is widely used [4, 5]. The rate of detection depends not only on the incubation period but also on the fraction of infected individuals who test positive and are quarantined just after the incubation period [5]. On average, it takes 6.93 days for an infectious individual to develop symptoms in India [10]. If that individual is quarantined on that same day, the detection rate can be written as the fraction of infected individuals that get quarantined divided by the incubation period of 6.93 days. It has been reported that in Japan 50% of the population is asymptomatic [11]. However, no similar studies can be found for India. Three different values of the quarantined fraction are therefore considered to cover the worst-case and best-case scenarios. It is assumed that either 10%, 30% or 50% of the total infected population will get quarantined, i.e. the fraction is taken as 0.1, 0.3 and 0.5, which corresponds to detection rates of 0.014, 0.043 and 0.072 per day. The rate of removal of quarantined cases from the reported positive cases is calculated from the number of reported positive cases and the number removed (deaths + recoveries) from the reported cases, using Eq. (10). 
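The detection-rate arithmetic and the exponential fit described above can be sketched as follows. This is a minimal illustration, not the author's code: the function names are hypothetical, the fitted form `a * exp(b * t)` is an assumed simplification of Eq. (9), the fit is done as a straight-line least-squares fit to the logarithm of the counts (the paper only says "least square fitting"), and the case counts are synthetic.

```python
import math

def detection_rate(quarantined_fraction, incubation_days=6.93):
    """Detection rate = fraction of infected who get quarantined / mean incubation period."""
    return quarantined_fraction / incubation_days

def fit_exponential(days, cases):
    """Least-squares fit of cases ~ a * exp(b * day) via a line fit to log(cases).

    Returns (a, b). The log-linear shortcut is an assumption made for this sketch.
    """
    n = len(days)
    logs = [math.log(c) for c in cases]
    mean_t = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in days)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))
    b = sxy / sxx                        # slope of log(cases) against time
    a = math.exp(mean_y - b * mean_t)    # intercept, transformed back from log space
    return a, b

# Detection rates for the three scenarios in the text (10%, 30%, 50% quarantined).
rates = [round(detection_rate(f), 3) for f in (0.1, 0.3, 0.5)]

# Synthetic (made-up) cumulative counts sampled every 8 days, generated with a = 100, b = 0.05.
days = [0, 8, 16, 24, 32]
cases = [100.0 * math.exp(0.05 * t) for t in days]
a_fit, b_fit = fit_exponential(days, cases)
```

On real data the counts are noisy, so a direct nonlinear fit with parameter uncertainties may be preferable to the log-linear shortcut used here.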
In Eq. (10), the inputs are the total number of reported quarantined cases on a given day and on the previous day, and the total number of removed cases on the previous day. The average over the last 8 days is taken as the value of the removal rate (with error) on a particular day [Fig. 5]. All these model parameters are used to determine indicators that quantify the transmissibility of the disease in the country [4]. These indicators are defined as follows, and their derivations are discussed in the appendix:

a. Effective reproduction number: This number quantifies the transmission ability of a disease. It is defined as the average number of individuals who can be infected by a single individual, and it is given by Eq. (11). The temporal evolution of the effective reproduction number in India is plotted in Fig. 6. Up to the last week of April, the effective reproduction number is high (above 2). From May it started to decrease gradually, but it is still greater than 1. This means that a single person is, on average, infecting more than one person. For a pandemic to die down, this number should be less than 1.

b. Epidemic doubling rate: The number of days required for a disease to double its infected population is termed the epidemic doubling time. It is different from the plot reported in Fig. 3(b), which shows the doubling of reported positive cases (not of the infected population). The epidemic doubling time is obtained from the model parameters. It has gradually increased from less than 5 days in the second week of March to 23 days during the last week of August [Fig. 7].

c. Infected to Quarantined ratio: This ratio estimates the population that is infected but not quarantined. The infected population may be asymptomatic or suffering from mild symptoms that go unnoticed. As this ratio depends inversely on the detection rate, values corresponding to the three different detection rates have been considered. 
This ratio depends inversely on the rate of detection of new cases from the infected population: the higher the testing rate, the lower the ratio. Findings from the national serosurvey of May-June 2020 [12] report that around 6.4 million people had been infected in total by early May 2020. In the same period (mid-May), however, total reported active cases (Q) were about 50,000. It can be said that for every reported case there were around 100-150 cases that went undetected in the population. Mathematically, the I/Q ratio estimates a similar quantity. It can be deduced from Fig. 8 that at the end of August 2020, in the worst-case scenario there would be 50 times more cases than the reported active cases (Q), and in the best case this ratio could be 10.

The effective reproduction number, epidemic doubling rate and infected-to-quarantined ratio discussed and calculated in the previous section can be useful for retracing the growth of the pandemic in the country. Although the country's doubling rate is increasing, the high effective reproduction number and infected-to-quarantined ratio confirm that the virus is still spreading in the country. The peak can be predicted by numerical integration of Eqs. 1-4, but the focus of this paper is on the method for determining parameters that can be useful for medical researchers. Moreover, peak predictions made by many researchers in the past [7] have proved wrong. Accurate prediction of the peak is challenging because unpredictable government policies are difficult to model with the limited parameters of mathematical models.

The Susceptible-Infectious-Quarantined-Recovered (SIQR) model is used in this paper to estimate the parameters that quantify the temporal evolution of COVID-19 in the country. The effective reproduction number, epidemic doubling rate and infected-to-quarantined ratio are studied in this paper. 
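The three indicators named above can be sketched in code. The expressions below are the ones commonly used in early-stage SIQR analyses, where S is close to N and the infected population grows exponentially; they are assumptions for illustration and are not copied from the paper's own Eq. (11) and related expressions, which are not reproduced in this text. The function name, the parameter names (`beta`, `alpha`, `eta`, `delta`) and the numerical values are hypothetical.

```python
import math

def siqr_indicators(beta, alpha, eta, delta):
    """Early-stage SIQR indicators, assuming S ~ N so that I grows as exp(growth * t).

    beta  : infection rate (assumed symbol)
    alpha : removal rate of undetected infectious individuals (assumed symbol)
    eta   : detection rate (assumed symbol)
    delta : removal rate of quarantined individuals (assumed symbol)
    """
    growth = beta - (alpha + eta)                # net exponential growth rate of I
    reproduction_number = beta / (alpha + eta)   # above 1 means infections are increasing
    # The doubling time is only defined while the infected population is growing.
    doubling_time = math.log(2) / growth if growth > 0 else math.inf
    # Long-time solution of dQ/dt = eta*I - delta*Q with exponentially growing I.
    infected_to_quarantined = (growth + delta) / eta
    return reproduction_number, doubling_time, infected_to_quarantined

# Illustrative values only; they are not fitted to the Indian data.
r_e, t_double, i_over_q = siqr_indicators(beta=0.2, alpha=0.05, eta=0.05, delta=0.04)
```

The infected-to-quarantined ratio has the detection rate in its denominator, which is why a lower assumed detection rate gives a larger estimate of undetected cases.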
India has successfully reduced the reproduction number from 2-3 to close to 1 by the last week of August, but it is still above 1. In the worst-case scenario, there could be 50 times more infected cases than reported active cases. Serosurveys published from some parts of the country have reported that there can be 150 undetected infections for every detection. The epidemic doubling rate has improved significantly, to more than 20 days. This mathematical modelling technique has proved valuable for researchers, but its assumptions limit its applicability. The model can be used at state or district level to track the temporal evolution of the pandemic. Weighted parameters can be added to the existing SIQR model to overcome the assumption that everyone in the population is equally likely to be infected. Models can be trained on data from different nations [13] using machine learning algorithms to make predictions more accurate.

A disease is said to be spreading if the number of infections increases with time, i.e. if the rate of change of the infected population is positive. Eq. 3 can be rewritten as a first-order linear ODE whose coefficients are functions of time. Substituting I from Eq. 6 into this equation and integrating gives the expression for Q used for the infected-to-quarantined ratio.

ACKNOWLEDGEMENT
I wish to extend my special thanks to Prof. Manaswita Bose from IIT Bombay for regular discussions during the writing and modelling for this paper. 
I would also like to thank the reviewers and the editor for their valuable suggestions during the review process.

Infectious Diseases of Humans
Effects of quarantine in six endemic models for infectious diseases
Modeling the early evolution of the COVID-19 in Brazil: Results from a Susceptible-Infected-Quarantined-Recovered (SIQR) model
Quantifying undetected COVID-19 cases and effects of containment measures in Italy
COVID-19 epidemic: Power law spread and flattening of the curve
Age-structured impact of social distancing on the COVID-19 epidemic in India
COVID-19 re-infection by a phylogenetically distinct SARS-coronavirus-2 strain confirmed by whole genome sequencing
COVID-19 India
Incubation period and Reproduction number for novel coronavirus (COVID 19) infections in India
National Institute of Infectious Diseases
Prevalence of SARS-CoV-2 infection in India: Findings from the national serosurvey
Early transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia