key: cord-0185110-9b49796l authors: Kounchev, Ognyan; Simeonov, Georgi; Kuncheva, Zhana title: The TVBG-SEIR spline model for analysis of COVID-19 spread, and a Tool for prediction scenarios date: 2020-04-23 journal: nan DOI: nan sha: de3c922efb6cd2e9d1a6cc9ab5e586a63eab3b4f doc_id: 185110 cord_uid: 9b49796l Mathematical models are traditionally used to analyze the long-term global evolution of epidemics, to determine the potential and severity of an outbreak, and to provide critical information for identifying the type and intensity of disease interventions. Among the most widely used mathematical models of the long-term spread of epidemics are the so-called deterministic compartmental models (SIR/SEIR type models). One of the main purposes of applying such models is to assess how effectively the expensive restriction measures imposed by the authorities (home and social isolation/quarantine, travel restrictions, etc.) can reduce the control reproduction number of the disease and its transmission risk. However, the classical SIR/SEIR models have been studied primarily in what may be called the stationary case, where the main parameters, the Transmission rate Beta (reflecting the virus spread by infected individuals) and the Removed rate Gamma (reflecting the hospitalization/isolation measures), remain constant during the whole period of interest. Hence, it is important to extend the classical SIR/SEIR models by creating new ansatzes for the dynamics of the transmission rates Beta(t) (which we call simply Beta below) and the removed rates Gamma(t) (which we call simply Gamma below). The main purpose of the present research is to introduce a spline-based SEIR model with Time-varying Beta and Gamma parameters, abbreviated as the TVBG-SEIR model, which is used to estimate the practical implications of the public health interventions and measures.
We have designed a Tool based on the TVBG-SEIR model, which may be used as a Decision Support Tool to assist health decision-makers and policy-makers in creating predictive scenarios. It may be used to assess the impact of previous public health interventions, and to plan quantitatively and qualitatively the introduction of future containment measures for achieving the necessary objectives. The stationary setting of the classical SIR/SEIR models, with a constant transmission rate β and a constant removed rate γ, does not properly reflect the extremely dynamic behavior of these rates during the COVID-19 (and similar) epidemics, which results from the intensive restriction measures imposed by the authorities.
Aim and Methods summary
To estimate the dynamics of the transmission rate β and the removed rate γ during a controlled COVID-19 outbreak, we have developed a mathematical model with time-varying rates β(t) and γ(t), TVBG-SEIR, which was fitted simultaneously to two sets of data: the daily infected cases and the daily removed cases.
We use a deterministic spline Ansatz: the transmission rate β(t) and the removal rate γ(t) are modeled by splines with two nodes, Node1 and Node2 (the same nodes for both rates), within the time interval of interest, from the Start date until the Today date. This Ansatz makes it possible to model properly the dynamics caused by the introduction of containment measures by the authorities. The purpose of fitting the TVBG-SEIR model is to identify the nodes of the splines and the three values of β(t) and γ(t) at Node1, Node2, and the Today date. It is assumed that β(t) and γ(t) are constant in the time interval [Start date, Node1], and that β(t) is a monotone decreasing function while γ(t) is a monotone increasing one.
A Tool was designed for the visualization of the results of the fitted model (the daily infected cases), and for creating prediction scenarios for the daily infected cases during the next two months by controlling the future values of Beta and Gamma. It is available at the link: http://213.191.194.141:8888/notebooks/TVBG-SEIR-Spline-model_v3.ipynb?token=b5d97bfbd7dd062e47ee7ab51837e470a8c226743a4667ee
The plan of the paper is as follows: In Section 1 we recall the deterministic SEIR model and introduce some notions and notations. In Section 2 we introduce the discretization of the SEIR model which is used in the algorithms. In Section 3 we introduce the TVBG-SEIR spline model and provide all its technical details. In Section 4 we apply the TVBG-SEIR model to Bulgarian data, which are used to illustrate the work of the Tool for prediction scenarios. In Section 5 we do the same for Italian data, and in Section 6 for German data. In Section 7 we describe the technical details of the Tool for prediction scenarios. In Section 8 we provide some recent references about models with time-varying transmission rates and their calibration.
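The spline Ansatz summarized above (rates constant up to Node1, followed by monotone transitions toward new target values) can be illustrated with a small sketch. It assumes a simple exponential relaxation toward the next target value on each sub-interval; the function name `make_rate`, the decay parameter `k`, and all numeric values are hypothetical and chosen only for illustration. They are not the fitted values, and this is not necessarily the exact spline formula used in the Tool.

```python
import math

def make_rate(node1, node2, v_start, v_node2, v_today, k=0.5):
    """Return a rate function of time t (days counted from the Start date).

    Assumed form (illustrative only):
      * constant v_start on [Start date, node1];
      * exponential relaxation toward v_node2 on (node1, node2];
      * exponential relaxation toward v_today after node2.
    The result is continuous, and monotone whenever
    v_start >= v_node2 >= v_today (Beta-like) or
    v_start <= v_node2 <= v_today (Gamma-like).
    """
    def rate(t):
        if t <= node1:
            return v_start
        if t <= node2:
            return v_node2 + (v_start - v_node2) * math.exp(-k * (t - node1))
        # value actually reached at node2, so that the curve stays continuous
        at_node2 = v_node2 + (v_start - v_node2) * math.exp(-k * (node2 - node1))
        return v_today + (at_node2 - v_today) * math.exp(-k * (t - node2))
    return rate

# Hypothetical targets: Beta decreases, Gamma increases.
beta = make_rate(node1=10, node2=20, v_start=0.5, v_node2=0.3, v_today=0.1)
gamma = make_rate(node1=10, node2=20, v_start=0.05, v_node2=0.1, v_today=0.2)

for day in (0, 10, 15, 20, 30, 40):
    # day, Beta, Gamma and the ratio Beta/Gamma
    print(day, round(beta(day), 4), round(gamma(day), 4),
          round(beta(day) / gamma(day), 2))
```

A larger `k` makes the switch between levels faster, so that the rates are nearly constant on most of each sub-interval; a smaller `k` spreads the transition over more days.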
In the case of the usual seasonal flu the main parameters of the spread of the viruses are the transmission rate β, which reflects the power of the transmission of the virus from infected people to susceptibles; the recovery rate γ, which is the reciprocal of the recovery period (removal here covers recovery to health, isolation of sick people, and mortality due to the sickness); and the latent rate, which is the reciprocal of the incubation period. Due to the long incubation period and the large number of asymptomatic or mildly symptomatic cases, COVID-19 has proved to be very insidious and requires intensive emergency measures from the authorities to reduce the transmission rate β and to increase the recovery rate γ. For comparison, in the case of the seasonal flu the authorities do not need to undertake intensive measures. The authorities have introduced very strong restrictive measures which have essentially influenced the dynamics of the parameters β and γ. For the majority of the states these measures have been introduced not in one step but most often in at least two steps. How fast these measures are implemented in practice depends strongly on the particular society. The measures include, for example, closing schools, pubs and restaurants, restricting social meetings, wearing masks, etc. All of them restrict the contacts among people, and thus essentially change the dynamics of the β and γ rates. It is important to assess how these expensive and resource-intensive measures implemented by the authorities can contribute to the prevention and control of the COVID-19 infection, and how long they should be maintained [12], [13]. In order to meet the challenge of the controlled spread of COVID-19 (and similar) epidemics, one needs to develop new mathematical models which describe the reality better. Among the most widely used models in Epidemiology is the deterministic SEIR model [4].
Based on it, in the present research we propose a new model, TVBG-SEIR, which incorporates a specific spline model for the time-varying transmission and removal rates.
Compartmental models are a framework used to model adequately the dynamics of infectious diseases (see the Wiki article). The population is divided into compartments, with the assumption that every individual in the same compartment has the same characteristics. This framework was first developed in the 1927 paper of Kermack and McKendrick [2]. One may use a deterministic approach based on a system of ODEs, or a stochastic approach, which is more complicated. We follow the deterministic approach, which has two main representatives: the SIR and the SEIR models. For a detailed and excellent introduction to the compartmental SIR/SEIR models we refer to the monograph [4]. We provide a short description of the deterministic SEIR model, which will be the main approach in our research. The classical SEIR model is based on the consideration of four compartments, S, E, I and R, which are described as follows:
S: its size is S(t), the number of "susceptible" people at time t. Usually, at the start, the size of S is the whole population of the country under consideration. It is supposed that nobody has automatic immunity against the virus, i.e. everybody is susceptible.
E: its size is E(t), the number of "exposed" people at time t. These are the people who are "virus carriers" but are not "virus spreaders"; the virus is in a latent form, and usually they do not show symptoms of sickness. The incubation (latent) period differs greatly between viruses; for the coronavirus the average incubation period was recently estimated statistically in [3]. Not everybody in E may become a "virus spreader", i.e. move to the next compartment I.
Practically, the compartment E does not enter the official statistics and is practically not observable, but it is very important for a more adequate modeling of the dynamics of the virus spread. This compartment is missing in the simpler SIR model.
I: its size is I(t), the number of infectious cases at time t. These are the people who are "virus spreaders"; the majority of them show some symptoms, although some may show none (asymptomatic). It is important to understand in the modeling that many people who are diagnosed positively are almost immediately hospitalized or quarantined, hence they go to compartment R, but they have stayed in I only until they have been diagnosed (and these are the official data which we obtain, Idata(t)).
R: its size is R(t), the number of recovered or deceased (or immune) individuals, which are all called "removed". Normally they come from compartment I after becoming healthy and no longer spreading the virus. Officially these data are provided in a cumulative way.
However, what data do we have at our disposal to fit the model? We do not have the "reality data" S(t), E(t), I(t), R(t). We have the official data Idata(t), which are the daily "new infected cases" with COVID-19, and these are normally people with serious symptoms. These are the cases which have been tested and registered officially at the hospitals. The majority of them are almost immediately hospitalized or quarantined, hence they are almost immediately moved from compartment I to compartment R. However, it is well known that for the seasonal flu (and it is considered to be similar for COVID-19) the size of I is much bigger than that indicated by the official data Idata(t), and we have the inequality Idata(t) ≤ I(t). We also have the officially announced data which contain the cumulative number of recovered cases, and the data which contain the cumulative number of fatalities.
Although there is a lot of discussion about the quality of these data, it is approximately true that the sum of these two cumulative series equals R(t).
A main point of the modeling paradigm for COVID-19 (and similar virus infections) is that, for a certain segment of the society (in this case, the younger people), the infection symptoms do not differ essentially from those of a seasonal flu. Hence the number of unreported cases (those which are in compartment I but not in Idata(t), for every time t) may be much bigger, and in the above inequalities it is more appropriate to use the symbol "≪", which denotes "much less". In the case of the seasonal flu it may be many times less.
The main point of developing the compartmental deterministic SEIR model is to provide some tractable approximations to the above time series of the "reality data" for the four compartments. The most widely used is the model based on a system of Ordinary Differential Equations with variables S(t), E(t), I(t), R(t), which is given as follows:
\begin{align}
\frac{dS}{dt} &= -\frac{\beta(t)\, S(t)\, I(t)}{N}, \tag{1a}\\
\frac{dE}{dt} &= \frac{\beta(t)\, S(t)\, I(t)}{N} - \sigma E(t), \tag{1b}\\
\frac{dI}{dt} &= \sigma E(t) - \gamma(t)\, I(t), \tag{1c}\\
\frac{dR}{dt} &= \gamma(t)\, I(t). \tag{1d}
\end{align}
Let us explain the notations:
1. Here the term β(t) S(t) I(t) / N expresses the rate at which new individuals (as a proportion of the total population size N) are infected by the already infectious I(t) individuals. Here and further β(t) is called the Transmission rate of the infection, which we call simply Beta.
2. The coefficient γ(t) is the Removal or recovery rate; it is determined by the reciprocal of the infectious period, after which the person is either recovered (and no longer infectious) or dead (again, no longer infectious). Here and further γ(t) is called the Removal rate, and we call it simply Gamma.
3. The coefficient σ is the latent rate, or the rate of "becoming symptomatic" (it is the reciprocal of the average incubation period). In the present paper we use a constant value for it, which represents a reasonable approximation, as recent research shows, see [3].
The usual applications of the SEIR model are with constant rates β(t) and γ(t). One assumes that the initial values of S, E, I and R are given, and the system is solved forward in time up to an integer number of days. It is assumed that the following "conservation" equation holds:
\begin{equation}
S(t) + E(t) + I(t) + R(t) = N, \tag{2}
\end{equation}
where N is the total population in the country. Obviously, after introducing equation (2), the fourth equation in (1) becomes redundant.
In practice one uses a discretization of the continuous SEIR model. The following discretization of the SEIR model is very intuitive, and is in fact derived from the Euler method for the approximate solution of the initial value problem (1):
\begin{align}
S_{n+1} &= S_n - \frac{\beta_n S_n I_n}{N}, \tag{3a}\\
E_{n+1} &= E_n + \frac{\beta_n S_n I_n}{N} - \sigma E_n, \tag{3b}\\
I_{n+1} &= I_n + \sigma E_n - \gamma_n I_n, \tag{3c}\\
R_{n+1} &= R_n + \gamma_n I_n. \tag{3d}
\end{align}
Here S_n, E_n, I_n and R_n are respectively the values of S(t), E(t), I(t) and R(t) on day n, β_n and γ_n are the values of β(t) and γ(t) on day n, σ is the latent rate, and the initial values are those given for the first day. The above system is solved iteratively for integers n. We assume that the size of the population remains unchanged (hence the usual births and mortality are not taken into account). Hence, the total sum of the above is assumed to satisfy
\begin{equation}
S_n + E_n + I_n + R_n = N, \tag{3e}
\end{equation}
which makes the fourth equation in (3) redundant.
It is well known that the above Euler method for approximating the solution of (1) is less accurate than the widely used Runge-Kutta method, see e.g. [11]. One has to note that the continuous model (1a)-(1d) and the discrete approximation (3a)-(3d) differ essentially in their qualitative properties, in particular in their long-term behavior, which has been the subject of much research; the continuous case is simpler, as usual.
The SIR/SEIR models have proved to be very efficient in situations where the main parameters β and γ are constants, in natural conditions, where no special control by the authorities is exercised, i.e. no intervention (containment) measures are undertaken to change the transmission and the removal rates in the course of the epidemic.
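The difference between the Euler discretization and a Runge-Kutta scheme mentioned above can be explored with the following minimal sketch. It assumes the standard SEIR right-hand side with incidence β S I / N and a latent rate called `sigma` here; the classical fourth-order Runge-Kutta step is used as one representative Runge-Kutta method. All function names, the step-like rates and the numeric values are hypothetical and serve only as an illustration, not as the implementation used in the Tool.

```python
def seir_rhs(t, y, beta, gamma, sigma, n_pop):
    """Right-hand side of the SEIR system for the state y = (S, E, I, R)."""
    s, e, i, r = y
    new_exposed = beta(t) * s * i / n_pop   # flow S -> E
    new_infectious = sigma * e              # flow E -> I
    new_removed = gamma(t) * i              # flow I -> R
    return (-new_exposed,
            new_exposed - new_infectious,
            new_infectious - new_removed,
            new_removed)

def euler_step(f, t, y, h):
    """One explicit Euler step; with h = 1 day this is the daily update."""
    k1 = f(t, y)
    return tuple(yi + h * ki for yi, ki in zip(y, k1))

def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(t, y)
    k2 = f(t + h / 2, tuple(yi + h / 2 * ki for yi, ki in zip(y, k1)))
    k3 = f(t + h / 2, tuple(yi + h / 2 * ki for yi, ki in zip(y, k2)))
    k4 = f(t + h, tuple(yi + h * ki for yi, ki in zip(y, k3)))
    return tuple(yi + h / 6 * (a + 2 * b + 2 * c + d)
                 for yi, a, b, c, d in zip(y, k1, k2, k3, k4))

def simulate(step, beta, gamma, sigma, y0, days):
    """Advance the state day by day with the chosen one-step method."""
    n_pop = sum(y0)

    def f(t, y):
        return seir_rhs(t, y, beta, gamma, sigma, n_pop)

    y = tuple(float(v) for v in y0)
    rows = [y]
    for day in range(days):
        y = step(f, float(day), y, 1.0)
        rows.append(y)
    return rows

# Hypothetical scenario: measures introduced on day 20 lower Beta and raise Gamma.
def beta(t):
    return 0.5 if t < 20 else 0.2

def gamma(t):
    return 0.1 if t < 20 else 0.2

y0 = (999_990, 0, 10, 0)  # S, E, I, R on the first day (illustrative values)
euler_rows = simulate(euler_step, beta, gamma, 0.2, y0, days=60)
rk4_rows = simulate(rk4_step, beta, gamma, 0.2, y0, days=60)

for day in (0, 20, 40, 60):
    print(day, round(euler_rows[day][2], 1), round(rk4_rows[day][2], 1))
```

Both schemes preserve the total population up to rounding, because the four right-hand sides sum to zero; they differ in how closely the daily values follow the continuous solution.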
This is very often the case with the seasonal flu, where the medical authorities do not actively undertake special measures to restrict the social behaviour of the citizens. However, due to the specifics of COVID-19 the situation has become more dramatic, and it has required the intervention of the governments in order to avoid overloading the National Health systems. The authorities have introduced very strong restrictive measures which have essentially influenced the dynamics of the parameters β and γ. For the majority of the states these measures have been introduced not in one step but most often in at least two steps.
In view of the above it makes sense to seek mathematical models which model as well as possible the dynamical change of the parameters β and γ. We have decided on a spline structure with two important breakpoints, Node1 and Node2, which reflect the control exercised by the authorities in the form of restriction measures. Also, it is natural to assume that in between these dates the control measures change the parameters β(t) and γ(t) in a monotone way, i.e. β(t) is decreasing whereas γ(t) is increasing.
Technical description of the TVBG-SEIR spline model:
1. We denote the Start date by T1; this corresponds to a date when the first cases of COVID-19 are announced. Alternatively, we may choose T1 to be a date when the steeper growth of the epidemic starts. We denote by T4 the end date (usually chosen to be Today).
2. We choose two interior nodes for the interpolation splines modeling the coefficients Beta and Gamma: Node1 = T2 and Node2 = T3. This corresponds to two steps of the introduction of Restrictive Measures imposed by the authorities of the country XX. Normally, the date T2 may be the First restrictive measures date, or a date close to it, and T3 may be the Second restrictive measures date, or a date close to it.
3. The model is supposed to reflect the natural expectation that once official restrictions are in place they will cause an essential change in the Transmission and Removed rates, although not immediately. We assume that the Beta rate β(t) is monotone decreasing in time, which corresponds to the natural expectation that the more restrictive the measures, the smaller the Transmission rate. Respectively, the Gamma rate γ(t) is assumed to be monotone increasing, meeting the expectation that the stronger the measures, the bigger the removal rate.
4. We assume that β(t) and γ(t) are constant between the Start date T1 and the first node T2, i.e. β(T1) = β(T2) and γ(T1) = γ(T2). This corresponds to the "still" life of the society (without containment measures), when the Transmission and the Removal rates are nearly constant.
5. To be more precise, the splines which we consider are not the usual polynomial ones, but the so-called Exponential splines, which depend on a parameter in the exponent. This gives a fast decay to the next target value of the Beta rate; respectively, it gives a fast increase to the next target value of the Gamma rate. This corresponds to the expectation that the society switches from one level of the restrictive measures to another relatively fast, and the speed is reflected by the size of the exponent we decide to choose. (Figures: examples of the dynamics of the β(t) and γ(t) rates.)
6. An important property of the TVBG-SEIR model is that, due to the above spline model for the Beta and Gamma parameters with its fast transition to the next target value, a classical SEIR model with constant β(t) and γ(t) holds on larger sub-intervals. In particular, this makes it possible to provide a reliable estimate of the Basic Reproduction Number (Ratio).
7. The Reproduction number (ratio) is a key variable for all models of epidemics, see [4], [9], [14], [15].
Following [15] (formula (2.4)), for the case of the SEIR models with constant rates β and γ, the Reproduction number is given by the formula
\begin{equation}
\mathcal{R}_0 = \frac{\beta}{\gamma},
\end{equation}
where we have assumed that the natural birth and mortality rates are small and also equal. Due to the above remark, we may extrapolate the above formula to all time points of interest by putting
\begin{equation}
\mathcal{R}_0(t) = \frac{\beta(t)}{\gamma(t)}.
\end{equation}
16. The Figure below shows the fitting by the model curve R(t) of the recovered and fatalities data for Bulgaria:
17. One of the most important tests for the quality of the model is to see how well the optimized model identifies the dates on which the authorities introduced the Restrictive measures. This is clearly demonstrated below on the data for Bulgaria.
18. Final remark about the "parsimonious" style of constructing the spline model: one has to avoid putting too many nodes in the splines, since this will influence the stability of the model and might cause overfitting, and hence would spoil the predictive power of the model.
Here we demonstrate how the Tool works in the case of Bulgarian data. The Tool will be described in detail in Section 7. We provide the visualizations of the model fitting which are available in the Tool. The thick red line shows (until Today = T4) the fitted model curve for the daily new infected cases I(t), and the blue stars are the official data Idata(t). The thin red line shows the prediction scenarios after Today.
If Coef1 is less than 1, then this means that the Beta measures are "weaker"; the smaller Coef1, the weaker the Beta measures, and they will reach a target value at the date T5, which is defined by the size of Coef1 (note that Coef1 < 1 means that the Beta rate will be bigger!). In a similar way, if Coef2 is less than 1, then this means that the Gamma measures will be "weaker", and the smaller Coef2, the weaker the Gamma measures (note that then the Gamma rate will be smaller!). A target value (determined by the size of Coef2) will be reached at the date T5.
5. On the other hand, if Coef1 or Coef2 is bigger than 1, this means "strengthening the measures", respectively the Beta measures and the Gamma measures, in the period [T4, T5] to some target value defined by Coef1, Coef2.
6. The USER further has the possibility to decide what will happen after the date T5: to weaken the Beta and the Gamma measures, or to leave them the same. This is decided by the choice of two coefficients, Coef11 for the Beta measures and Coef22 for the Gamma measures. Coef11 = 1 means that one retains the same level of the Beta measures; Coef22 = 1 means that one retains the same level of the Gamma measures. If Coef11 is bigger than 1, then this relaxes the Beta measures; the bigger Coef11, the greater the relaxation. Coef22 does the same for the Gamma measures.
As we already said, presently it is urgent to consider SIR/SEIR models with time-varying β(t) and γ(t) rates. Let us mention some research about solving an inverse problem for finding a time-varying β(t) in a SIR model with a fixed removal rate γ [10], where the time-varying transmission rate β(t) is determined by the infectious cases.
In [1], the authors study specific models for the transmission rate β(t) and provide further references to research on such models.
But these measures cannot be relaxed more. Also, relaxing both measures (Coef11 = Coef22 = 1.4) after 13-May-2020 will not be good, as seen from the Figure below.
According to our Model1 for Germany (with Fval = 20.849*10^3, where the maximum value of Fval is 118.482*10^3, hence the ratio max/min is about 6), one may safely relax the Beta measures just a little (Coef11 = 1.4), which brings less than 1000 infected per month:
On the other hand, Model2 (which is somewhat less reliable since it fits the data a little worse, Fval = 21.368*10^3) with both measures relaxed (Coef11 = Coef22 = 1.4) gives again less than 1000 infected per day:
References
[1] SIR model with time dependent infectivity parameter: approximating the epidemic attractor and the importance of the initial phase, HAL Id: hal-01677886
[2] A Contribution to the Mathematical Theory of Epidemics
[3] The Incubation Period of Coronavirus Disease From Publicly Reported Confirmed Cases: Estimation and Application
[4] Modelling Infectious Diseases
[5] A dynamic model of bovine tuberculosis spread and control in Great Britain
[6] Impact of spatial clustering on disease transmission and optimal control
[7] Foot-and-mouth disease under control in the UK
[8] Early dynamics of transmission and control of COVID-19: a mathematical modelling study, The Lancet
[9] Transmission Dynamics and Control of Severe Acute Respiratory Syndrome
[10] Extracting the time-dependent transmission rate from infection data via solution of an inverse ODE problem
[11] Introduction to Numerical Analysis
[12] Estimation of the Transmission Risk of the 2019-nCoV and Its Implication for Public Health Interventions
[13] An updated estimation of the risk of transmission of the novel coronavirus (2019-nCov)
[14] How generation intervals shape the relationship between growth rates and reproductive numbers
[15] Perspectives on the basic reproductive ratio
All authors acknowledge the support by grants DH-02-13 and KP-06-N32-8 of the Bulgarian NSF. The first-named author acknowledges the partial support by Grant No BG05M2OP001-1.001-0003, financed by the Science and Education for Smart Growth Operational Program (2014-2020) and co-financed by the European Union through the European Structural and Investment Funds.
Obviously, from a practical point of view one has to take into account the predictions of the three models. And Model3 is also not that optimistic: