key: cord-332618-8al98ya2 authors: Barraza, Néstor Ruben; Pena, Gabriel; Moreno, Verónica title: A non-homogeneous Markov early epidemic growth dynamics model. Application to the SARS-CoV-2 pandemic date: 2020-09-18 journal: Chaos Solitons Fractals DOI: 10.1016/j.chaos.2020.110297 sha: doc_id: 332618 cord_uid: 8al98ya2 This work introduces a new markovian stochastic model that can be described as a non-homogeneous Pure Birth process. We propose a functional form of birth rate that depends on the number of individuals in the population and on the elapsed time, allowing us to model a contagion effect. Thus, we model the early stages of an epidemic. The number of individuals then becomes the infectious cases and the birth rate becomes the incidence rate. We obtain this way a process that depends on two competitive phenomena, infection and immunization. Variations in those rates allow us to monitor how effective the actions taken by government and health organizations are. From our model, three useful indicators for the epidemic evolution over time are obtained: the immunization rate, the infection/immunization ratio and the mean time between infections (MTBI). The proposed model allows either positive or negative concavities for the mean value curve, provided the infection/immunization ratio is either greater or less than one. We apply this model to the present SARS-CoV-2 pandemic still in its early growth stage in Latin American countries. As it is shown, the model accomplishes a good fit for the real number of both positive cases and deaths. We analyze the evolution of the three indicators for several countries and perform a comparative study between them. Important conclusions are obtained from this analysis. -We present a mathematical model based on a new stochastic process described by a Pure Birth process. -The proposed model matches the subexponential growth on the early stage of an epidemic. -The mathematical expression of the cumulative case incidence and cumulative death curves is obtained, with a quite accurate fit in both cases. -The model contains two parameters, the immunization and infection rates. The behavior in time of those parameters allows to assess the evolution of the outbreak. -We obtain a new indicator, the mean time between infections. This indicator allows not only to monitor the epidemic growth but also to predict the peak of cases. The coronavirus disease transmitted by the SARS-CoV-2 appeared in China by the end of 2019. More than 17000000 of cases and 675000 deaths were reported since then. The lack of a vaccine and an effective treatment for this disease forced the governments to adopt lockdown actions in order 5 to protect the population and to slow down the spread of the outbreak. Despite those actions, health systems in several countries resulted overwhelmed. Many mathematical models were developed in order to predict the behavior of the outbreak in several ways, most of them based on the well known SIR (Susceptible-Infectious-Recovered) models. We present in this work a different 10 approach having the advantage of a rapid and clear interpretation of its parameters. We obtain this way a quite useful new indicator, the mean time between infections (MTBI). Modeling contagion has attracted the interest of statisticians for decades. These models are useful to model the spread of contagious diseases in a popula- 15 tion or plantation. A well-known example is the Polya urn model, where balls of different colors are extracted from an urn in such a way that when a given ball is extracted, not only is it put back but also a certain number of balls of the same color is added. The probability of getting a ball of that color in the next drawing is thus increased, modeling a contagious process. Contagion modeling 20 has been already applied in many areas of engineering, see for example [1, 2, 3]. The Polya stochastic process is obtained as the limit from the Polya urn model in a similar way as the Poisson process is obtained from a Bernoulli process. As it happens with the homogeneous and non-homogeneous Poisson processes, the Polya stochastic process can be described by the Pure Birth equation 25 2 (see chapter 17 of [4] ). The main difference between these two types of Pure Birth processes is that, in the Polya process, the event rate is not just a function of time but also of the number of individuals in the population. There are also cases where the event rate depends only on the number of previous events but not on time, like the Yule process. However, since the probability of births rises 30 due to an increase in the population while the probability of an individual birth remains constant, it does not describe a true contagion process. The stochastic mean of the Polya process is a linear function of time, as will be shown. This characteristic makes the Polya process unsuitable for application in many real cases of disease growth dynamics and engineering. Hence, 35 we propose a different model based on a Pure Birth process with an event rate that, like Polya's, depends on both the elapsed time and the number of previous events, but with a different functional form. We obtain this way a mean number of events that is a nonlinear function of time, with either an increasing or decreasing first derivative provided a given parameter is greater or less than 40 one. An important application of our proposed model is the study of the spreading of diseases. We obtain this way a process controlled by two competitive phenomena, infection and immunization, i.e. two opposing forces. Hence, we get a model with two parameters: the transmission and immunization rates. The mean value function shows different behavior according to whether the ratio between infection and immunization rates is greater or lower than one. In the former case, its second derivative is positive and the mean value curve is convex, whereas on the latter case its second derivative is negative and the resulting curve is concave. The limit case of infection/immunization ratio γ/ρ = 1 is also 50 possible, resulting on a mean value function that grows linearly over time. The evolution of the parameters shows how well the pandemic is being controlled, either by lowering the infection rate via isolation and quarantine measures or by increasing the immunization rate with a vaccination campaign. We obtain this way three useful indicators: the infection/immunization ratio, the immunization 55 rate and the mean time between infections (MTBI). Our model is quite suitable to be applied to early epidemic growth dynamics, where exponential models fail (this matter is discussed in [5] ). Since ours falls in the category of subexponential models, the results presented in this article support the idea developed in the cited reference. Another advantage of our 60 approach is that, as a subset of the subexponential cases, we are able to model cases with linear and sublinear (concave) cumulative case incidence curves, as it will be shown. As an interesting and quite important application, we apply our model to the recent SARS-CoV-2 pandemic as an alternative to the SIR models. Our 65 theoretical values match real data, which allows us to assess the evolution of the oubtreak and to predict its trend a few days ahead. The parameters can be easily estimated through simple methods, such as Least Squares, applied either to positive cases or deaths. We show applications to data from several countries with good agreement. This paper is organized as follows: Motivation and related work are exposed in Section 2, the proposed model is presented in Section 3, applications to the SARS-CoV-2 pandemic are developed in Section 4 and some discussions on the results we obtained are presented in Section 5. The final conclusions are exposed in Section 6. Our main motivation is to obtain a model that describes an epidemic outbreak at its first stage, before it reaches the inflection point in the case incidence curve, which is useful to monitor how contagion is spreading out. This 80 first stage corresponds to an R naught greater than one. Our model is inspired in the Polya-Lundberg process, which comes from the contagious urn model as explained in the previous section. Since the mean value function of the Polya-Lundberg process is a linear function of time (see Appendix B), we introduce a modification in the event rate in order to get a mean value function that grows 85 subexponentially with either positive or negative concavity as we observe in the early epidemic growth curves usually reported. There exists a wide variety of models that can be used to describe the evolution of an epidemic. Main standard is held by the so called compartmental 90 models, i.e., the family of SIR based models (SIR, SEIR, SIRS, etc. See for instance [6, 7, 8, 9] ). These models consist of a set of differential equations that take into account transitions between different compartments. The first classical SIR model was porposed by Kermack and McKendrick [10] . Due to the recent COVID-19 coronavirus pandemic, most of the scientific 95 community is dedicated to study its behavior, both by using SIR based models and introducing new ones. In [11] , a stochastic term is introduced in the system of differential equations to simulate noise in the detection process. In [12] the authors use an ARIMA based method to correct errors arising from the delay between the day the sample is taken and the day a diagnosis is made, which compartmental models can be found in [15] . In [16, 17] some well-known ma-110 chine learning and time series algorithms (like ARIMA and SVR) are used to learn from a present dataset and forecast the case incidence curve up to the following six days. Combinations of SIR models embedded with some random components was proposed in very recent articles, see for instance [18, 19, 20] . A theoretical framework to model epidemics by a non homogeneous birth-and-115 death process was proposed in [21] . An extensive analysis of epidemics at their early stage was performed in [5] . In this study, the authors propose a phenomenological model (see also [22, 23] ) and analyze the application of SIR based models with either exponential and subexponential behavior. Our approach is quite different. On one hand, our 120 model is stochastic whereas theirs is deterministic. On the other hand, we obtain our contagion model as a special case of a very general and well-known probabilistic model, the Pure Birth process. It is interesting to compare the differential equations the cited authors arrive with ours, which will be done in a future publication. Furthermore, and unlike them, we are also able to model 125 cumulative case incidence curves that grow with negative concavity. The model we propose is based on a Pure Birth process. These processes describe the evolution of a population where individuals can only be born. We are then modeling the epidemic spread as births in a population, where every 130 birth corresponds to a new infection case. Our novel approach consists of finding a proper incidence rate that describes the contagion phenomenon. We develop next the fundamentals of Pure Birth processes and our proposal. We propose to get the probability of having r infections in a given time t 135 from a Pure Birth stochastic differential equation. Pure Birth processes describe the behavior of a population where individuals can only be born and are not allowed to die. All the individuals are assumed to be identical. The probability of having r individuals in the population at a given time t, P r (t), is given by the solution of the following differential equation, see [4, 24] : (1) Assuming we have 0 individuals at time 0, we impose the following initial conditions on Eq. (1): P 0 (0) = 1 and P r (0) = 0 for r ≥ 1. To see that P r (t) is The process given by Eq. (1) is also markovian, where the number of individuals corresponds to the state of the system, S r (t). The only transitions 145 allowed 2 are S r (t) → S r (t + dt) and S r (t) → S r+1 (t + dt) with probabilities: The dependence of λ r (t) on t defines the type of process. If λ r (t) is a function of t only or a constant, then the process is non-homogeneous or homogeneous respectively, with independent increments in both cases. If λ r (t) is also a function of r, the process has dependent increments. From Eq. (1), the probability of having no births in a certain time interval greater than t − s given the population has r individuals by the time s is given by the well-known exponential waiting time: The mean number of infections in a given time is Under certain conditions (see Appendix B to verify those are satisfied by 2 Actually, other transitions have probabilities that are higher order infinitesimals. our model) we have the following differential equation for the mean value: Depending on the proposed function for λ r (t), the mean number of failures may or may not be easily obtained. Our formulation allows us to calculate the mean time between infections (MTBI), this is, the mean time between births in a Pure Birth process. From Eq. (3), we can predict the mean time between infections (MTBI) after r infected individuals were detected by the time s using a model with incidence rate λ r (t) as: Here, Z is a normalizing constant to consider cases where the probability of 155 having no infections in an infinite time interval is greater than zero and is given by Eq. (7) (see Appendix C for a full deduction): Eq. (6) gives the expected time to the next infection given that by the time s the infected population consists on r individuals, this is, the mean time between two consecutive detections. This indicator is, in most cases, U-shaped (see Fig. 1 ). An exceptional case is also shown in fig. 4b , and will be discussed later). It is expected to decrease at first, showing the acceleration of the spread (infections occur more often). When the cumulative case incidence curve reaches the inflection point and the epidemic starts to mitigate, MTBI shows a flattening portion (infections occur 165 at a constant rate). Then, due to the deceleration of the epidemic (infections occur less often) MTBI tends to get larger again. This curve resembles the bathtub curve, well known in Reliability Engineering, which is also obtained when this class of models is applied to reliability studies, see [25] . 3.2. Proposed incidence rate 170 We propose a two-parameter incidence rate given by the following expression: The parameters involved in Eq. (8), indicate how fast the outbreak is spreading. We can see that the γ parameter is related to the strength of the contagion effect, since it is a factor of the number of infections, and the ρ parameter is related to the outbreak mitigation, either by natural immunization or due to external measures. That is the way how these two parameters, once estimated, 175 are useful to monitor the disease progress and the impact of actions taken by health institutions. These actions can affect either the γ parameter via the population solation and quarantine, or the ρ parameter by, for instance, vaccination campaigns. As we will show, the ratio between both rates is the exponent of 9 time t on the mean value function (Eq. (11)), and it determines the curve's 180 concavity, provided it is greater or less than one. We can compare our proposed incidence rate with other formulations by rewriting Eq. (8) as follows: The numerator of the second term in Eq. (9) can be compared with the numerator of the incidence rate analyzed in [26] (Eq. 1: is the number of infections, β the transmission probability per contact and c the 185 contact rate) by replacing I(t) with r and β c with γ; our γ parameter can be interpreted as the transmission rate in the same way. In that proposal, the N (t) denominator corresponds to the total population, since the authors consider the per capita incidence rate. As an interesting remark, it can be seen that it also coincides with that of the Polya-Lundberg process. The first term in Eq. (9) is 190 equal to λ 0 (t), hence it can be considered as the initial incidence rate. On the other hand, following the usual behavior of ρ, this term rapidly vanishes with time. Being the ratio γ ρ the coefficient of r in the contagion rate (Eq. (8)), it takes into account both the infection and immunization rates, so it can be directly re- The evolution of these two parameters, the ratio γ ρ and ρ as a function of time, is indicative of how well the epidemic is being controlled by actions taken from health institutions. We expect a strong decrease in γ ρ , and in consequence, a strong increase in ρ. Our definition of the outbreak parameters deserves a detailed analysis. The well-known definition of the basic reproduction number R 0 assumes all the population is susceptible and no immunization is present. This would imply ρ to be zero; in that case, Eq. (8) reduces to λ r (t) = λ r. which is the event rate of another classic stochastic model, called the Yule 205 process. In that case, the event rate given by Eq. (10) grows linearly with the population size, which results in an exponential expression for the mean value function (see [27] for more details about the Yule process). This means that, if we allow our ρ to eventually be zero, the Yule process becomes a special case of ours, hence exponential growth can also be modeled. However, as it can be 210 seen in Table 1 , even the worst case scenarios (very sharp incidence curves) differ from exponential growth, being this another empyrical confirmation of Chowell's thesis presented in [5] . It should be remarked that our proposed rate (Eq. (8) (5), as demonstrated in Appendix B. From the incidence rate (Eq. (9)), we are able to get the exact solution of P r (t) (see Appendix A) and therefore we can show that M (t) is finite and Eq. As seen from Eq. (11), the mean value obtained from our model is a nonlinear function of time with a positive or negative concavity whether γ ρ is greater or lower than one, as shown in Fig. 2 . This expression is the functional form that For our proposed model, it is straightforward to obtain an expression for the MTBI by inserting Eq. (8) into Eq. (6). This leads to Eq. (12): Replacing r by the mean number of infections given by Eq. (11) we get Eq. (13) which is a useful conditional expression, as shown in Appendix C. In this section, we show the indicators obtained from several countries' re-235 ports. Data were taken up to early July 2020 from https://ourworldindata. org/coronavirus-source-data. For Asian and European countries, and also the USA, only data from the early epidemic stages were considered for analysis, this is, up to the inflection point. Real data and fitted curves are shown in Figs. 3 and 5. We fit Eq. (11) to the incidence and death curves of the recent CoV-2 pandemic in order to obtain the three mentioned indicators. The case incidence curves are shown in Fig. 3 , while In order to show that the proposed model fits well not just positive concavities as shown before, we consider the data from Uruguay, which presents Cumulative number of deaths curves are shown in Fig. 5 . The parameters and the coefficient of determination are shown in Table 2 . In this section we show the evolution of the γ ρ parameter obtained from our model as a function of time, for the analyzed countries. In order to obtain the evolution of this parameter over time, we perform several estimation runs with successive subsets of the dataset and record each estimate. This picture is indicative of how effective the measures taken by different 265 countries have been, or how fast the outbreak is reaching the mitigation stage. As the transmission rate approaches the immunization rate, this parameter reaches the value of 1 as a limiting case. Curves are depicted in fig. 6a . As previously discussed, the strength force that contains the outbreak is 270 determined by the immunization rate ρ parameter, and we expect it to increase due to actions taken by governments. Then, we analyze the evolution of this parameter in the same way as it was done for γ ρ in the previous section. Values of ρ as a function of time are depicted in fig. 6b. It is interesting to analyze the current behavior of the parameters for Latin American countries, since they are still at the first stage of the pandemic and where, with the exception of Brazil, the lockdown was relaxed and turned strict again more than once. Curves are depicted in Fig. 7 . spreading out and the inflection point has not been reached yet. This indicator is useful in order to estimate the moment when the curve will start to flatten. In Fig. 9 we depict the MTBI for China in order to see the flattening section 285 of this indicator as the cumulative cases of incidence reaches its inflection point. By the time we are writing this report (early July 2020), contrarily to Europe and the USA, Latin American countries have not yet reached the inflection point in the cumulative incidence curve, which can also be seen through the MTBI characteristics. This inflection point in the cumulative incidence curve 290 corresponds to a minimum in the MTBI indicator, as seen in Fig. 8 for Italy. It Table 3 . Cumulative Incidence Figure 9 : Mean time between infections and cumulative cases incidence for China. The well goodness of fit obtained for several countries shows that the spreading is governed by the same law. Although with different parameter values, the cumulative case incidence curve follows the same mathematical expression. Those parameters indicate the level of contagion and the effectiveness of the actions taken by governments. The model fits rather well the number of in-300 fected and deaths in the population previous to the inflection point, at the early pandemic growth stage, though its prediction beyond four or five days ahead is generally too pessimistic and overestimates the actual data. The MTBI minimum values indicated in Table 3 show an interesting result. in R that implements our proposed model, and which was used to obtain the shown curves, is available at [28] . It can be easily proved that Eq. (1) has a unique solution and is given by (see also [29] ): is a probability mass function provided that ∞ r=0 P r (t) = 1; this is shown by following the same steps as in section 4 of Chapter XVII from [4] (see also [30] , section 4.3.2), where it is proved for the case λ r (t) depends on r but not on t. The authors are working on a future publication where this property will be properly generalized to functions that depend on t as well. Recall our proposed functional form of λ r (t): Let the function µ r (t) be: It can be seen from Eq. (A.3) that µ r (t) = λ r (t) and µ r (0) = 0 ∀r ≥ 0. We now define the auxiliary function which does not depend on r. Note that µ ∆ (0) = 0. It is a fact that the following equality holds: The proof can be done by differentiating the right side of Eq. (A.5) to show it's indeed a primitive of the left side's integrand. Since the functions are also equal on t = 0, fundamental theorem of calculus yields that the equalty is valid for every t. Considering the initial condition P 0 (0) = 1 (the beginning population is 0), the probability mass functions are given by: This is demonstrated by induction over r, using Eq. (A.6) as inductive hypothesis and P 0 (t) from Eq. (A.1). In fact, We can rewrite Eq. (A.6) as: Using the fact hat µ r (t) = µ 0 (t) + rµ ∆ (t) and replacing µ 0 (t), µ 0 (0) and µ ∆ (t) by their expression yields: (A.8) It should be noted that, since the expression Eq. (A.1) is also valid for the case λ r (t) = ρ γ ρ +r 1+ρ t , we can define µ r (t) and µ ∆ (t) so that the same properties are achieved. Therefore, following the same steps we get the expression of the 365 pmfs for the Polya-Lundberg process. A similar procedure provides the pmfs for the Yule process. Multipliying Eq. (1) by r and summing we obtain Since the condition lim r→∞ r λ r (t) P r (t) = 0. (B.1) is attained for our model, we get ∞ r=1 r P r (t) = ρ 1 + γ ρ r 1 + ρ t P r (t) = ρ 1 + ρ t The same procedure that leads to Eq. (B.3) is also valid for the Polya-Lundberg process. Replacing λ r (t) by its expression and solving the differential 370 equation results in a linear mean value function. In the same way, with λ r (t) = λ r, we get an exponential mean value function, which corresponds to the Yule process. Appendix C. Mean time between infections calculation for the contagion model 375 We begin from the exponential waiting time (Eq. (3)), which we can rewrite as P (T r ≥ t|t r−1 ) = exp − Image segmentation and labeling using the polya urn model An Introduction to Probability Theory and Its Applications Mathematical models to characterize early epidemic growth: A review An Introduction to Mathematical Modeling of Infectious Diseases Infectious Diseases of Humans: Dy-405 namics and Control, Dynamics and Control Modeling the Spread of Infectious Diseases: A Review Modeling epidemics: A primer and numerus model builder 415 implementation A contribution to the mathemati-420 cal theory of epidemics Analysis of stochastic delayed sirs model with exponential birth and saturated incidence rate Llanovarced-Kawles,Álvaro Olivera-Nappa, Statistically-based method-430 ology for revealing real contagion trends and correcting delay-induced errors in the assessment of covid-19 pandemic Inference of the generalized-growth model via maximum likelihood estimation: A reflection on the impact of overdispersion Covid-abs: An agent-based model of covid-19 epidemic to simulate health and economic effects of social distancing interventions Agent-based modeling vs. equation-based modeling: A case study and users' guide Short-term forecasting covid-19 cumulative confirmed cases: 455 Perspectives for brazil L. dos 460 Santos Coelho, Forecasting brazilian and american exogenous variables Sir epidemics with stochastic infectious periods, Stochastic Processes and their Applications Dynamic behaviors of a twogroup stochastic sirs epidemic model with standard incidence rates A stochastic sir epidemic model with lévy jump and media coverage Approximation of epidemics by inhomogeneous birth-and-death processes A generalized-growth model to characterize the early ascending phase of infectious disease outbreaks Using phenomenological models for forecasting the 2015 ebola challenge Introduction to Stochastic Processes Increasing failure rate software reliability models for agile projects: A comparative study The abc of terms used in mathemat-505 ical models of infectious diseases Wiley series in probability and 510 statistics Introducing the non-homogeneous compound-515 birth process An Introduction to Markov Processes Authors declare no conflict of interest related to this article.