key: cord-0213547-dz498w2b authors: Alos, Elisa; Mancino, Maria Elvira; Merino, Ra'ul; Sanfelici, Simona title: A fractional model for the COVID-19 pandemic: Application to Italian data date: 2020-07-31 journal: nan DOI: nan sha: 6e7c8e870fe37f7692277fc2d456809fed6e9463 doc_id: 213547 cord_uid: dz498w2b We provide a probabilistic SIRD model for the COVID-19 pandemic in Italy, where we allow the infection, recovery and death rates to be random. In particular, the underlying random factor is driven by a fractional Brownian motion. Our model is simple and needs only some few parameters to be calibrated. Since the start of the COVID-19 pandemic many models have been proposed in the literature to explain its dynamics. Most of these approaches are compartmental models, where the population is divided into compartments that describe the different situations regarding the infectious disease (like susceptible, infected, etc.), and people progress between them. The basic reference compartmental model is the SIR model, where the population is divided into susceptible (S), infected (I) and recovered (R). As this model is too simple to describe the complexity of several epidemics, some extensions have been proposed in the literature. For example , the SIRD model considers also the compartment of deaths (D), and the SEIR introduces exposed (E). Some other classical models, like the SEIRS, also allow to model the lost of immunity after recovery. Some recent approaches pay special attention to non directly observable compartments as asymptomatic (A) (see, for example, [3] and the references therein) that, even not being directly observable, play a crucial role in the pandemic. Other extensions consider time-dependent parameters, as in [1] , where the kinetic of the rates of infection (β) and death (µ) are modeled by exponential functions, while the recovery rate (γ) is of logistic type. Recent literature also exploits stochastic models, where the main variables account for a noise (Brownian) component, or branching processes [2, 5, 6, 7] . In this paper, we introduce a probabilistic SIRD model, where we allow the coefficients to be stochastic processes determined by a few number of parameters. This randomness is a way to accommodate the effect of several unobservable factors, as the loose of immunity or the existence of asymptomatic. Our construction of the model is motivated by the descriptive analysis of the parameter time series, where we observe that the β returns have negatively correlated increments, an observation that suggests to model them by a fractional Brownian motion (fBm) with a Hurst parameter H < 1 2 [4] . We recall that this approach does not need to consider a drift for β, but this drift arises simply by the properties of the fBm. Once modeled β, we see that simple relationships between the evolution of diagnosed and infected, allow us to model γ and µ. The model is calibrated in such a way the mean paths of infected, death and recovered fit the corresponding observed values, as well as the variability of β, γ, and µ. Our approach is not only able to adjust observed data, but also to study the different possible scenarios according to its stochastic behavior. The paper is organized as follows. In Section 2 we present a descriptive analysis of β, γ and µ corresponding to the evolution of the pandemic in Italy ( https://raw.githubusercontent.com/ pcm-dpc/COVID-19/master/dati-andamento-nazionale/dpc-covid19-ita-andamento-nazionale. csv). Section 3 is devoted to present our stochastic SIRD model for the Italian COVID-19 outbreak. The model is calibrated in Section 4, while we simulate some different scenarios in Section 5. Finally, a discussion of the results and proposals of future research are presented in Section 6. Let us consider a stochastic SIRD model of the form where S, I, R, D = {S n , I n , R n , D n , n = 1, ..., 150} denote the number of daily observed susceptible, infected, recovered and death, N is the population size and β, γ, µ = {β n , γ n , µ n , n = 1, ..., 150} represent the rates of infection, recovery and death. As S/N ≈ 1 in all the data set, we consider the following simpler version of the above model Now we observe the behavior of this model in Italy in the period from the 24/2/2020 to the 28/7/2020. In Figure 1 we can observe the paths of I (totale positivi in the data set), R (dimessiguariti), D (deceduti) and the total number of diagnosed (I + R + D, totale casi). Now let us observe β, γ, µ. In Figure 2 we can see the behavior of β, that is a decreasing function of time. This fits what expected due to the lock-down, that reduced the number of contacts between individuals. In Figure 3 , we can see the corresponding increments (i.e., ∆β = β n+1 − β n ), that are more variable at the beginning of the sample, when β is higher. Moreover, in Figure 4 we can see the returns ∆β β that have a stationary mean. After deleting the outliers at days 115 and 119 (making them equal to zero), the corresponding autocorrelation function, see Figure 5 , reveals a negative correlation between increments. The empirical analysis suggests to model β as a process with negative correlated increments with a decreasing mean. This process is designed in Section 3. The time behavior of µ can be seen in Figure 6 . We observe that it is similar to the behavior of β, but with a delay. Moreover, Figures 7 and 8 show a clear relationship between the daily new diagnosed and deaths, with a delay of 4 days. 1 In Figure 9 , we can see a strong link between total diagnosed and deaths. In Figures 10 and 11 we can see that this relationship is linear. Then, as the number of new infected is given by βI, the daily increments of deaths (∆D) would be modeled simply as c µ β n−4 I n−4 , for some positive constant c µ . This leads to the following model for µ: Finally, let us observe the path of γ in Figure 12 . This recovery rate seems to move between an upper and a lower bound, that we can interpret as the rate under stressed conditions of the health system, and the rate when this system adapts to the new scenario and the epidemic curve decays. This is connected to the approach in [1] , where γ is assumed to be described by a logistic function, and the rate of recovery increases from a initial value γ 0 and it stabilizes at some higher level. A measure of the stress of the model can be defined as the quantity where (I + R + D) −30n is the average of the diagnosed people the last 30 days. If there are no new infections, the number of diagnosed remains stable, and then the above quantity is near 1. Otherwise, a high increment of new cases translates into a decrease of the value of this ratio. In Figure 13 , we can see the behavior of this process We can also observe in Figure 12 a big variance, as expected due to the fact that not everybody recovers at the same speed. Then we would like to add some noise, multiplying this time series by an adequate random factor, as we detail in the following section. We propose a stochastic SIRD model where randomness is embedded by modeling the parameters of infection (β), mortality (µ) and recovery (γ). In the following subsections, we will describe the model for each one of the parameters. As the return of β are negatively correlated, it is natural to construct a model based on the fractional Brownian motion (fBm). The fBm is a Gaussian process with stationary increments, which depends on a parameter H ∈ (0, 1) called the Hurst parameter. More precisely, a process . Except in the Brownian motion case H = 1/2, the fBm increments are correlated. Denote X n = B H n − B H n−1 and define ρ H (n) := Cov(X 1 , X n+1 ). Then we have Notice that this quantity is positive for H > 1 2 and negative if H < 1 2 . Then, a natural candidate to model β is given bŷ where B H is a fractional Brownian motion with H < 1 2 and k is a constant. Apart from the negative correlated returns, this process has also a decreasing mean, as we prove in the following result. Proposition 3.1. Consider a fractional Brownian motion B H with H < 1 2 and the process defined by (8). Then, provided β 0 > 0, E(β n ) is decreasing as n → ∞. The above product can be written as Now, we can compute the expectation using Isserlis Theorem. This gives us that where we have used that partitions P i exist only if i is even. Then, as k i is positive for all even i and all the ρ H (|m j − m k |) are strictly negative, E(A n ) is decreasing. This allows us to complete the proof. Even when the model (8) reproduces some empirical properties of β, a numerical analysis shows that it has to be modified before being an adequate model. More precisely, we can see in Figure 14 that the variability of the paths is high, and that the processβ can even become negative (see also Figure 15 ). Moreover, the estimated correlation between two consecutive increments ofβ does not coincide with the observed for the β returns. A way to reduce the variance of the paths is obviously to define a model of the type where β i , i = 1, .., m are given by As pointed out in 2, the linear relationship between diagnosed (with 4 days lag) and death, suggests modeling µ simply as µ n = c µ β n−4 I n−4 /I n . On the other hand, the observations in Section 2 lead to a model for γ of the type for some positive constants c 1 γ , c 2 γ and for some adequate Hurst parameter H γ . The above models for β, γ, and µ lead to the following stochastic SIRD model: Notice that only 7 parameters have to be calibrated are m, H, H γ and c β , c µ ,c 1 γ , c 2 γ . We see in the following section how the set of the first three parameters m, H, H γ , that define the driving processes of the model, is chosen based on empirical observations, while the last group is calibrated by means of a classical least squares method. The calibration procedure is as follows. • In a first step, we fix reasonable values of m and the Hurst parameters according to observed data, and • fixed m, H, and H γ , we calibrate c β , c µ ,c 1 γ , c 2 γ by a least squares method. Step 1 focuses on choosing adequate driving random processes for the model. As estimating with precision and robustness this group of parameters is not straightforward (see for example Glotter (2007)), so we simply proceed empirically. More precisely, we have seen in Section 2 that the maximum 4.2 Initial guess for c β , c µ , c 1 γ , c 2 γ and global calibration In order to be able to start the calibration of the other parameters, we will need to have an initial guess. To find the parameter c β , we are going to fit the averageβ against the realized β. This gives us a value of 0.32250809. For µ, as we have seen previously, we can obtain a nice estimation by doing a linear regression between the infected and death people. The slope ( 0.13904755) will be the initial guess for c µ . In the case of γ, in order to obtain the variables c 1 γ , c 2 γ , we minimize the distance between the averageγ and the observed γ, as well as the distance between the standard deviation of the returns ofγ and the observed γ. This gives us (0.03512639,0.5) as initial guess. Once we have the initial guess, we find the best possible parameters to minimize the distance between the infected, the death and the recovered time series as well as the empirical variance of the γ by OLS. Following the procedure in the previous section, we get the following estimation of the parameters of the model: Simulating the model and taking the corresponding mean paths we fit the observed data, as we can see in Figure 17 . The same analysis can be done for β, γ, and µ: In the present work, we have extended the classic SIRD model including stochastic parameters and we obtain an easy-to-calibrate pure probabilistic model. We have been able to reproduce the evolution of the parameters of the Italian outbreak using only seven parameters. The properties of the fractional Brownian motion and the relationship between the parameters of the model are able to reproduce the exponential and logistic trends observed for example in [1] . Therefore, we have been able to fit the empirical data as well as imitate the noise of the variables. One of the advantages of having a stochastic model is that we are capable of generating a wide variety of scenarios. One challenging problem is now to model the evolution after a lockdown. This translates into a new dynamics of β, probably driven by a fBm with H > 1 2 . Moreover, the comparison between different countries that applied different policies during the pandemic would be of great interest. Chinese and Italian COVID-19 outbreaks can be correctly described by a modified SIRD model Parameter Estimation on a Stochastic SIR Model with Media Coverage Mathematical modeling of the spread of the coronavirus disease 2019 (COVID-19) taking into account the undetected infections. The case of China Fractional Brownian motions, fractional noises and applications A stochastic model for COVID-19 spread and the effects of Alert Level 4 in Aotearoa New Zealand Stochastic Modelling of the COVID-19 Epidemic. 2020. Available at SSRN Stochastic modeling and estimation of COVID-19 population dynamics