key: cord-169288-aeyz2t6c authors: Runvik, Haakan; Medvedev, Alexander; Eriksson, Robin; Engblom, Stefan title: Initialization of a Disease Transmission Model date: 2020-07-17 journal: nan DOI: nan sha: doc_id: 169288 cord_uid: aeyz2t6c Approaches to the calculation of the full state vector of a larger epidemiological model for the spread of COVID-19 in Sweden at the initial time instant from available data and with a simplified dynamical model are proposed and evaluated. The larger epidemiological model is based on a continuous Markov chain and captures the demographic composition of and the transport flows between the counties of Sweden. Its intended use is to predict the outbreak development in temporal and spatial coordinates as well as across the demographic groups. It can also support evaluating and comparing of prospective intervention strategies in terms of e.g. lockdown in certain areas or isolation of specific age groups. The simplified model is a discrete time-invariant linear system that has cumulative infectious incidence, infected population, asymptomatic population, exposed population, and infectious pressure as the state variables. Since the system matrix of the model depends on a number transition rates, structural properties of the model are investigated for suitable parameter ranges. It is concluded that the model becomes unobservable for some parameter values. Two contrasting approaches to the initial state estimation are considered. One is a version of Rauch-Tung-Striebel smoother and another is based on solving a batch nonlinear optimization problem. The benefits and shortcomings of the considered estimation techniques are analyzed and compared on synthetic data for several Swedish counties. This paper is concerned with using publicly available epidemiological data for estimating suitable initial conditions for a large mechanistic general Susceptible-Exposed-Infectious-Recovered (SEIR) model of the Swedish COVID-19 outbreak. The model incorporates spatial communication between the Swedish municipalities, and also includes the Swedish demographics, thought to be an important factor for the impact of COVID-19 Keeling and Rohani (2008) . The viral contraction is driven by an infectious pressure as in Widgren et al. (2018) ; Engblom et al. (2019) . Fig. 1 provides an overview of the modeling approach and specifies the included compartments. The dynamics of the disease transmission are modeled by a discrete-state continuous-time Markov chain. A continuous state variable, the environmental compartment, is included to model the infectious pressure. The Markov chain model is implemented using the computational framework SimInf in R, Widgren et al. (2019) . To infer the model parameters, the aim is to utilize a Bayesian approach as it allows the use of empirical measures as prior knowledge of the model parameters. The problem of estimating the state vector of a dynamical system backwards in time is known as smoothing. An optimal (minimal variance) fixed-interval smoother for a linear time-invariant model under additive Gaussian noise assumption was derived in Rauch et al. (1965) . Since then, various methods have been devised for more general settings, including state-dependent Gaussian noise (Aravkin and Burke (2012) ) and non-Gaussian noise sources (Wang et al. (2020) ). In the present work, these two complications occur combined, as the process noise is Poisson-distributed rather than Gaussian, and also dependent on the plant state. Therefore, none of the approaches found in the literature is readily applicable here. Instead, to obtain a plausible solution fast, empirical initialization algorithms are developed and compared to determine which one is most suitable in the final setup. To establish ground truth, synthetic data produced by models of increasing complexity are utilized in the performance evaluation. The rest of the paper is organized as follows. First, the model initialization problem is formulated and the properties of the linear time-invariant model that is used to calculate the initial condition are explored. Then, three model-based approaches to solving the initialization problem are presented. Finally, performance of the considered approaches is evaluated on synthetic data and conclusions are drawn. The inputs to the Markov chain model are the parameters inferred from data and an initial chain state. The initial state consists of the epidemiological states in all compartments, including the hidden states, i.e., the exposed and asymptomatic carriers. To find a county-wise initialization, specific to the Swedish COVID-19 outbreak, the cumulative infected cases data reported by the Swedish public health agency were employed Folkhälsomyndigheten (2020a) . In Sweden, a full disease testing strategy was in effect until March 12, after which the testing was heavily restricted Folkhälsomyndigheten (2020b) . With full testing, we assume that the reported cases holds the true number of cumulative infected cases. An accepted standard in stochastic epidemiological modeling is to start simulations when the system has reached some (fairly large) threshold number Allen (2017); Giordano et al. (2020) . We used the threshold of a 100 reported cases which Sweden reached on March 6; the data up until March 12 can therefore be used for smoothing. The problem of estimating the infected, exposed, and asymptomatic populations at a given point in time (model initialization point) is therefore investigated, based on the data for cumulative incidence measured over a fixed time horizon. Thus, the problem at hand constitutes a fixedinterval smoothing problem. The remaining compartments of the Markov chain model do not influence the infected, exposed or asymptomatic populations and are therefore not included at present in the considered estimation problem. Epidemiological mathematical models are typically designed in terms of populations and face difficulties in capturing situations, when only a few individuals are infected. This is logically the case in the beginning of an outbreak. Besides, an epidemic is not readily recognized until the number of patients in the healthcare system becomes significant, thus making initial data scarce and unreliable. Yet, since disease transmission is a dynamical process, a mathematical model of it has to be initialized so that historical data for the observed output agree well with the output produced by the model. As there were no deaths from the disease and very few individuals were in intensive care prior to the chosen point of initialization, the measurements that are used as input to the Markov chain model cannot be used for the initialization of it. Instead, reported county-wise cumulative incidence from the period of February 4th to March 12th 2020 are utilized. As contact tracing was discontinued after this period, incidence data from later times are significantly less reliable. Since direct inversion of a continuous Markov chain is not easily apprehended, the following linear time-invariant approximation is utilized for the initialization of the model for each county, whereas the model states are lumped over the considered age groups. The latter simplification is introduced since the cases were few in the beginning of the outbreak and patient age was not specified in the data. The model is derived as a normal approximation of the Poisson distributed forward steps and formulated in statespace form as . is the discrete time corresponding to daily sampling and w k is the process noise sequence, whose properties will be clarified in Section 2.3. The state vector elements ] stand for the populations of the model compartments according to: The parameters of the model are specified below σ expected rate of transition from the exposed state, γ A expected rate of transition from asymptomatic state, γ I expected rate of transition from infected state, F 0 fraction of transition from exposed reaching the infected state; the remaining fraction reaches the asymptomatic state, F 1 fraction of transition from asymptomatic state reaching the infected state, The remaining fraction corresponds to the recovery from the disease (not included in (1)), β indirect transmission rate of the environmental infectious pressure, ρ infections pressure decay rate, θ A asymptomatic viral shedding rate, θ E exposed viral shedding rate. The parameters are positive and so are the elements of the state matrix F . Therefore, model (1) is also positive, i.e. the state vector belongs to the positive quadrant provided the initial condition x 0 and w k , k = 0, 1, . . . do. The latter condition restricts the distribution of the process noise. To obtain the parameter values for model (1), prior distributions for the Bayesian parameter estimation algorithm of the Markov chain model are utilized. The prior distributions are based on empirical data or published estimates. For parameter values from these distributions, the matrix F tends to have one eigenvalue with magnitude larger than one and is therefore unstable. This is expected, since exponential growth is observed during the early phase of a disease outbreak. Since the cumulative incidence is the only measured signal, the output of the model is where H = [1 0 0 0 0] , and v k is the measurement noise with zero mean and variance R k . The introduction of measurement noise is a matter of complying with the standard assumptions of Kalman filtering and not an actual model property. Model (1), (2) does not possess structural observability for the whole range the parameter values. Some combinations of parameter values sampled from the prior distribution make the observability matrix lose rank. In order to analyze the process noise covariance, each error vector w k is separated into two terms: w k = w 1k + w 0k , where w 1k describes the error of approximating the stochasticity of the full Markov chain model by the linear dynamics of (1), and w 0k captures any other model uncertainty, including both differences between the models (e.g. the spread between counties) and differences between the complete model and the true outbreak dynamics. The process noise covariance matrix Q k is split accordingly as Q k = Q 1k + Q 0 . The model uncertainty is assumed to be additive, independent of k, and uncorrelated between the components. Therefore, Q 0 is diagonal and constant. The evaluation of the approximation error covariance Q 1k is more challenging. When the Markov chain model is sampled, the distributions of the elements of w 1k are given by sums of Poisson processes that are shifted to have zero mean, and with variance that depend on the populations in the different compartments. The matrix Q 1k is thus state-dependent and evaluated to (3). To avoid confusion with pure time-varying case, the explicit notation is utilized Let I D = [0, d] define a finite interval of discrete time instants corresponding to the measurements y k , k ∈ I D , and m ∈ I D be the point of initialization of the Markov chain model. An estimatex m|d of x k | k=m defined by model (1) is then sought from the output data y k , k ∈ I D . The problem at hand was approached using three different methods, which are presented next. The Rauch-Tung-Striebel (RTS) smoother (Rauch et al. (1965) ) is a recursive method for solving fixed-interval smoothing problems. It is proven to be an optimal smoother, when the noise sources are Gaussian and independent of the system states, and lacks theoretical justification in the present case. Even stability properties of the RTS smoother are not readily guaranteed. However, as the results of Section 4 demonstrate, it can nonetheless be used empirically. The stability concerns are not critical as the estimation is performed with a discrete LTI model and on a finite time interval. The RTS smoother is a two-pass algorithm consisting of a Kalman filter that is run for the full interval in a forward pass, followed by a backwards pass, when the state estimates are smoothed. The Kalman filter equations that are solved recursively from the initial conditionsx 0 and P 0|0 arex wherex k|k−1 andx k|k are the a priori and a posteriori state estimates, P k|k−1 and P k|k are the a priori and a posteriori estimate covariances,ỹ k is the innovation, S k is the innovation covariance, and K k is the Kalman gain. Notice that the Kalman filter requires knowledge of the covariance matrix Q k for 1 ≤ k ≤ d. In the present case, the covariance matrix is not available, since it depends on the unknown states of the system. Therefore, the plant state is replaced by its estimate, and the covariance matrix Q k is approximated aŝ The a priori and a posteriori state and covariance estimates at each time are are saved for the backwards pass. Then the algorithm proceeds backwards from the last time point d. The smoothed estimatex k|d is calculated recursively via the equationŝ where C k = P k|k F P −1 k+1|k and P k|d is the smoothed estimate covariance. The problem of estimatingx m|d can be approached as an optimization problem and solved once, rather than recursively. The simplest setup is based on the linear relation between the measurement and the state (i.e. backcasting) and leads to the algebraic system where the properties of the noisew k will be elaborated upon in Section 3.3. The state estimation problem is then formulated asx where Optimization problem (4) can be solved using standard techniques for linear least squares. Furthermore, positivity of the state estimation can be enforced by using constrained least squares. The basic method presented above can be potentially improved through weighting by taking into account the correlation of the error termsw k . To this end, let S Q = {Q k } d k=0 , and define the matrix Ω(S Q ) by specifying its elements as Then, Ω(S Q ) is the covariance matrix of The error termsw k are thus neither uncorrelated nor homoscedastic, so the Gauss-Markov theorem does not apply to the ordinary least squares formulation in (4). If the process noise covariance matrices were independent of the system states, the best linear unbiased estimator would be obtained by including the covariance in the formulation aŝ Since the process noise is state-dependent in our case, the state estimation problem cannot be approached directly. The matrix Ω(S Q ) will be instead estimated. For this purpose, introduce the setŜ Q (x m ) of approximated process noise covariance matrices aŝ A simplified version of the estimation problem can then be expressed aŝ Since Ω(Ŝ Q (x m )) depends on x m , this problem is nonlinear. In this work, its solution is sought iteratively by applying Algorithm 1. Solve For the parametrizations of F that appear in this work, the observability matrix Φ that is utilized in solving the least squares problems of state estimation becomes numerically infeasible to calculate if m is too large. The reason for this is that F has eigenvalues that are significantly smaller than one in magnitude, so that repeated inversions result in very large elements in Φ. To avoid this problem, the number of elements that was included in the optimization formulations was limited. For parametrizations, where one eigenvalue of F is very close to zero, the solution was to remove the corresponding state through truncation, thus treating the state as identical zero. The approximationx k = F k−m x m in the nonlinear least squares formulation can also pose problems, when k is significantly smaller than m. For this reason, a simple regularization was implemented, wherex k is set to zero whenever any element of F k−m x m becomes negative. The three estimation algorithms introduced above were evaluated using two types of synthetic data. First, linear model (1) was used to generate the data, with the same Poisson-distributed state dependent noise sources as derived for the estimators. Then, the data were generated from stochastic simulations of the Markov chain model. In both cases, the models were simulated repeatedly over a time horizon of 42 days (d = 42), from identical initial conditions (distinct between the two cases) and with identical parameter values (identical between the two cases), that were randomly selected from the prior parameter distributions. The probability distributions of the state estimation errors for m = 30 were estimated by fitting kernel distribution and compared to each other. The state estimation was performed with the three algorithms for 100 realizations. The process noise covariance was calculated with the diagonal elements of Q 0 set to 0.1, and R k = 0.1. Measurements for indices k < 19 were neglected in the batch optimization approaches. The estimated distributions for all model states are shown in Fig. 2. The RTS smoother appears to perform the best, mostly through lower uncertainty in the infected population estimate. The main difference between the linear and nonlinear least squares formulations is the significantly higher uncertainty in the cumulative incidence estimation for the linear method. This makes sense as the linear method does not exploit the low uncertainty of the measurement of this state, that is encoded in the covariance model. In this case, 50 realizations were generated and data from the three counties that were subject to spread of the disease in the highest number of realizations (33, 33 and 31 respectively) were analyzed. To capture the larger model discrepancy, the diagonal elements of Q 0 were set to 2 and R k = 0.5. As above, indices k < 19 were neglected in the batch optimizations. The estimated estimation error distributions for the states I, E and A in the three counties are shown in Fig. 3 -Fig. 5 . Similarly to the case considered in Section 4.1, the RTS smoother is generally better at estimating the infected population. It is hard to draw conclusions apart from this from the plots, as the characteristics of the distributions vary between the counties. To investigate the effect of the initial estimation on the complete model, this model was simulated using estimated states as initial conditions. The estimated states from the three estimation methods for one realization of the simulation of the complete model were chosen. These are summarized in Table 1 . The complete model was simulated 50 times from each of the three sets of initial conditions, for 42 days. The probability distributions of the logarithm of the infected, exposed and asymptomatic populations, in the three counties listed in Table 1 , were then estimated using kernel distribution fitting. The results are depicted in Fig. 6 -Fig. 8 . The main conclusion that can be drawn from these results is that the variations between the considered estimation algorithms have limited effect on the states of the system in the end of the simulation, compared to the variations due the stochastic simulation. A greater variance in the states can be observed for the initial conditions generated by the RTS smoother compared to the other, but no Fig. 6 . Probability distributions of model states for Stockholm county according to simulation from estimated initial conditions. general conclusion regarding the initialization methods can be drawn from this, as the results are based on a single estimation instance. Three approaches to a fixed interval smoothing problem with the purpose of initialization of a larger epidemiological model have been compared; one based on the Rauch-Tung-Striebel smoother and two batch optimization methods. The non-Gaussian state-dependent noise in the model implies that standard approaches could not be used directly, instead covariance estimates were used in two of the methods. The results indicate that the smoother performs better than the other methods, despite the lack of theoretical justification of the method. Simulations from estimated initial conditions indicate that the effect of minor estimation errors is limited compared to the variations inherit to the stochastic simulation of the Markov chain model. This suggests that computational complexity, robustness and ease of implementation might be of greater importance than high accuracy, when the initialization algorithm is chosen. Fig. 8 . Probability distributions of model states for Vstra Gtaland county according to simulation from estimated initial conditions. A primer on stochastic epidemic models: Formulation, numerical simulation, and analysis Smoothing Dynamic Systems with State-Dependent Covariance Matrices Bayesian epidemiological modeling over high-resolution network data Ny fas kräver nya instatser mot covid 19 Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy Modeling infectious diseases in humans and animals Maximum likelihood estimates of linear dynamic systems Maximum correntropy Rauch-Tung-Striebel smoother for nonlinear and non-gaussian systems SimInf: An R package for data-driven stochastic disease spread simulations Spatio-temporal modelling of verotoxigenic E. coli O157 in cattle in Sweden: Exploring options for control