key: cord-1003133-6zk16v7u
authors: Oliver, Matthieu; Georges, Didier; Prieur, Clémentine
title: Spatialized epidemiological forecasting applied to Covid-19 pandemic at departmental scale in France
date: 2022-04-20
journal: Syst Control Lett
DOI: 10.1016/j.sysconle.2022.105240
sha: 2dea5775dd7c80d7fc0028852e3988096a9fd54c
doc_id: 1003133
cord_uid: 6zk16v7u

In this paper, we present a spatialized extension of a SIR model that accounts for undetected infections and recoveries as well as the load on hospital services. The spatialized compartmental model we introduce is governed by a set of partial differential equations (PDEs) defined on a spatial domain with complex boundary. We propose to solve the set of PDEs defining our model by using a meshless numerical method based on a finite difference scheme in which the spatial operators are approximated by using radial basis functions. Such an approach is reputed as flexible for solving problems on complex domains. Then we calibrate our model on the French department of Isère during the first period of lockdown, using daily reports of hospital occupancy in France. Our methodology allows to simulate the spread of Covid-19 pandemic at a departmental level, and for each compartment. However, the simulation cost prevents from online short-term forecast. Therefore, we propose to rely on reduced order modeling to compute short-term forecasts of infection number. The strategy consists in learning a time-dependent reduced order model with few compartments from a collection of evaluations of our spatialized detailed model, varying initial conditions and parameter values. A set of reduced bases is learnt in an offline phase while the projection on each reduced basis and the selection of the best projection is performed online, allowing short-term forecast of the global number of infected individuals in the department. The original approach proposed in this paper is generic and could be adapted to model and simulate other dynamics described by a model with spatially distributed parameters of the type diffusion-reaction on complex domains. Also, the time-dependent model reduction techniques we introduced could be leveraged to compute control strategies related to such dynamics.

Over the past year, the spread of Covid-19 has had more serious consequences than expected. This pandemic has brought about massive organizational changes in many countries, with repeated lockdowns and the use of remote solutions for work, study and personal life. It has also had massive impacts on hospital care services with the need to expand hospital capacity and postpone operations for non-Covid-19 patients. Finally, the pandemic has killed several million people around the world (see, e.g., Pifarr i Arolas et al., 2021) . One of the most difficult specifics to model regarding the spread of Covid-19 is the amount of asymptomatic infected individuals, as described, for example, in Al-Tawfiq (2020), which makes the pandemic difficult to track and has led to massive testing campaigns to assess the evolution of the pandemic and the effectiveness of health measures. The efficiency of different strategies of lockdown and testing has been studied (see, e.g., Berger et al., 2020) . The incubation period of more than 10 days resulted in a lag between the implementation of restriction rules and their effect on hospital admissions, and made the effectiveness of restrictions more difficult to measure. Forecasting the number of infections and modeling hospital needs is critical to pandemic management and has been the subject of numerous articles over the past year.

Since the start of the epidemic, many studies have been carried out to model and forecast the spread of Covid-19 (see, e.g., Jordan et al., 2021 , for a review on optimization studies). The asymptomatic infected individuals were typically taken into account by building compartmental models that are more detailed than the usual susceptible-infected-recovered (SIR) model (see, e.g., Liu et al., 2021; Khajanchi and Sarkar, 2020; Berger et al., 2020; Roques et al., 2020; Charpentier et al., 2020, and references therein) . These works focus specifically on the effect of lockdown on the propagation of Covid-19 by taking into account detected and undectected infected individuals as well as hospitalizations. Forecasting the number of infections of Covid-19 has been studied in Liu et al. (2021) by using a 3-step approach on the transmission rate during a pandemic outbreak with linear, then exponential increase in infections and finally an exponential decrease of the transmission rate due to J o u r n a l P r e -p r o o f Journal Pre-proof major public interventions and social distancing measures.

In this paper, we focus on deterministic models which can be interpreted as limits for a large population size of stochastic models (see, e.g., Britton et al., 2019) . Most of deterministic compartmental models encountered in the literature on COVID-19 pandemic consist of systems of ordinary differential equations (ODEs). Then a common way to tackle spatial variation in such systems is to define regional compartments on homogeneous geographical subareas and to add coupling terms (see, e.g., Gatto et al., 2020; Giordano et al., 2020; Guan et al., 2020, and references therein) . However, the increase in the number of parameters to describe such coupled systems of ODEs increase rapidly with the number of sub-areas. Then, PDE-based compartmental models is a natural alternative to describe more finely spatial heterogeneity. There exist a few papers analyzing COVID-19 pandemic with such models. Let us mention Viguerie et al. (2020 Viguerie et al. ( , 2021 ; Guglielmi et al. (2022) based on diffusionreaction compartmental models, or Bertaglia et al. (2021) (see also Li et al., 2021) in which the authors couple a transport dynamic of the commuter population at large spatial scales, based on kinetic equations, with a diffusion model for non commuters at the urban scale.

In the present paper, our aim is to propose a spatially distributed compartmental model, accounting for spatial heterogeneity of the dynamics while including several features of the recent COVID-19 outbreak. We also propose to exploit meshless numerical scheme properties to simulate our model on a complex spatial domain. Finally we rely on very recent time-dependent model reduction techniques to lower short-term forecast complexity. More precisely, we propose a spatialized version of a detailed compartmental model inspired by Charpentier et al. (2020) , we then restrict the spatial domain to the French department of Isère and propose a calibration procedure over the period of the first French confinement, extending the whole methodology exposed in Guan et al. (2020) . This procedure can be adapted to any scale, provided that corresponding data is available. It allows to reproduce the macroscopical behaviour of Covid-19 pandemic on different geographical areas. A preliminary step to the calibration and a second contribution of this paper is the simulation of the model. Our model is defined as a set of spatio-temporal partial differential equations over a complex domain, namely the French department of Isre in our study. We propose a meshless method based on a finite difference scheme relying on radial basis function approximation of the spatial operators (the RBF-FD method, see Fornberg and Flyer (2015) ) to solve it, as such an approach is known to be flexible to solve problems on complex domains. To the best of our knowledge, it is the first time RBF-FD is applied in the framework of epidemiology to study the spatio-temporal spread of a disease. Our method-J o u r n a l P r e -p r o o f Journal Pre-proof ology allows to simulate the spread of Covid-19 pandemic at a departmental level, and for each compartment.

However the simulation cost prevents from online short-term forecast. Therefore we propose as a final contribution of the paper to rely on reduced order modeling tools introduced in Bakhta et al. (2021) to compute short-term forecasts of infection number. The strategy consists in learning a time-dependent reduced order model with few compartments from a collection of evaluations of our spatialized detailed model, by varying initial conditions, initial times and parameter values. A set of reduced bases is learnt in an offline phase while the projection on each reduced basis and the selection of the best projection is performed online, allowing short-term forecast of the global number of infections in the department.

Finally, we would like to stress that the original approach we introduce in this paper is generic and not limited to the analysis of COVID-19 dynamics. Indeed, it could be adapted to model and simulate other dynamics described by a model with spatially distributed parameters of the type diffusion-reaction on complex domains. Also, the time-dependent model reduction techniques we introduced could be leveraged to compute control strategies related to such dynamics.

The paper is organized as follows. In Section 2, we present an extension of the classical SIR epidemiological model that models the infections, recoveries and hospitalizations spatially over a given geographical area of interest. In Section 3, we present the meshless RBF-FD scheme for solving the set of differential equations from model proposed in Section 2. In Section 4, the model is calibrated using hospital data from Covid-19 outbreak in the department of Isère, France. In Section 5, we use model order reduction techniques to build a surrogate model and extrapolate the number of infected individuals on a 10-day horizon.

In this section, we present the usual susceptible-infected-recovered (SIR) compartmental model, then we add compartments to adapt to Covid-19 specificities. Finally, we propose an extension of this detailed model to account for the spatial spread of Covid-19 pandemic on a geographical setting, in our framework the Isère department in France.

model describes a population of constant size, where each individual can be either Susceptible S to a disease, Infected I with the disease or Recovered R from the disease. The transmission of the disease is ruled by the following equations:

where β is the transmission rate of the disease and γ is the recovery rate.

More detailed models such as the SIRHU D +/− model in Charpentier et al. (2020) can be better suited to model Covid-19 pandemic as they account for detected and undetected infected individuals I + and I − , detected and undetected recovered individuals R + and R − as well as hospitalized individuals H, individuals receiving intensive care U and deceased individuals D. This approach is particularly useful in the case of Covid-19 because undetected infections and hospital saturation both play a key role in the handling of the pandemic.

The flow diagram of this model is sketched out in Figure 1 . (2) The dynamics of each compartment is modelled by the following equations:

This model was calibrated in Guan et al. (2020) to fit the Covid-19 outbreak at the regional scale in France. In this paper, we aim to enrich this model to better account for spatial non-homogeneity of Covid-19 propagation. 

where ∇ denotes the spatial gradient operator. Note that each of the compartments S, I − , R − and R + now feature a diffusion term k(x, t) which models local displacement of individuals inside geographical area of interest. On the contrary, we assume that individuals from compartments I + , H, U and D are isolated and do not transmit the disease. It means that we neglect, e.g., the spread of the virus by individuals tested positive (I + ) during eventual transportation to hospital. The diffusion coefficient k(x, t), that may depend on the position x and vary in time due to changes in sanitary measures, is modeled as follows:

where ρ(x) is the local population density, β(t) is the transmission rate and a dif f is a positive constant. It seems indeed reasonable (see, e.g., arguments in the french mobility report Atif et al. (2020) ) to assume that local movement is more intensive in high density regions with low sanitary restrictions. As a first approximation, we assume that parameter a dif f is constant, independent 8 J o u r n a l P r e -p r o o f Journal Pre-proof from compartment S, I − , R − or R + . This parameter will be calibrated with other uncertain parameters (see Section 4). Parameters a AB are derived from the probability of moving from compartment A to B and the average duration in compartment A. These variables are more precisely described in Guan et al. (2020) . Parameter λ 1 (t) corresponds to the proportion of undetected infected I − that are being tested positive, hence moving to compartment I + . This models virological testing such as PCR tests. λ 2 (t) corresponds to serological testing of recovered people from R − that are moved to R+ when tested. These parameters are time-dependent because the testing policies and capabilities have changed during the pandemic. Moreover they can also be used for optimal control, because individuals from I + isolate themselves and do not spread the virus. In this work, we considered that the serological testing was negligible: λ 2 (t) = 0. For the calibration presented in Section 4, virological testing was set to λ 1 (t) = 0.01 because there was very few testing during the period on which we calibrate the model. However, in the forecasting part of the study that is described in Section 5, this parameter is varied over a wide range of values to replicate different phases of the testing policies and capabilities. In this model, we do not account for the loss of immunity (which could be done by adding a transition from the recovered compartment to the susceptible one), this is because our modeling is done at scale of days to a couple of months, at which scale the loss of immunity is neglected.

In this work, we used the radial basis function finite difference (RBF-FD) method from Fornberg and Flyer (2015) to solve the set of partial differential equations defined by (3) in Section 2. To the best of our knowledge, this is the first implementation of the RBF-FD approach to simulate an epidemiological model. Meshless approaches are particularly well-suited to our framework because the domain on which we solve PDE's is a geographical area that has a complex boundary. In the following, we take N d points denoted (x i ) 1≤i≤N d in the interior of the domain. We also take N b points denoted (x i ) N d +1≤i≤N d +N b over the boundary. In the following of this section, we focus on compartment of susceptible individuals S. Note that all the compartments involved in (3) are treated in a similar way.

Radial basis functions (RBF) are functions φ(x k , x l ) that depend on the distance r = x k − x l between x k and x l . We will use the set of points 9 J o u r n a l P r e -p r o o f

The set of points indexed by I n (x i ) defines the RBF-FD stencil of x i . A local approximation of S around x i can then be obtained from the radial basis functions φ k (x) = φ(x, x k ) f or k ∈ I n (x i ). In this paper, to compute the approximation S a of S, we used the inverse multi-quadratic functions:

Equation (4) below gives a local approximation of S around x i :

Similarly, we denote X a the approximation of X ∈ {I − , R − , R + } whose dynamic is governed by reaction-diffusion PDEs of system (3).

The approximation of the variables around stencil points allows us to locally discretize the diffusion operators that appears in the set of equations (3).

Using (4), we can approximate the diffusion operator for variable S appearing in (3) as follows:

The sum can be written in matrix form

with B(x i ) containing the operator values at points in the stencil of

K is a n×n symmetric matrix containing the values of the radial basis functions at each couple of points in the stencil of x i ,

J o u r n a l P r e -p r o o f Journal Pre-proof S i (t) is a vector containing the values of S a at each point of the stencil of x i :

The same diffusion operator approximations as given by (5) hold for the compartment variables X ∈ {I − , R − , R + } governed by reaction-diffusion PDEs of system (3).

Boundary conditions can be defined by operator B c , leading for variable S to:

where S b (x, t) is the boundary term that is imposed on point x. Applying the same method as in the previous section to the boundary points, we obtain for points (

where K(x i ) and S i (t) are defined by (6) and (7) respectively, and Φ(x i ) is the vector containing the values of the radial basis functions between x i and the n nearest neighbors of

. Similar boundary operator approximations as given by (9) hold for other compartment variables X ∈ {I − , R − , R + } governed by reaction-diffusion PDEs of system (3).

Reducing the equations to a system of ordinary differential equations (ODEs) We finally apply (5) to the diffusion terms in (3) which leads, for points (x i ) 1≤i≤N d in the interior of the domain, to:

where F α is the source term of compartment α in (3). Additionally, for points x i on the boundary, the boundary condition for each compartment variable 11

. This set of equations with the boundary conditions can be summarized as the following algebraic-differential equations:

where G is the vector of boundary conditions for the compartment variables at each boundary point:

Z 1 d and Z 1 b are the vectors of variables S, I − , R − , R + evaluated at each point of the interior and boundary of the domain respectively, Z 2 is the vector of variables I + , H, U , D evaluated at each of the points (interior and on the boundary) of the domain, and matrices D 1 , D 2 , D 3 , D 4 contain the coefficients of the discretized diffusion and boundary operators using (5) and (9).

When applying Dirichlet conditions as proposed in what follows, we have D 3 = 0 and D 4 = I d , leading to the following system of nonlinear ODEs:

3.4. Application to the epidemiological model The methodology described in the previous section allows us to solve the set of differential equations (3) on a domain given a set of collocation points in the interior of the domain and over its boundary. In our application we want to model the spread of Covid-19 in the department of Isère, France. However our methodology is generic and could be applied to any geographical area of interest, provided the relevant data are available.

For generating the collocation points in the interior of the domain, we used a two-dimensional Halton sequence (see, e.g., Lemieux, 2009 ) defined on a square and only selected points in the interior of the region, as shown in Figure 2 . We thus get a low discrepancy sequence of N d = 1000 points in the interior of the domain. Each collocation point was given the population density value of the closest city taken from INSEE (2014). A polygon describing the boundary of the department was obtained from Lexman (2019) then interpolated to obtain N b = 800 points along the boundary.

As shown in Figure 3 , the population density is very heterogeneous over the domain, with the south-east being very rural with densities under 100 hab/km 2 and the north-east and center of the department being a lot more urbanised. We chose to apply null boundary condition on the boundary points x i with density ρ(x i ) < d min = 200 hab/km 2 . On the rest of the boundary we assumed that the population of each compartment was equal to the average population of that compartment over the domain. More precisely, for

At t = 0, the proportion of individuals in each compartment should be defined in every point of the domain, which represents 8 × N d = 8000 parameters. These proportions are not known everywhere nor for each compartment 13 J o u r n a l P r e -p r o o f Journal Pre-proof and thus have to be calibrated. To reduce the number of parameters to be calibrated, the proportions of infected and hospitalized individuals are modelled by a uniform distribution over the domain. For i ≤ N d , we assume:

where i 0 /2 and h 0 /2 are the initial proportions of infected and hospitalized individuals in the domain. Then we used a numerical integrator based on backward differentiation formulas to solve the stiff system of ordinary differential equations expressed in (11). The time step was set equal to one day.

In this section, we describe the calibration procedure we propose for our spatialized detailed model. It extends the procedure in Guan et al. (2020) , based on hospital data from Santé Publique France (2021).

The calibration procedure we propose for the spatialized model described by (3) is initialized with the calibration results of its non-spatialized version performed in Guan et al. (2020) .

We use data from Santé Publique France (2021), namely the number of hospitalizations H obs , of intensive care hospitalizations U obs , the death toll D obs and the number of recoveries from hospitalization R hobs . Variables H, U and D are already involved in the model description (3) . Time evolution of the number of recoveries R h from hospitalization is described by:

Because of the weekly fluctuations of the data, a rolling average of 7 days is applied to the data.

Following Guan et al. (2020) , we calibrate our model over the first wave of Covid-19 in France which took place between March 17 th and May 11 th , as sanitary restrictions remained stable during this period. Moreover, we use hospital data which are reliable data, as opposed to test data that are subject to changes in testing capabilities and policies.

Our model depends on numerous uncertain parameters that are presented in this section. Because of the amount of uncertain parameters , we only calibrate a selection of them, fixing remaining parameters at the value calibrated in Guan et al. (2020) as detailed hereafter. Except parameters related to the transmission rate β(t), the parameters we calibrate in this section are assumed to be constant over time. These parameters are disease-related. These parameters will not be re-calibrated for forecast. Only those related to time dependent transmission rate will be re-calibrated in Section 5.

The transmission rate is one of the most sensitive parameters to tune the model (see, e.g., Da Veiga et al., 2021), hence its parametrization has to be chosen carefully. We used the one proposed in Liu et al. (2021) with an additional non-zero asymptotic value β ∞ :

Note that under this form, the transmission rate only depends on time. This assumption is motivated by the fact that sanitary measures were the same throughout the region of interest during the first wave of Covid-19 in France, which corresponds to the calibration period we have chosen. This parametrization of β(t) is well-suited for our calibration period: after time τ , the transmission rate decreases exponentially at speed µ from it's original value β 0 to an asymptotic value β ∞ . An example can be seen on 

The diffusion coefficient k(x, t) appearing in (3) accounts for the nonhomogeneity of the transmission of the disease. It depends on a single parameter a dif f to be calibrated.

The parameters a XY appearing in (3) depend on transition times and probabilities. The detailed formulas to obtain a XY are written in Guan et al. (2020) . The transition times and probabilities coming into play are:

-p a , p h and p u which denote the respective probabilities of needing no hospital care, hospitalization and intensive care, -p hu , p hd , p ud which denote the respective probabilities of needing intensive care or dying while hospitalized or in intensive care, -N a and N s which denote the recovery times of asymptomatic and lightly symptomatic individuals with no hospitalization, -N ih , N hd , N hu , N ud , N hr and N ur which denote the average transition times between compartments I ± , H, U and D.

This totals 14 constants to be calibrated to account for recoveries.

As detailed in Section 3.4.3, the initial conditions are reduced to the initial proportions i 0 /2 and h 0 /2 of I − and H. If the calibration was done in a later phase of the pandemic, we could add initial proportions of recovered individuals (R ± ) as well as intensive care occupation (U ).

In the above, we presented the variables to fit and the corresponding French hospital data, the set of parameter values for the initialization, the time period over which we perform the calibration and finally all the parameters to tune, including initial conditions. In this section, we detail the optimization procedure and we present the results of the calibration.

As we use hospital data aggregated on the department of Isère to fit our model, we first compute H sim , U sim , D sim and R hsim by aggregating on the J o u r n a l P r e -p r o o f Journal Pre-proof department of Isère simulated data obtained from our spatialized model. We then define the loss function to be minimized over the set of parameters p as:

with t max = 55 days, which is length of the calibration period, which corresponds to the first lockdown in France.

For minimizing the calibration loss function (13), we use a stochastic optimization Python package noisyopt proposed by Mayer (2017) . We also give the solver bounds for each parameter equal to [v 0 /5, 5v 0 ] where v 0 is the nominal value obtained from Section 4.1.

Only a subset of parameters is calibrated with the aforementioned procedure. Parameters p a , p h , p u , p hu , p hd , p ud , N a and N s are kept to the value obtained in the calibration of the non-spatialized version of our model performed in Guan et al. (2020) : p a = 0.9, p h = 0.9, p u = 0.2, p hu = 0.001, p hd = 0.176, p ud = 0.3, N a = 7.82 and N s = 15. This last point is commented in Section 4.5.4. The optimal parameters resulting from the data fitting as obtained on Figure 5 are given in Table 1 . 

The calibration was performed under lockdown condition thus the transmission rate β(t) was assumed exponentially decreasing. This assumption does not hold in the general case; the evolution of the transmission rate can follow more complex variations as will be shown in the forecasting of β that is carried out in Section 5. We also assumed that the probabilities of moving from one compartment to the other and the average duration in each compartment remained constant regardless of the sanitary measures, as being more diseaserelated parameters. Note however that this assumption may not hold when studying subsequent variants of Covid-19 and their propagation in a partially vaccinated population, hence these parameters should be included in the calibration procedure for studying this later phase of the pandemic. 

As shown in Section 4, our spatialized model is able to reproduce the outbreak of Covid-19 at the scale of the department while giving a very localized information on hospital needs and infection number. However this model is computationally expensive compared to a non-spatialized model, this is why we propose to rely on reduced order modeling to fasten the online computations required for short-term forecasts of the infection number. In this section, we extend the model reduction tools introduced in Bakhta et al. (2021) to the treatment of our spatialized model. In the following of this section, we explain how to build a reduced basis for extrapolating variables β * obs and γ * obs defined in (14). We then detail how it is possible to use the reduced basis to compute online the n f orecast -day forecasts of the global infection number in the department of Isère from an observation period of n calib = 10 days.

In this section, we project our spatialized 8-compartment model into a time-dependent SIR model described by (1). This is done by aggregating the spatial output of each compartment over the domain and regrouping separate compartments in either S, I or R as follows:

Starting from an initial epidemiological state (I(t = 0) = i 0 , R(t = 0) = r 0 ), the evolution of the SIR model is ruled by parameters β(t) and γ(t). As a consequence, forecasting the total number of infections and recoveries over the domain can be done by extrapolating β(t) and γ(t). Let S obs (t), I obs (t) and R obs (t) denote the evolution of S(t), I(t), R(t) observed over a time period. We then define:

Under mild assumptions that are detailed in Bakhta et al. (2021) , equations (14) allow a simple SIR modeling of the observed data starting from the initial state (I 0 obs , R 0 obs ). Note that in (14), one can replace observation data by simulation data from a model that outputs S, I and R over time, leading to the definition of β * sim and γ * sim that we will use for forecasting as detailed in Section 5.5.2. It is possible to infer I obs and R obs from the data provided by Santé Publique France (2021), applying an adjustment factor (see Bakhta et al. (2021) [Section 4.1] for more details): I obs = f H obs , R obs = f R hobs + D obs , with f = 15 computed from the literature (Mizumoto et al., 2020; Di Domenico et al., 2020) and assessing the proportion of individuals needing hospitalization when infected with Covid-19.

The evolution of β * obs is shown on Figure 6 , on which one observes the influence of lockdown periods and sanitary measures. Note that β * obs takes negative values around July, 2020. This reflects the very low tendency of hospital data in Isère during the summer of 2020. During that period, the SIR dynamics did not fit well the hospital data, leading to negative values for β * obs . However this only happened in a short interval of time during which the pandemic was stable.

J o u r n a l P r e -p r o o f Journal Pre-proof Figure 6 : Evolution of β * obs obtained from hospital data between March 2020 and July 2021. We can see that the dates of the sanitary measures match the change of variations of β * obs . At the very end of the profile we can see the start of the "delta wave" that occured during the redaction of this work.

In order to fasten the online computations we built a reduced order model. A key stone in the reduction of the model is the construction of a reduced basis from a set of evaluations of the detailed model. We built K different reduced bases (B k ) 1≤k≤K for extrapolating the short-term evolution of β obs and γ obs , by moving the initial time. We then selected the most suited reduced basis and used it for actual forecasting. Note that for the construction of the reduced bases, the detailed model described in (3) was slightly modified by considering that β and λ 1 do not depend on time. Then the model was run with different set of initial conditions and parameter values. The range of values we considered for initial conditions and for parameters β and λ 1 is provided in Table 2 . Even if the bounds for each parameter were chosen to be consistent with the sanitary condition at the start of the second wave of Covid-19 in France (September, 2020) , the idea is that the range of admissible values is chosen wide enough to match different calibration periods. Each parameter was sampled from a Halton sequence of length n runs = 500. The use of a Halton sequence allows to better explore the interval in which each parameter is defined. Other parameters were fixed from Table 1 in Section 4.5.3. Then, the detailed model was run with these n runs combinations of parameter values over a period of n eval = 200 days, producing a set of detailed outputs -the number of individuals in each of the 8 compartments over time -that are collapsed into SIR outputs using the following rules:

For each run, we computed a realization of (β * sim ,γ * sim ) using (14) on the SIR outputs. The set of realizations of β * sim and γ * sim are respectively denoted B and G in the following sections. The infection number over time is shown in Figure 7 for a selection of 50 sets of parameter values. In Sections 5.3 and 5.4 we detail the construction of the reduced order model from the set of realizations of (β * sim , γ * sim ). On most of the simulations in Figure 7 , we observe a wave growing over 50 days after the start of the simulation, as the bounds in Table 2 were chosen to be coherent with the sanitary conditions in France at the start of the second wave in France (September, 2020). However, recall that we widen the bounds in order to build a more flexible set of reduced bases, which explains why we observe other scenarios, such as simulations with a wave starting around 110 days after the starting time, or even simulations without any wave.

Even though Figure 8 was obtained from detailed model evaluations with β not depending on time, the projection to a simple SIR model of the output as described in (15) led to a set of realizations of time-dependent β * sim .

From the set of realizations of (β * sim , γ * sim ) shown in Figure 8 we built respective sets of functions B and G defined on n eval = 200 days for β and γ that can reproduce the typical variations of the detailed model. We used two different methods for building B and G: one based on Singular Value Decomposition and another one on a greedy algorithm. In the following, we detail the construction of B from B, the same steps were applied to construct G from G. (14) to the SIR trajectories outputted by the detailed model.

This approach consists in discretizing the functions in B at a 1-day timestep. We then obtain a n eval × n runs matrix denoted M β containing all the β * sim in B. We then compute eigenvalues and eigenvectors of the matrix A := M T β M β . The n eval positive and ordered eigenvalues of A are shown in Figure 9 . Then B SV D is composed with the n span eigenvectors with the largest eigenvalues. The basis we obtained for n span = 4 is shown on Figure 10 . We notice on Figure 9 that the magnitude of the 10 largest eigenvalues is significantly larger than the rest, which is why with this approach we never used more than 10 functions to build B SV D afterwards.

As can be seen on Figure 10 , the elements in B SV D are not constrained to remain positive. Section 5.3.2 tackles this issue by constructing in a greedy way a basis of positive functions.

The method is based on the algorithm called Enlarged Nonnegative Greedy (ENG) presented in Bakhta et al. (2021) . We first use a greedy algorithm to build a set A of nonnegative functions selected from B. This set is initialized as follows: a 0 = argmax a∈B ( a 2 ).

Then, while A has k ≤ n span elements (A = {a 1 , . . . , a k }), we add the function a new ∈ B that has the maximal distance from Span (A):

We thus obtain a set of n span nonnegative elements of B that will be used 24 J o u r n a l P r e -p r o o f Journal Pre-proof to build the set B EN G using the enlarge cone algorithm from Bakhta et al. (2021) :

Result: B EN G set of nonnegative representative functions (ψ i ) 1≤i≤nspan Initialization; basis of nonnegative functions A = (a i ) 1≤i≤nspan , tolerance > 0; for i ∈ {1, . . . , n span } do σ j i = 0 f or j < n span for k ∈ {1, . . . , n span } do

Note that, due to numerical integration errors, some of the curves in Figure   25 8 take a few values below zero. However in practice, applying Algorithm 1 to these functions, we obtained positive functions in B EN G as can be seen on Figure 11 . The same happened for G EN G (defined from G as B EN G from B). 5.4. Building a set of reduced bases for short-term forecasting by moving the initial time In this section, we detail the way B (obtained from SVD or ENG) was used to build a set of reduced bases for extrapolating β * obs .

Given a set B = (b i ) 1≤i≤nspan of functions defined on n eval days, we extract a set of reduced bases (B k ) 1≤k≤K by applying a sliding window to these functions. More precisely, we first choose a window step a of 5 days. Then, for 1 ≤ k ≤ K = E( n eval −(n calib +n f orecast ) a ) + 1, with E(x) the integer part of x, we denote B k the set of functions b k i corresponding to the restriction of b i to the window [(k − 1) × a, (k − 1) × a + n calib + n f orecast ], for 1 ≤ i ≤ n span .

In Section 5.4.1 we built a set of reduced bases (B k ) 1≤k≤K for extrapolating β * obs from the reduction of our detailed model evaluations. Additionally, we can add well-chosen functions to these reduced bases thanks to prior knowledge of the evolution of β * obs . As can be seen on Figure 6 , the variations of β * obs are mostly exponential. Starting at time t 0 and given a n calib -day evolution of β * obs , we can fit the following exponential function:

J o u r n a l P r e -p r o o f Journal Pre-proof to β * obs ([t 0 , t 0 + n calib ]) using least squares and add the function E f it ([t 0 , t 0 + n calib + n f orecast ]) to the reduced basis. This is illustrated in Figure 13 . In the following, this raw exponential extrapolation will be used as benchmark extrapolation of β * obs and γ * obs or in combination with the reduced bases obtained from B SV D and B EN G . 

We now have K potential candidates for the reduced basis of β * obs extrapolation. In this section, we present the way we selected the best reduced basis, and then how we used it to get a forecast on the infection number.

In order to compute a forecast on [t 0 + n calib + 1, t 0 + n calib + n f orecast ], we divided the calibration period in T 1 = [t 0 , t 0 +n 1 −1] and T 2 = [t 0 +n 1 , t 0 +n calib ]. Then for each reduced basis B k = (b k i ) 1≤i≤nspan , we defined the loss function:

By optimizing it, we computed a function β k calib (t) = nspan i=1 a * k i b k i (t) defined on [t 0 , t 0 + n calib + n f orecast ] fitting at best β * obs on T 1 . We then computed:

J o u r n a l P r e -p r o o f Journal Pre-proof and selected the function β k calib with the lowest evaluation of the loss L 2 k for extrapolating β * obs on [t 0 + n calib + 1, t 0 + n calib + n f orecast ].

By applying the methodology presented in Section 5.5.1 to the parameter β, we obtained a calibration-forecast function β calib as presented on Figure 14 . The methodology was applied in parallel to the parameter γ. From β calib and γ calib , we could run the time-dependent SIR model defined by (1) with initial conditions (I obs (t 0 ), R obs (t 0 )) and parameters β = β calib , γ = γ calib . The model output on the infection number is shown on Figure 15 . 

To evaluate the error of our models, we generated 10-day forecasts every 5 days between March 2020 and May 2021. In this section and in the following, the forecast error for a calibration starting at t 0 is:

We compared the SVD and Greedy algorithms with and without the addition of an exponential extrapolation of β * obs , the number of basis elements was set to n span = 5. The average error on the full timeframe are shown on Figure 16 . The ENG and SVD algorithms performed very similarly, and the exponential enrichment of the reduced bases did not improve significantly the forecasting Figure 15 : Forecast of the infection number in Isère department using the SVD reduced basis with n span = 4. Figure 16 : 10-day forecast error for the greedy algorithm (EN G), the enriched greedy algorithm (EN G exp ), the SVD algorithm (SV D), the enriched SVD algorithm (SV D exp ) and the exponential benchmark algorithm (exp). We can see that the enrichment does not improve significantly the forecasting score, and that the SVD and ENG algorithms perform very similarly.

score. Therefore in the following we computed 10 day forecasts with the SVD algorithm, as it is much faster. Also, in the following sections, we did not use the exponential enrichment.

In this section, we chose to focus on the SVD algorithm that we evaluated on 10-day forecasts. The choice of n span is crucial because keeping fewer basis functions may not be sufficient to capture the complexity of the problem while taking a too large span of basis functions could lead to overfitting over the calibration and deteriorating the forecast accuracy of the reduced basis. We computed the forecast error on the infection number during a full year of 10day forecasts for models built with 1 to 8 SVD basis functions in the reduced bases for both β and γ. The average output of these 8 models was also used as a forecast model, the so-called aggregated model, denoted by Agg hereafter. The error of each of these 9 models was computed using the forecast error introduced in Section 5.6. The results are shown on Figure 17 for the SVD algorithm. We can see that our aggregated model performed better than any individual model, and that this model reached a 5.5% error on 10 day forecasts. The visualization of some individual model predictions can be seen on Figure  19 . Moreover, the aggregated model seems more robust than the exponential extrapolation, preventing from very large errors as can be seen, e.g., on Figure  18 .

To evaluate the efficiency of the reduced modeling approach for forecasting, we compared its cost with the one consisting in forecasting the infection number directly from the detailed model. All the computation times mentioned below were obtained on a regular laptop. In the reduced basis approach, the offline phase consists of n runs = 500 evaluations of the detailed model over n days = 200 days. The detailed model runs in τ = 0.15 s.day −1 , which totals C of f = n runs n days τ = 15000 seconds. The online phase has a cost of C on = 26 seconds, which is the time for projecting β * obs and γ * obs onto the reduced bases (B k ) 1≤k≤K and (G k ) 1≤k≤K , as described in 5.5.1. The cost of running the reduced model itself is negligible because it is run only once, with the optimal extrapolations of β and γ. Hence, the total cost of the reduced model approach for κ forecasts is κ C on + C of f .

In the direct approach, the detailed model is evaluated over n calib +n f orecast = 20 days and the optimization of the 9 parameters presented in Section 5.2 required n eval = 600 evaluations, totalling a direct time of C = (n calib + n f orecast ) × n eval × τ = 1800 seconds per forecast. For κ forecasts, the gain is G(κ) = κ C C of f +κ Con . With κ ≈ 10 forecasts, the cost of the offline phase is compensated by the gain in the online phase.

This study has shown that extrapolating β * obs and γ * obs using the exponential function as presented in 5.4.2 is a reasonable solution for forecasting the infection number at a 10-day timescale. However our approach led to a slightly better forecasting score with a model that is more robust, hence more reliable. We would like to point out that the forecasting score has an underlying error due to the changes in dynamics of the epidemic with sanitary measures that are taken. In fact, the model has large overestimations of the infection number every time a new measure is taken, as can be seen on the bottom plot of Figure 18 . As a consequence, the forecasting error could not be reduced to a very low value without the introduction of the sanitary measures in the model. The model would perform better with no changes in the dynamics, however this model is still very useful because it gives the evolution of the pandemic if no measures were taken, which can help to measure the efficiency of the restriction rules. Figure 18 : Comparison of forecasts obtained with the aggregated and exponential models at periods where the dynamics of Covid-19 changed quickly. We can see that the exponential extrapolation of β * obs is more prone to exploding forecasts while the aggregated model has a more robust behaviour. Additionnaly, the third plot shows an example of a change in the dynamics that creates a large error as the forecasts can not anticipate a brutal change, e.g., in sanitary restrictions. 

In this paper, a spatialized extension of a detailed multi-compartmental epidemiological model has been proposed, allowing to reproduce the evolution of hospital needs at the scale of the department of Isère in France during the Covid-19 outbreak, while giving a detailed information about the geographical heterogeneity of the sanitary conditions. The partial differential equations that define the model were solved with a very efficient meshless scheme on a very irregular domain. From this model we built a reduced basis for extrapolating the transmission rate β and recovery rate γ obtained from hospital data and we used them in a surrogate model to output forecasts of the global infection number in the department of Isère. A simple exponential extrapolation proved to be efficient for the extrapolation of the transmission and recovery rates but the aggregation of surrogate models using different reduced bases gave a more robust forecast. The work presented in the present paper could be used to evaluate the efficiency of restriction rules that are taken by comparing the forecasts as a reference for the evolution of the pandemic with the actual evolution of the infection number when restriction rules are taken. Also, testing policies could be evaluated by controlling parameters λ 1 and λ 2 and by evaluating the dynamics of the pandemic while more people are being isolated after being tested. Note that the spatialized detailed model we introduced could be completed to account, e.g., for vaccination or emergence of a variant. We could add compartments for the vaccinated population by calibrating their own probability of transmission and severity of symptoms. In particular, this could allow to compare different vaccination policies. The spatialized model could take into account age categories using contact, testing and vaccination rates as well as symptom severity in each age category. This could be solved with the same approach, but would benefit from more precise hospital data involving the age of patients for calibration.

Asymptomatic coronavirus infection: MERS-CoV and SARS-CoV-2 (COVID-19)

Years of life lost to COVID-19 in 81 countries

Report 1: Feedback on mobility during the COVID-19 epidemic

Epidemiological forecasting with model reduction of compartmental models. application to the covid-19 pandemic

An SEIR Infectious Disease Model with Testing and Conditional Quarantine

Spatial spread of COVID-19 outbreak in Italy using multiscale kinetic transport equations with uncertainty

Mathematical Models in Population Biology and Epidemiology

Stochastic epidemic models with inference

COVID-19 pandemic control: balancing detection policy and lockdown intervention under ICU sustainability

Basics and Trends in Sensitivity Analysis: Theory and Practice in R

Impact of lockdown on COVID-19 epidemic inÎle-de-France and possible exit strategies

Solving PDEs with radial basis functions

Spread and dynamics of the COVID-19 epidemic in Italy: Effects of emergency containment measures

Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy

Transport effect of COVID-19 pandemic in France

Delay differential equations for the spatially resolved simulation of epidemics with specific application to COVID19

Population des villes de France (Population of cities in France

Optimization in the context of COVID-19 prediction and control: A literature review

Forecasting the daily and cumulative number of cases for the COVID-19 pandemic in India. Chaos: An interdisciplinary journal of nonlinear science 30

Quasi-monte carlo constructions, in: Monte Carlo and Quasi-Monte Carlo Sampling

Carte des départements (map of the departments

Prediction of Epidemics Dynamics on Networks with Partial Differential Equations: A Case Study for COVID-19 in China

Predicting the number of reported and unreported cases for the COVID-19 epidemics in China

Noisyopt: A python library for optimizing noisy functions

Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship

Impact of lockdown on the epidemic dynamics of COVID-19 in France

Données hospitalières relatives a l'épidémie de COVID-19 (Hospital data related to the COVID-19 epidemic)

Simulating the spread of COVID-19 via a spatially-resolved susceptible-exposed-infected-recovereddeceased (SEIRD) model with heterogeneous diffusion

Diffusion-reaction compartmental models formulated in a continuum mechanics framework: application to COVID-19, mathematical analysis, and numerical study

Formal analysis; Software; Validation; Data curation; Co-Writing Didier Georges and Clementine Prieur: Conceptualization; Methodology; Investigation

Methodology; Supervision; Validation; Co-Writing, Reviewing and Editing 1

[Instructions: Please check all applicable boxes and provide additional information as requested.]

Potential conflict of interest exists:We wish to draw the attention of the Editor to the following facts, which may be considered as potential conflicts of interest, and to significant financial contributions to this work:The nature of potential conflict of interest is described below:X No conflict of interest exists.We wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome.

Funding was received for this work.All of the sources of funding for the work described in this publication are acknowledged below:X No funding was received for this work.

X We confirm that we have given due consideration to the protection of intellectual property associated with this work and that there are no impediments to publication, including the timing of publication, with respect to intellectual property. In so doing we confirm that we have followed the regulations of our institutions concerning intellectual property.