key: cord-288079-rr8h5dgy
authors: Prague, Melanie; Wittkop, Linda; Clairon, Quentin; Dutartre, Dan; Thiebaut, Rodolphe; Hejblum, Boris Pierre
title: Population modeling of early COVID-19 epidemic dynamics in French regions and estimation of the lockdown impact on infection rate
date: 2020-04-24
journal: nan
DOI: 10.1101/2020.04.21.20073536
sha: 
doc_id: 288079
cord_uid: rr8h5dgy

We propose a population approach to model the beginning of the French COVID-19 epidemic at the regional level. We rely on an extended Susceptible-Exposed-Infectious-Recovered (SEIR) mechanistic model, a simplified representation of the average epidemic process. Combining several French public datasets on the early dynamics of the epidemic, we estimate region-specific key parameters conditionally on this mechanistic model through Stochastic Approximation Expectation Maximization (SAEM) optimization using Monolix software. We thus estimate basic reproductive numbers by region before isolation (between 2.4 and 3.1), the percentage of infected people over time (between 2.0 and 5.9% as of May 11th, 2020) and the impact of nationwide household confinement on the infection rate (decreasing the transmission rate by 72% toward a Re ranging from 0.7 to 0.9). We conclude that a lifting of the lockdown should be accompanied by further interventions to avoid an epidemic rebound.

In December 2019, grouped pneumonia cases have been described in the Hubei province, China and SARS-CoV2 was identified on January, 7 th as the cause of this outbreak (Li et al., 2020a; Zhu et al., 2020) . SARS-CoV2 causes the viral disease which has been named COVID-19 (World Health Organization, 2020b) .

SARS-CoV2 rapidly spread all over the world and the pandemic stage was declared on March 11 th by the World Health Organization (2020c). On April 28 th , over 1,773,084 cases (in accordance with the applied case definitions and testing strategies in the affected countries) including 111,652 deaths have been reported (World Health Organization, 2020a) .

The first case in France was declared on January, 24 th (Bernard-Stoecklin et al., 2020) and on April 13 th , Santé Publique France reported 98,076 confirmed cases and 14,967 hospital deaths due to COVID-19 includes non-specific symptoms such as fever, cough, headache, and specific symptoms such as loss of smell and taste (Gane et al., 2020; Greenhalgh et al., 2020) . The virus is transmitted through droplets and close unprotected contact with infected cases. The majority (around 80 %) of infected cases have a mild form (upper respiratory infection symptoms) without specific needs in terms of care. Around 20 % of cases need hospitalization and among those are severe forms (severe respiratory distress) which will need to be admitted to intensive care units (ICU) with potential need of mechanical ventilation. The percentage of patients in need for ICU care varies between 5 % reported from China (Guan et al., 2020) and 16 % reported from Italy (Grasselli et al., 2020) . The number of ICU beds in France was 5,058 at the end of 2018 (DREES, 2019) (although it is currently being increased, having doubled and aiming to reach 14,000 according to the French minister of Health). Thus, the availability of ICU beds with mechanical ventilation is one of the major issues as facilities are not prepared to deal with the potential increase of the number of patients due to this epidemic.

Unprecedented public-health interventions have been taken all over the world (Kraemer et al., 2020) to tackle this epidemic. In France, interventions such as heightening surveillance with rapid identification of cases, isolation, contact tracing, and follow-up of potential contacts were initially implemented. But as the epidemic continued growing, comprehensive physical distancing measures have been applied since March 15 th , 2020 including closing of restaurants, non-vital business, schools and universities etc, quickly followed by state-wide lockdown on March 17 th 2020. The president has announced on April 13 th 2020, a progressive lifting of the lockdown from May 11 th 2020 onwards. In Wuhan (Hubei, China), the extremely comprehensive physical distancing measures in place since January 23 rd have started to be relaxed after 2 months of quarantine and lifted completely on April 8 th 2020 (Tian et al., 2020; Wu and McGoogan, 2020) .

Interestingly, these interventions have been informed by mathematical models used to estimate the epidemic key parameters as well as unmeasured compartments such as the number of infected people. Another interesting outcome is the forecast of the COVID-19 epidemic according to potential interventions.

Several models have already been proposed to model and forecast the COVID-19 epidemic using compartment models (Fang et al., 2020; Tang et al., 2020; or agent based models (Di Domenico et al., 2020a; Ferguson et al., 2020; Wilder et al., 2020) , its potential impact on intensive care systems (Fox et al., 2020; Massonnaud et al., 2020) , and to estimate the effect of containment measurements on the dynamics of the epidemic (Magal and Webb, 2020; Prem et al., 2020) . Most of those rely on simulations with fixed parameters and do not perform direct statistical estimations from incident data (Massonnaud et al., 2020) . Roques et al. (2020) used French national data but did not use a population approach to model the epidemic at a finer geographical granularity. Yet, the dynamics of the epidemic can be very heterogeneous between regions inside a given country resulting in tremendous differences in terms of needs for hospital and ICU beds (Massonnaud et al., 2020) . Moreover, the data collection yields noisy observations, that we deal with a statistical modeling of the observation process, rather than altering the data by e.g. smoothing such as in Roques et al. (2020) .

In the present study, we use public data from the COVID-19 outbreak in France to estimate the dynamics of the COVID-19 epidemic in France at the regional level. We model the epidemic with a SEIRAH model, which is an extended Susceptible-Exposed-Infectious-Recovered (SEIR) model accounting for time-varying population movements, non-reported infectious subjects (A for unascertained) and hospitalized subjects (H) as proposed by to model the epidemic in Wuhan. Parameters from the model are estimated at the regional scale using a population approach which allows for borrowing information across regions, increasing the amount of data and thereby strengthening the inference while allowing for local disparities in the epidemic dynamics. Furthermore, we use forward simulations to predict the effect of non-pharmaceutical interventions (NPI) (such as lift of lockdown) on ICU bed availability and on the evolution of the epidemic. Section 2 introduces the data, the model and the necessary statistical tools, Section 3 presents our results and Section 4 discusses our findings and their limits.

Because epidemics spread through direct contacts, their dynamics have a strong spatial component. While traditional compartment models do not account for spatiality, we propose to take it into account by: i) modeling the epidemic at a finer, more homogeneous geographical scale (this is particularly important once lockdown is in place); ii) by using a population approach with random effects across French regions which allows each region to have relatively different dynamics while taking all information into account for the estimation of model parameters iii) aligning the initial starting time of the epidemic for all regions. The starting date in each region was defined as the first date with incident confirmed cases of COVID-19 directly followed by 3 additional consecutive days with incident confirmed cases as well. This criterion of 4 consecutive days with incident cases is needed in particular for the Île-de-France region which had 3 consecutive days with 1 imported confirmed case in late January which did not lead to a spreading outbreak at that time.

Open-data regarding the French COVID-19 epidemic is currently scarce, as the epidemic is still unfolding. Santé Publique France (SPF) in coordination with the French regional health agencies (Agences Régionales de Santé -ARS) has been reporting a number of aggregated statistics at various geographical resolutions since the beginning of the epidemic. During the first weeks of the epidemic in France, SPF was reporting the cumulative number of confirmed COVID-19 cases with a positive PCR test. Other French surveillance resources such as the Réseau Sentinelles (Valleron et al., 1986) or the SurSaUD R database (Caserio-Schönemann et al., 2014) quickly shifted their focus towards COVID-19, leveraging existing tools to monitor the ongoing epidemic in real time, making as much data available as possible (given privacy concerns). In this study, we combined data from three different opendata sources: i) the daily release from SPF; ii) the SurSaUD R database that started recording visits to the Emergency room for suspicion of COVID-19 on February, 24 th ; iii) the Réseau Sentinelles which started estimating the weekly incidence of COVID-19 in each French region on March 16 th . From the daily release of SPF, we computed the daily incident number of confirmed COVID-19 cases (i.e. with a positive PCR test) in each region. In 4 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 24, 2020. . addition we used the incident number of visits to the emergency room for suspicion of COVID-19 in each region from the SurSaUD R database using the OSCOUR R network that encompasses more than 86% of all French emergency services (Caserio-Schönemann et al., 2014) . Although this does not represent the full extend of hospitalized COVID-19 cases, it is the only public data available that early in the epidemic, when the majority of COVID-19 cases at hospitals were admitted through emergency rooms. Finally, we used the Réseau Sentinelles network's weekly incidence estimates of symptomatic cases (including non confirmed cases) to set the ratio between ascertained and unascertained cases in each region (later denoted as r i ). Table 1 presents these observed data. Of note, we studied the epidemic in the 12 Metropolitan French regions -excluding the Corsican region (Corse) which exhibits different epidemic dynamics, possibly due to its insular nature.

Wang et al. (2020) extended the classic SEIR model to differentiate between different statuses for infected individuals: ascertained cases, unascertained cases and cases quarantined by hospitalization. The model, assuming no population movement, is presented in Figure 1 . The population is divided into 6 compartments: susceptible S, latent E, ascertained infectious I, unascertained infectious A, hospitalized infectious H, and removed R (recovered and deceased). This model assumes that infections are well-mixed throughout the population, ignoring any spatial structure or compartmentalization by population descriptors such as age. Such assumptions make it particularly relevant to infer the dynamics of the French epidemic at the regional level (a finer geographical scale at which such hypotheses are more likely to hold). Figure 1 illustrates the dynamics between those 6 compartments that are characterized by the following system of six Ordinary Differential Equations (ODE):

CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

Model parameters are described in Table 2 . Of note, given a combination of parameters and initial states of the system ξ =

, using a solver of differential equations, it is possible to deterministically compute at any time t the quantities S(t, ξ), E(t, ξ), I(t, ξ), R(t, ξ), A(t, ξ), and H(t, ξ). 6 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. Observation processes In our case, none of the compartments of the system are directly observed: the only observations considered are i) the number of daily incident infectious ascertained cases denoted Y 1 , and ii) the number of daily incident hospitalized infectious cases denoted Y 2 . These observations are the only one available both before and after the initiation of lockdown. Those two quantities are modeled in Equation (1) respectively as observations from the I (in) (t, ξ) = rE(t,ξ)

De and H (in) (t, ξ) = I(t,ξ) Dq random variables, which are the numbers of new incident cases at time t given the parameters ξ in compartment I and H respectively. Because these are count processes, we propose to model their observations Y 1 and Y 2 with Poisson likelihoods:

where k 1 and k 2 are the respective numbers of cases.

8 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. .

The initial states of all compartments at the date of epidemic start (t = 0) for region i are also important drivers of the dynamics. Some of them can be approximated by quantities directly depending on the observations: i)

, and iii) R i (t = 0) = 0. Others, namely compartments A and E are not directly observed, and we evaluate these initial quantities. Due to variation in data collection protocols and the initial size of regional outbreaks, this estimation is particularly important. Indeed, the number of daily incident cases at t = 0 ranges from 1 to 37 cases depending on the region. A i (t = 0) is set as 

The goal of this study is to model the epidemic of COVID-19 in France, but at the regional level using a population approach. This is done using a mixed effect model. In this inference framework, baseline parameters governing the dynamics of the epidemic in each region are assumed to be drawn from a shared distribution which allows for heterogeneity between regions, known as the random effects. We use the log-normal distribution for all parameters to ensure their positivity during estimation. Because public health policies changed over the time period of observation of the epidemic, we incorporate explanatory covariates such as physical distancing by lockdown (C 1 and C 2 ) as a time-dependent effect on the transmission of the disease b. Covariate C 1 is 0 until 2020-03-17 date of the start of the policy in France and then set to 1. Covariate C 2 is 0 until 2020-03-25 assuming that social distancing behaviours build up in a week. In other words, we have ∀i = 1, . . . , 12 (where i is the region identifier):

The parameters (b 0 , D q 0 , E(t = 0)) are mean shared values in the population, and can be seen as the country values for these parameters. The inter-region random-effect (u b i , u Dq i , u E 0 i ) are normally distributed and assumed independent. So the vector of parameters in the model for each region is

CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. .

which is the factor by which transmission is modified after the start of lockdown during the first week. The factor by which transmission is modified after that first week of confinement is given by K 2 = exp(−β 1 − β 2 ). The coefficients β = (β 1 , β 2 ) are expected to be negative as lockdown aims at reducing transmission. Interestingly, with our approach, we can evaluate whether or not there is a statistically significant effect of lockdown on the transmission by testing β = 0 using a Wald test.

Region-specific model parameters Based on the results from the theoretical identifiability analysis of the structural model of the epidemic from Equation (1) (see Appendix A2), we estimate the parameters (b i , D q i , r i ) as well as the initial state (E 0i ) when the epidemic begins being reported in each region i. We used the Monolix software version 2019R2 (Lixoft SAS, 2019) to estimate those five parameters by maximizing the likelihood of the data given the model and the other fixed parameters. This software relies on a frequentist version of the Stochastic Approximation Expectation Maximization (SAEM) algorithm (Delyon et al., 1999) and standard errors are calculated via estimation of the Fisher Information Matrix (Kuhn and Lavielle, 2005) , which is derived from the second derivative of the log-likelihood evaluated by importance sampling. In addition, we use profile-likelihood to confirm that no further information can be gained from the data at hand on parameters α, D e and D I by running the SAEM algorithm multiple times while setting these parameters to different values and obtaining similar maximum likelihood values (meaning more data would be needed to be able to estimate those parameters). During inference, practical identifiability of the model is evaluated by the ratio of the minimum and maximum eigenvalues of the Fisher Information Matrix, that will be referred as "convergence ratio" in the reminder of the manuscript. Convergence of the SAEM algorithm was assessed by running multiple SAEM chains and checking that they all mix around similar probability distributions.

We are particularly interested in the trajectories of the model compartments. We use Monte Carlo methods (parametric bootstrapping) to compute the confidence intervals accounting for the uncertainty in estimating the structural and statistical model parameters. For 10 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. . all compartments C(t, ξ i ) (C being S, E, I, R, A, or H) the 95% confidence interval is estimated by sampling from the posterior distributions of the model parameters to simulate 1,000 trajectories, and taking the 2.5% and 97.5% percentiles of these simulated trajectories. We also added to it the error measurement given by the Poisson distribution of the outcomes. Other outcomes of interest are the number of ICU beds needed and the number of death (D) in a given region at a given time. These quantities are not specifically modeled by our mathematical structural model. However, it is possible to roughly approximate them by assuming that they represent a percentage of the hospitalized cases H(t, ξ i ) and removed cases R(t, ξ i ). We assume that ICU (t, ξ i ) = 0.25 × H(t, ξ i ) which is consistent with the prevalence of ICU cases among hospitalized cases at the French national level. Based on the estimation of the Infection Fatality Ratio (IFR) from Roques et al. (2020), we get a rough estimation of D(t, ξ i ) as 0.5% of R(t, ξ i ). Roques et al. (2020) conclude that COVID-19 fatalities are under-reported, and using their IFR estimate we adequately fit the trend of the observed COVID-19 deaths but with an offset due to this assumed higher IFR, see Appendix A1.

Model update Furthermore β estimations can be easily updated as new data become available using parametric empirical Bayes (thus avoiding the need to re-estimate the whole system). It consists in maximizing the likelihood again with respect to β while holding the other parameter distribution fixed to their previously inferred a posteriori distribution. This is how our results are updated with data after March 25 th 2020 in this work.

Effective reproductive number For each region, we compute the effective reproductive number R e (t, ξ i ) as a function of model parameters:

When individuals are homogeneous and mix uniformly, R e (t, ξ i ) is defined as the mean number of infections generated during the infectious period of a single infectious case in the region i at time t. This is the key parameter targeted by NPIs. We compute analytically its 95% confidence interval by accounting for all 95% confidence interval [X min ; X max ] of parameters and 11 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. . trajectories X used in its definition such that:

Asymptomatic proportion At a given time t the number of incident unascertained cases is equal to the sum of two populations, the number of incident non-tested symptomatic individuals (NT) and the number of incident non-tested asymptomatic individuals (AS):

where r is the proportion of cases tested positive. Collection of data from general practitioners through the re-purposing of the Réseau Sentinelles network to monitor COVID-19 provides a weekly estimation of the number of incident symptomatic cases (tested or not tested) that we previously called r s . This quantity is given over a week but can be evaluated daily by averaging:

where r s represents the proportion of infected cases seeing a general practitioner. Combining equations (6) and (7) allows to compute the incident number of asymptomatic cases as a function of the compartment E:

Short-term predictions of attack rates We predict the proportion of infected individuals among the population in each region at a given date by computing:

12 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. .

Given the values of parameters ξ i , we predict the trajectories of the dynamical system compartments using the dlsode differential equation solver in R (Soetaert et al., 2020) and we can investigate the impact of NPIs such as lockdown in various scenarios. This impact is driven by two parameters:

• K 2 (K 1 is considered fixed), the decrease ratio of transmission rate of the disease following the first week of NPI, defined as

This directly translates into a decrease of the effective reproductive number according to Equation (4). It reflects the fact that individual by getting confined decrease their number of contacts. Of note, the current K 2 is estimated by exp(− β 1 − β 2 ) (see Section 2.2.3).

• τ , the duration (in days) of the lockdown during which the mixing and transmission are fixed to

We evaluate the magnitude of the possible epidemic rebound after confinement according to several values for K 2 . In particular, we predict the rates E(t, ξ i )/N i , A(t, ξ i )/N i and I(t, ξ i )/N i on May 11 th 2020 (currently considered by French authorities as the possible start date for lifting lockdown in France). We also compute the optimal lockdown duration τ opt i needed to achieve the epidemic extinction in region i defined as E i (t, ξ i ) < 1 and A i (t, ξ i ) < 1 and I i (t, ξ i ) < 1 simultaneously. In each scenario we predict the date at which the ICU capacities in each region would be overloaded, this date is given by:

η i denoting the ICU capacities limits in region i. We additionally predict how many more ICU beds would be needed at the peak of hospitalization in each scenario, as a proportion of current ICU capacity (DREES, 2019). Finally we provide a rough prediction of the number of deaths for each envisioned scenario assuming a confinement duration of τ opt i days.

Data fitting Because of the lag inherent to diagnostic testing, we also estimated the number of people already infected at this epidemic start by E 0 (notably, the largest numbers of E 0 in Table 3 are estimated for Île-de-France and Grand Est the two most affected French regions in this early epidemic).

. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. .

On March 25 th , the cumulative number of ascertained cases was 24,623 and the cumulative number of hospitalized cases was 13,388, see Table 1 for a regional breakdown. Our SEIRAH model fits the data well as can be seen in Figure 2 . Moreover, the stability of the estimates is good with a convergence ratio of 1.6 (see Section 2.3.1), corroborating the good identifiability of the estimated parameters (see Appendix A2). Table 3 provides the regional estimates of the transmission rates (b i ). Of note, regions with higher transmission rate are not necessarily those known to have the highest number of incident ascertained cases. D q i , the number of days from illness onset to hospitalization, can be quite variable between regions and likely accounts for heterogeneity in the observed data. We estimates its population mean at D q 0 = 1.13 days with a standard deviation σ Dq = 0.42.

To evaluate the validity of our structural ODE model (1) and of our inference results, we compare the aggregated predictions of the number of both incident ascertained cases and incident hospitalized cases at the national French level to the daily observed incidences (that are still publicly available from SPF at the country level, even after March 25 th -while incident ascertained cases are not openly availilable at the regional level after March 25 th ). Figure 3 displays both predictions and observations, illustrating the added value of incorporating data after March 25 th as those encompass new information about the epidemic dynamics and characteristics as we approach the peak in most regions (notably Grand Est and Île de France). Of note, worse fit of the observed hospitalizations by our model can be explained by the data discrepancy: while we use the SurSaUD R data for our inference (which only account for patients arriving through the emergency room), Figure 3 displays the data from Santé Publique France which should contain all COVID-19 hospitalizations in France (including hospitalizations not coming from the emergency room, as well as Corsica and the French Departements d'Outre Mer that are not taken into account in our model).

14 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. . is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. Evolution of the epidemic without intervention It is also interesting to predict the percentage of infected individuals in each region at future dates, corresponding to attack rates. 

Change of transmission rate during lockdown The parameter β 1 and β 2 measure the effect of the lockdown before and after a week of adjustment. Both are significantly different from 0 (p<0.001) such that the lockdown reduced the transmission rate of COVID-19 by a divisive factor K 1 estimated at 1.31 [1.27; 1.35] during the first week and K 2 estimated at 3.63 [3.48; 3.80] after this first week. Of note, thanks to our update algorithm (see Section 2.3.1), it is possible to update those results rapidly as soon as more data are available to inform which scenario of prediction described in Section 3.3 is the most likely.

Effective reproductive number The above quantities directly impact the effective reproductive number as described in Figure 4 displays the effective reproductive number trajectories in each region.

In Tables 5 and 6, we vary K 2 = 3, 5, 10 (the magnitude of the reduction of transmission during lockdown after the first week). This gradient of simulation is important because the actual French value of K 2 remains currently unknown. In Section 3.2 we showed that we can estimate a lower bound for K 2 ≥ K 2 = 3.48. From Table 5 , we show that the higher K 2 is, the lowest 17 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. . 

. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. Table 4 : Estimation of the effective reproductive ratios R e during each of the 3 considered periods (before lockdown, during the first week of lockdown, and beyond 1 week of lockdown) for each region with 95% confidence intervals.

the numbers of ascertained (I), unascertained (A) and latent (E) infected individuals are on May 11 th 2020. However, it is not equal to 0, which means the epidemic is not extinct (and ready to bounce back as soon as lockdown is lifted). In Table 6 , we predict the optimal (i.e. shortest) duration of the lockdown to achieve extinction of the epidemic in each region, which is, Table 7 presents the proportion of infected individuals at various dates, i.e. instantaneous attack rates. We also predict them for three horizon dates assuming confinement would be maintained until these dates: 2020-05-15, 2020-06-08 and 2020-06-22. We predict the national French attack rate on May 15 th 2020 to be 3.8% [3.1%; 4.8%].

. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. 

. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. . 

. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. Table 7 : Model predictions for the proportion of Infected and Immunized in the population (deaths not taken into account), assuming continued lockdown until then.

.

It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. . lockdown lift on May 11 th 2020 We simulated the effect of lifting the lockdown on May 11 th 2020 assuming that after this date the transmission goes back to its value before lockdown. Figure 5 shows the predicted dynamics for each region. In every region, we observed a large rebound occurring either in June or July. The timing and magnitude of this rebound is largely influenced by the importance of the first wave, that is successfully contained thanks to the lockdown. These results strongly argue for enforcing other NPIs when lockdown is lifted in order to contain R e below 1 and prevent this predictable rebound of the epidemic.

In this work, we provide estimations of the key parameters of the dynamics of the COVID-19 epidemic in French regions as well as forecasts according to NPIs especially regarding the proportion of infected when lifting the lockdown policy.

The point estimates of the basic reproductive ratios for French regions fluctuated between 2.4 and 3.4 before lockdown took effect, but according to the uncertainty around these estimates they are not substantially different from one region to another. Therefore, observed differences in the number of cases were due to the epicemic starting first in Grand Est and Île-de-France regions. These estimates were close to those reported before isolation using other models (Alizon et al., 2020; Flaxman et al., 2020) . The model provided estimates of the impact of the lockdown on the effective reproductive ratio and although recent data led to a substantial reduction of R e after the lockdown, it remains close to 1 thus without a clear extinction of the epidemic. These estimates should be updated with more recent data that may lead to an estimated R below 1. On the other hand, it is an argument to add other measures such as intensive testing and strict isolation of cases. In addition, the model provides estimates of the size of the population of people who have been or are currently infected. As already reported (Di Domenico et al., 2020a) , this proportion of subjects is around 2 to 4 percent, so excluding any herd immunity and control of the epidemic by having a large proportion of people already infected and therefore not susceptible. With our estimates of basic reproduction ratio, the epidemic would become extinct by herd immunity with a proportion of 89.5% (95% CI [88.0%; 90.7%]) of infected people.

Interpretation of our results is conditional on the mechanistic model illustrated in Figure 1 , and careful attention must be given to the parameters set 24 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. . Table 2 , as updated estimates are published every day. First and foremost, our model takes only two kinds of infectious cases into account: confirmed cases I, and unascertained cases A. Our observation model takes I as the number of infectious cases confirmed by a positive PCR SARS-Cov-2 test. Thus, A can be interpreted as 25 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. . unconfirmed symptomatic cases that can be diagnosed by a GP visit (possibly through remote teleconsultation). This is a very simple representation of the COVID-19 infection, which can have various degrees of severity (e.g. asymptomatic, mild, severe) that could be themselves modeled into different compartments. However, very little data is currently available to gather sufficient information to be able to distinguish between those infectious states. Second, our model does not have a compartment for COVID-19 patients in ICU, and the number of occupied ICU beds is simply taken as a fixed percentage (25% based on an estimate from the Bordeaux CHU University Hospital). Meanwhile ICU bed capacity does not account for the recent surge of available ICU beds in response to the COVID-19 epidemic. Compared to , our model does not feature an inflow of susceptibles n (and matching outflow) but population movement across regions are limited during the isolation period (see Appendix A3 for a thorough discussion). Deaths were also not distinguished from recoveries in the R compartment, but over the observation period this did not impact the main estimates. Third, our model does not take into account the age-structure of the population on the contrary to the recently posted report Salje et al. (2020) using French data. Interestingly, although the models were different and the data not fully identical, our results were comparable. Actually, our approach captures a part of the unexplained variability between regions through the random effects. This variability might be explained at least partly through the difference in age-structure and probability of hospitalization according to the age.

We would like to underline the interest of making the data publicly accessible as done by Santé Publique France on the data.gouv.fr web portal, hence allowing our group to work immediately on the topics. Furthermore, we have made our code fully available on GitHub www.github.com/sistm/ SEIRcovid19 and we are currently working on packaging our software for facilitating its dissemination and re-use.

In conclusion, the lockdown has clearly helped controlling the epidemics in France in every region. The number of infected people varies from one region to the other because of the variations in the epidemic start in these regions (both in terms of timing and size). Hence, the predicted proportion of infected people as of May 11 varies, but stays below 10 % everywhere. It is clear from this model, as in other published models (Di Domenico et al., 2020b; Flaxman et al., 2020) , that a full and instantaneous lockdown lift would lead to a rebound. Additional measures may help in controlling the number of new infections such as strict case isolation, contact tracing (Di Domenico et al., 2020b) and certainly a protective vaccine for which the 26 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. . strategy of administration to the population remains to be defined (Amanat and Krammer, 2020; Lurie et al., 2020; Thanh et al., 2020) .

The data from the SurSaUD R database regarding COVID-19 is available from the data.gouv French government platform at https://www.data. gouv.fr/fr/datasets/donnees-des-urgences-hospitalieres-et-de-sosmedecins-relatives-a-lepidemie-de-covid-19. The source code used for this work is available on GitHub at www.github.com/sistm/SEIRcovid19. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. . strategies to reduce social mixing on outcomes of the covid-19 epidemic in wuhan, china: a modelling study. Lancet Public Health.

Raue, A., Kreutz, C., Maiwald, T., Bachmann, J., Schilling, M., Klingmüller, U., and Timmer, J. (2009) . Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood. Bioinformatics, 25(15):1923 -1929 . Roques, L., Klein, E., Papaix, J., Sar, A., and Soubeyrand, S. (2020 . Using early data to estimate the actual infection fatality ratio from covid-19 in france. medRxiv. Tian, H., Liu, Y., Li, Y., Wu, C.-H., Chen, B., Kraemer, M. U. G., Li, B., Cai, J., Xu, B., Yang, Q., Wang, B., Yang, P., Cui, Y., Song, Y., Zheng, P., Wang, Q., Bjornstad, O. N., Yang, R., Grenfell, B. T., Pybus, O. G., and Dye, C. (2020) . An investigation of transmission control measures during the first 50 days of the covid-19 epidemic in china. Science.

Valleron, A.-J., Bouvet, E., Garnerin, P., Ménares, J., Heard, I., Letrait, S., and Lefaucheux, J. (1986) . A computer network for the surveillance of communicable diseases: the french experiment. American journal of public health, 76(11):1289-1292.

Wang, C., Liu, L., Hao, X., Guo, H., Wang, Q., Huang, J., He, N., Yu, H., Lin, X., Pan, A., Wei, S., and Wu, T. (2020) . Evolving epidemiology and 31 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. . Figure S1 : Cumulative incidence hospitalization for COVID-19 at France national level according to either Santé publique France or the SurSaUD R database 33 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. .

Of note, the earlier slowing down of emergency admissions for COVID-19 suspicions observed in the SurSaUD R data compared to the SI-VIC data could lead to an optimistic biais regarding the epidemic stage and evolution in our estimations and predictions . However, since the SI-VIC data are not publicly accessible before March 18 th , we are currently unable to use this data source to estimate the impact of the lockdown.

Under-reporting or over-estimation of COVID-19 deaths According to Roques et al. (2020) , the Infection Fatality Ratio (IFR) for COVID-19 is 0.5% (95%-CI: 0.3; 0.8). Figure S2 shows that using this IFR over-estimate compared to the death currently reported by Santé Publique France. Figure S2 : Observed incident number of ascertained cases and hospitalization at France national level compared to predicted ones, based on estimations with data collected up to either March 16 (before lockdown), March 25 th or April 6 th (delimited by the vertical line).

. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. .

Based on ODE model (1), the statistical analysis of the population evolution can be turned into a parameter estimation problem. Theoretical identifiability (i.e. the possibility of learning the true values of the model parameters) of epidemiological compartment models is rarely checked. Although it relies on unrealistic hypothesis (namely that incident number of ascertained cases and hospitalized cases are observed continuously, without noise and for an infinite length of time in our case), this framework provides important guarantees for the results. To determine which parameters from Table  2 can be accurately identified from the available observations we first evaluate the identifiability of this SEIRAH structure with the DAISY software from Bellu et al. (2007) (based on differential algebra results). We conclude that there is global identifiability of ξ = (b, D q , r, D e , D I ) with known α and D h , even if initial conditions are unknown -which is our case for (E 0 , A 0 ). Practical identifiability (for which most existing evaluation methods rely on estimations and are based on Fisher information matrix or likelihood profiling (Raue et al., 2009)) of these parameters will be discussed in section 2.3. Figure S3 present the occupancy numbers of the six compartments of the model when one parameter is varied while the others are fixed. Because the epidemics start date and state is different in each region, it is necessary to estimate in priority E 0 and A 0 . b, D e and D I have a similar impact on the simulations, and we chose to prioritize the estimation of the transmission b as it is a important actionable driver of the epidemics, that can potentially be reduced by interventions. r and D q have little impact on I (in) but are informative for H (in) . Since r can be estimated from other data sources, we prioritized the estimation of D q . In addition, there is a strong rational to estimate b and D q as they are directly involved in the computation of the reproductive number. So in the end, we assume D e , D I , and r fixed and we will estimate E 0 , A 0 , b, and D q . Both D e and D I are set to a common value across regions, estimated from previous studies referenced in Table  2 . To set r, we used data collected from general practitioners through the re-purposing of the Réseau Sentinelles network to monitor COVID-19 and provide a weekly estimation of the number of incident symptomatic cases (regardless of their confirmation status through a PCR test) at the region level. Thus, for each region, r i is set to r s the ratio of observed incident number of ascertained cases over the incident number of symptomatic cases (as defined by the incident symptomatic cases reported by the Réseau Sentinelles network). Values are provided in Table 1. 35 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted April 24, 2020. . is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 24, 2020. .

The model ( (1)) is a simplified version of the ODE system presented in Wang et al. (2020):

In the ODE system (9), consider the possibility of an exogenous flow of susceptible modeled by a sustained susceptible inflow n. While this make sense for modeling an outbreak in a region surrounded by other regions free of the epidemic (such as the first outbreak of COVID-19 in Wuhan (Hubei, China) in late 2019 -early 2020), it is not suitable for a pandemic where the pathogen is circulating in all regions, which is the current situation in all of France COVID-19.

Nonetheless we study the possible asymptotic steady states of this model, i.e. the constant values S (0) , E (0) , I (0) , R (0) , A (0) , H (0) possibly reached by the system when t → ∞. Without loss of generality, we assume the initial number of removed subjects is set to R(t = 0) = 0. By definition, the steady 37 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 24, 2020 . . https://doi.org/10.1101 states verify:

with β 1 = D e /r (1/D q + 1/D i ) and β 2 = D h /D q .

In the particular case of n = 0, the fifth equation in (10) can be simplified and imposes A (0) = D i (1 − r)β 1 I (0) /D e which in combination with the third equation leads to I (0) = 0 and thus A (0) = H (0) = E (0) = 0. Assuming that A(t) = I(t) = 0 for all times t and applying the population conservation principle we identify (N, 0, 0, 0, 0, 0) as a steady state. Since the ODE imposes that S is strictly decreasing as long as S(t) > 0 and A(t) > 0 (or I(t) > 0) and that S and A (or I) cannot reach 0 in finite time if A 0 = 0 (or I 0 = 0), this steady state (N, 0, 0, 0, 0, 0) is unstable. Having A 0 = 0 (or I 0 = 0) leaves us with S (0) = 0 as the only possible asymptotic steady state value and since conservation principle imposes S 0 + E 0 + I 0 + R 0 + A 0 + H 0 = N , we end up with R (0) = N and we identify S (0) , E (0) , I (0) , R (0) , A (0) , H (0) = (0, 0, 0, N, 0, 0), as the only possible asymptotic steady when A 0 = 0 (or I 0 = 0).

But in presence of an inflow of susceptible n = 0, can we expect an asymptotic behavior with extinction of the epidemic. If we set A (0) = I (0) = 0 in (10), the first and fourth equations give us:

hence the potential steady state corresponding to the extinction imposes S (0) = N , R (0) = 0 and (0, 0, 0, N, 0, 0) is no longer a possible steady state point. The only steady state (N, 0, 0, 0, 0, 0) corresponds to the case where no one gets infected nor removed i.e when the epidemic does not start at all.

. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 24, 2020. . In this new setting, A (0) , S (0) and R (0) are now functions of I (0) and given by:

with β 4 = (1 − r)β 1 D i /D e . From algebraic manipulation of (10), we derive that a necessary and sufficient condition for I (0) to constitute a steady set is to be a positive real solution of the fourth-order polynomial equation:

bD e S (0) (nD i + (αβ 4 + 1) (N − (1 + β 2 )I (0) ))(N − (1 + β 2 )I (0) ) −β 1 N (nD e + N − (1 + β 2 )I (0) ) nD i + N − (1 + β 2 )I (0) = 0 39 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted April 24, 2020. .

Sars-cov-2 vaccines: status report. Immunity

Daisy: A new software tool to test global identifiability of biological and physiological systems

The French syndromic surveillance system SurSaUD R

Convergence of a Stochastic Approximation Version of the EM Algorithm

Report #8: Expected impact of school closure and telework to mitigate covid-19 epidemic in france

Report #9: Expected impact of lockdown in île-de-france and possible exit strategies

Statistique annuelle des établissements de santé

Transmission dynamics of the covid-19 outbreak and effectiveness of government interventions: A data-driven analysis

Modelling the impact of covid-19 upon intensive care services in new south wales

Isolated sudden onset anosmia in covid-19 infection

Critical Care Utilization for the COVID-19 Outbreak in Lombardy, Italy: Early Experience and Forecast During an Emergency Response

Covid-19: a remote assessment in primary care

The effect of human mobility

Maximum likelihood estimation in nonlinear mixed effects models

The Incubation Period of Coronavirus Disease 2019 (COVID-19) From Publicly Reported Confirmed Cases: Estimation and Application

Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia

Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov2)

Monolix v2019r2

Developing covid-19 vaccines at pandemic speed

Predicting the number of reported and unreported cases for the covid-19 epidemic in south korea, italy, france and germany. medRxiv. impact of non-pharmaceutical interventions on the outbreak of coronavirus disease

The role of age distribution and family structure on covid-19 dynamics: A preliminary modeling assessment for hubei and lombardy

Coronavirus disease 2019 (covid-19) situation report -84

Naming the coronavirus disease (covid-19) and the virus that causes it

Who director-general's opening remarks at the media briefing on covid-19 -11

Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72,314 Cases From the Chinese Center for Disease Control and Prevention

A Novel Coronavirus from Patients with Pneumonia in China

The authors thank Romain Greffier for his time in discussing the aggregated features of COVID-19 patients care at the Bordeaux University Hospital. The authors thank the opencovid-19 initiative for their contribution in opening the data used in this article. BPH thanks Vincent Pey for discussions about the clinical characteristics of the COVID-19 infection. This work is supported in part by the GESTEPID Inria Mission COVID19.

The authors have no competing interests to declare.

MP, LW, RT and BPH designed the study. MP and BPH analyzed the data. MP, DD and BPH implemented the software code. QC performed identifiability and asymptotic analysis of the model. MP, LW, RT and BPH interpreted the results and wrote the manuscript.

One of the difficulties in modeling the COVID-19 epidemic in real-time is the reliability and comparability of data sources an reporting artifacts.Differences bewteen the SurSaUD R database and Santé publique France reports The Santé Publique France epidemiological reports cite SI-VIC as their source. SI-VIC is a special system that can be activated for the identification and monitoring of victims from terror attacks and exceptional health situations, and that was activated on March 13 th 2020 for the COVID-19 epidemic and is maintained by the French Agence du Numérique en Santé. Because SI-VIC is a declarative web-based plateform (with data populated by regional health agencies, emergency services and hospitals) it was likely missing some case declaration when first launched. On the contrary, the SurSaUD R database. The Figure S1 illustrate the discrepencies and differences between the two database.