key: cord-0947254-dm8ueot1
authors: Simoni, G.; Fochesato, A.; Reali, F.; Giordano, G.; Domenici, E.; Marchetti, L.
title: Short-term analysis and long-term predictions for the COVID-19 epidemic in a seasonality regime: the Italian case
date: 2020-07-16
journal: nan
DOI: 10.1101/2020.07.15.20154500
sha: 7cf2d394e5b60d11cd8c7abe1f4416ec884f0ab1
doc_id: 947254
cord_uid: dm8ueot1

As of July 14th, COVID-19 has caused 34,984 deaths and 243,344 infection cases in Italy. Strict lockdown policies were necessary to contain the first outbreak wave and prevent the Italian healthcare system from being overwhelmed by patients requiring intensive care. After the progressive reopening, predicting how the epidemic will evolve is urgent and fundamental to controlling any future outbreak and preventing a second wave. We defined a time-varying optimization procedure to repeatedly calibrate the SIDARTHE model with data up to June 24th. The computed parameter distributions allow us to robustly analyse how the epidemic evolved and to outline possible future scenarios. Assuming a seasonal regime for COVID-19, we tested different lockdown policies. Our results suggest that an intermittent lockdown in which six "open days" are allowed in every two-week period may prevent a resurgent exponential outbreak and, at the same time, ease the societal burden of an extensive lockdown.

The new strain of coronavirus SARS-CoV-2, which causes a severe and potentially fatal respiratory syndrome named COVID-19, was initially identified in the Hubei province of China in late 2019 and rapidly spread worldwide, forcing the World Health Organization (WHO) to declare a pandemic on March 11th, 2020 2. Most governments have established tight policies to slow down the spread of COVID-19, which had caused 548,211 deaths as of July 8th. In Italy, one of the first and most affected Western countries, the interventions shifted from initial social distancing measures to a drastic nation-wide lockdown that started on March 11th, 2020 and lasted until May 4th. A massive swab campaign was set up to detect and isolate infected people, first symptomatic and later also asymptomatic. Since the beginning of May, more than two months after the Italian outbreak on February 20th, 2020, the situation has been coming under control, with a steady decrease in the number of new confirmed cases as well as of hospitalised and Intensive Care Unit (ICU) patients. Such encouraging data allowed a cautious relaxation of the restrictive measures: gradual reopening of economic activities and free circulation of people. In this context, the ability to monitor and predict the progress of the disease is crucial to guide policymakers in deciding how to prevent and counter possible future outbreaks.

Mathematical models offer a data-driven and quantitative understanding of the disease, allowing for probabilistic insights and future scenario predictions. Many researchers have proposed models for COVID-19 [3] [4] [5] [6], building upon the common SIR-SEIR models for human-to-human transmission to describe the current epidemic. In this paper, we analyse the different phases of the spread of the disease in Italy using the recent SIDARTHE model 1, which captures the population granularity by distinguishing between detected and undetected infected subjects as well as between asymptomatic and symptomatic infected subjects. We include as Extended Figure 1 a graphical representation of the model, described in Methods.

We used a global optimization algorithm 7, 8 to estimate a subset of the model parameters on the basis of the data provided by the Italian Protezione Civile on detected cases, recoveries, and deaths. We considered the period from February 24th to June 24th with an updating fitting strategy that reflects the shift in the adopted countermeasures, from full lockdown (Phase I), to partial restrictions (Phase II), to COVID-19-aware reopening (Phase III). The National Decrees of March 1st, 11th, 22nd, April 10th, May 4th, 18th, and the new swab policy of March 28th were used as critical events to mark the updates. We computed 100 repeated model calibrations to obtain a robust distribution of each parameter estimate and, in turn, stable model predictions. More details about the time-varying calibration procedure can be found in Methods. Figure 1 shows the short-term evolution of the model dynamics compared with the data used for the calibrations. In Extended Figure 2, the estimated model parameters show a decreasing trend for the infection rates and an increasing trend for the recovery rates during the considered period. This reflects the efficacy of the restrictive measures in slowing down virus transmission and the experience acquired in treating patients. In accordance with the model parameter trend, we observe a smoothly decreasing curve for R0 that starts from a maximum value of around 6 on February 24th and crosses the critical value of 1 at the end of March 9 (Figure 2a). The quartile curves are tighter in the tail of the data period, when the value of R0 fell below 1, suggesting a reliable estimation in the sensitive domain (0,1).

Combining the granularity of the SIDARTHE model with our fitting protocol allows us to estimate the undetected cases, who would need to be traced and tested in order to control the spread of the epidemic 10, 11. In agreement with the estimation of Pedersen et al. 12, we observed an initial ratio between undetected and detected cases of around 10:1 (Figure 2b). This result provides a possible quantitative explanation for the exponential spread of COVID-19 in Italy in late February and early March: the virus was circulating undetected. In addition, the model suggests that the social distancing measures and the swab campaign led to a decrease of this ratio, up to a reversal, reflecting increasing control and management of the epidemic. Indeed, to avoid a second outbreak, it is crucial to identify and isolate suspected infection cases with an effective testing policy [13] [14] [15]. In this way, an uncontrolled spread of the virus could be avoided, reducing the risk of a new outbreak.

Figure 2b highlights how undetected cases prevail in the early stage of the epidemic curve, suggesting that a relevant number of asymptomatic and undetected patients were present in Italy [16] [17] [18], as well as in the rest of Europe, even before the implementation of the WHO COVID-19 surveillance protocol in Europe in late January 19. In this regard, we performed backward simulations to estimate the day zero of the epidemic in Italy. We tested different combinations of initial undetected asymptomatic and symptomatic cases to determine the time needed to reach the values reported on February 24th. More details about the backward integration can be found in Methods. Our analysis suggests the presence of a few undetected cases in Italy as early as late November to early December, consistent with a recent study that confirmed the presence of the virus in northern Italy wastewater on December 18th, 2019 20. A similar conclusion was drawn from the presence of IgM/IgG antibodies against the SARS-CoV-2 Nucleocapsid protein in blood samples collected in Milan at the start of the outbreak 21.

Model predictions based on parameter estimations of late Phase I (April 14th-May 4th) and late Phase II (May 18th-June 15th) allowed us to compare the long-term epidemic curves for the two scenarios. In Figure 3, we observe that, unexpectedly, the simulated scenario with late Phase I parameters (full lockdown) predicts more deaths and infected cases, and fewer recovered cases, than the late Phase II data and simulations. This computational evidence suggests that, while the end of the lockdown did not perceptibly increase the infection rates, the epidemic may be less virulent in terms of pathological effects. The use of the late Phase II data for the comparison reinforces this hypothesis, since only the first weeks (May 4th-May 18th) are likely influenced by previous lockdown policies.

Despite the reported natural evolution of the SARS-CoV-2 virus [22] [23] [24] [25], there is no clear-cut genetic evidence of mutations that are weakening the virus. Thus, the virus may be losing strength in Italy due to external factors, such as weather conditions (temperature and humidity) and persistent awareness and responsible behaviour (use of face masks, frequent hand washing, increased adoption of smart working), which may play a crucial role in the evolution of COVID-19 in Italy.

In the last part of our analysis, we focused on the evaluation of future scenarios. At this stage, the eradication of COVID-19 through social distancing measures and public health system efforts alone, as happened in 2003 for SARS-CoV-1 26, appears unlikely. On the contrary, SARS-CoV-2 may continue to circulate with a seasonal component after the first epidemic wave, as is the case for two different strains of human coronavirus, HCoV-HKU1 and HCoV-OC43 27, 28. Since at the moment no vaccines or preventative pharmaceutical treatments are readily available for COVID-19, we investigated the effects of different social distancing measures on possible future seasonal waves of COVID-19. Given the negative impact of the lockdown on the economy 29, we analysed a trade-off between controlling a new epidemic wave and allowing commercial activities and industries to stay open for a continuous period.

We tested long-term scenarios considering a gradual worsening of the epidemic situation in the autumn/winter period. We implemented a seasonality function, defined on the basis of hyperbolic tangents, which smoothly changes the contagion parameters. We simulated an increase in the infection rates in the period October-November to reach a condition similar to the one experienced in early March 2020. Conversely, the infection rates were decreased in April-May to restore the spreading condition of the current summer period. All the mathematical details can be found in Methods. In our simulations, we imposed an intermittent lockdown 30 (as opposed to the extended lockdown imposed during the first epidemic wave) and investigated the effect of an increasing number of "open days" (no lockdown) in each two-week period. The intermittent lockdown is introduced once the reported daily cases exceed 500 infected nation-wide for at least three consecutive days and is then simulated until the end of May. Indeed, starting from April, the spreading parameters are smoothly restored to those of the summer period (estimated in the subinterval May 18th-June 24th) and the lockdown policy is no longer necessary. Figure 4 shows the scenario with six "open days" every two weeks, while Extended Figure 3 reports the results for different combinations of "open days".

The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license. This version posted July 16, 2020.

Comparing the model simulations with the ICU-bed capacity, our results suggest that 6 "open days" out of 14 is the maximal threshold that can be endured by the Italian healthcare system. The prediction shows the median value for the epidemic curve, as well as the first and the third quartiles, representing virtuous and bad habits of the Italian population, respectively. The second peak reaches its maximum value in April 2021, suggesting that the timely application of social distancing measures can delay the epidemic curve with respect to the 2020 peak. However, the delay gives rise to an increased number of infected cases during the summer, resulting in a 2022 epidemic wave that is higher than that of 2021. A similar prediction is obtained when the intermittent lockdown is completely driven by daily infected cases. We considered the seasonal component, and we imposed the lockdown every time the daily number of infected exceeds 500 cases for at least three consecutive days, as in Figure 4. Once started, the lockdown is enforced without interruptions until the simulated daily number of infected decreases to fewer than 250 cases for at least one week. Extended Figure 4 shows that the median curve of the case-driven lockdown is slightly better than the one presented in Figure 4, suggesting that the number of infected people, in the long run, is lower. However, in a case-driven lockdown policy, the number of "open days" and "close days" (strict lockdown for the population) is not constant over time, which makes the policy harder for the population to manage. With the thresholds of 500 and 250 daily cases driving the lockdown, our results show a mean value, computed over all the intermittent lockdowns, of 8 "open days" alternated with 12 "close days", suggesting a slight advantage over the 6 "open days" and 8 "close days" in Figure 4.
During the analysis, we also tested a scenario where only 150 daily cases were required for the opening policy, but no significant differences were observed in the model dynamics, suggesting that our model predictions are only weakly affected by the chosen opening threshold.

Our results suggest that, in the case of future seasonal outbreaks, an intermittent lockdown could be a valid alternative to the extensive lockdown enforced during the first epidemic wave. The intermittent strategy relieves the economic and psychological burdens on society, and is still able to contain the epidemic and avoid an uncontrolled exponential outbreak. We also highlighted the importance of non-pharmaceutical interventions, such as social distancing measures, frequent hand washing and wearing protective masks. As we show in Figure 3, these precautions seem crucial to contain the epidemic spread, even without a lockdown. Hence, our study suggests that keeping correct behaviours represents the first line of defence against new COVID-19 outbreaks, both local and imported ones 33.

The main limitation of these results is that all the long-term predictions may change due to the potential approval and distribution of new pharmaceutical interventions, in the form of either drugs or vaccines [34] [35] [36]. Also, our results represent the overall situation in Italy, without considering that each region had a very different first epidemic wave. To describe the epidemic locally, regional calibrations of the model as well as spatial interactions between the regions should be considered 6, 37, 38. However, beyond our analysis of the Italian situation as a case study, our time-varying protocol for model calibration is a general framework, and it may be applied, with only minor effort, to any available data on COVID-19. Another possible extension of this work could be the introduction of age classes 39, 40. COVID-19 affects age groups differently, with the younger population much less severely affected than the elderly; however, no public repository of daily age-structured cases is currently available, which would be needed for the model calibrations. Even considering these limitations, our analysis based on the SIDARTHE model provides insights into the dynamics of the spread during the various phases of the first Italian epidemic wave. Moreover, it allows us to test multiple scenarios, informing policymakers on possible strategies to contain resurgent outbreaks at a nation-wide scale.

Extended Figure 1: Network


The original SIDARTHE model 1 divides the entire population into 8 mutually exclusive compartments describing different infection stages: each individual can be either susceptible (S), undetected asymptomatic or pauci-symptomatic infected (I), detected asymptomatic infected (D), undetected symptomatic infected (A), detected symptomatic infected (R), detected life-threatened symptomatic infected (T), recovered (H) or dead (E). In this contribution we also included an explicit subdivision of the recovered patients between those who had been previously detected (H1) and those who had not (H2), which eases the parameter calibration for the model. Hence, the SIDARTHHE model we actually considered is a closed system of ODEs involving 9 variables that closely retraces the original model. Extended Figure 1 visualizes the flows among the different compartments, representing infection stages, which govern the epidemic evolution. We refer to the original work 1 for more details on the model parameters.
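The compartment structure described above can be sketched in code. This is only an illustrative reconstruction, not the authors' implementation: the flow structure and Greek-letter parameter names are assumed to follow the original SIDARTHE model 1, the routing of recoveries into H1 (from D, R, T) and H2 (from I, A) is inferred from the text, and the numerical values are placeholders rather than estimates from this work.

```python
# Hypothetical sketch: names follow the original SIDARTHE notation (assumption);
# the split of recoveries into H1 (detected) and H2 (undetected) is inferred.

def sidarthhe_rhs(state, p):
    """Right-hand side of the nine-compartment model (population fractions)."""
    S, I, D, A, R, T, H1, H2, E = state
    new_infections = S * (p["alpha"] * I + p["beta"] * D
                          + p["gamma"] * A + p["delta"] * R)
    return [
        -new_infections,                                                # S
        new_infections - (p["epsilon"] + p["zeta"] + p["lambda"]) * I,  # I
        p["epsilon"] * I - (p["eta"] + p["rho"]) * D,                   # D
        p["zeta"] * I - (p["theta"] + p["mu"] + p["kappa"]) * A,        # A
        p["eta"] * D + p["theta"] * A - (p["nu"] + p["xi"]) * R,        # R
        p["mu"] * A + p["nu"] * R - (p["sigma"] + p["tau"]) * T,        # T
        p["rho"] * D + p["xi"] * R + p["sigma"] * T,                    # H1
        p["lambda"] * I + p["kappa"] * A,                               # H2
        p["tau"] * T,                                                   # E
    ]


def rk4_step(state, p, h):
    """One classical fourth-order Runge-Kutta step of size h (in days)."""
    def shifted(k, factor):
        return [x + factor * dx for x, dx in zip(state, k)]
    k1 = sidarthhe_rhs(state, p)
    k2 = sidarthhe_rhs(shifted(k1, h / 2), p)
    k3 = sidarthhe_rhs(shifted(k2, h / 2), p)
    k4 = sidarthhe_rhs(shifted(k3, h), p)
    return [x + h / 6 * (a + 2 * b + 2 * c + d)
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)]


def simulate(state, p, days, steps_per_day=10):
    """Return the daily trajectory, starting with the initial state."""
    trajectory = [list(state)]
    for _ in range(days):
        for _ in range(steps_per_day):
            state = rk4_step(state, p, 1.0 / steps_per_day)
        trajectory.append(list(state))
    return trajectory


# Placeholder rates (per day) and initial fractions, for illustration only.
params = {
    "alpha": 0.570, "beta": 0.011, "gamma": 0.456, "delta": 0.011,
    "epsilon": 0.171, "theta": 0.371, "zeta": 0.125, "eta": 0.125,
    "mu": 0.017, "nu": 0.027, "tau": 0.01,
    "lambda": 0.034, "rho": 0.034, "kappa": 0.017, "xi": 0.017, "sigma": 0.017,
}
infected0 = 1e-5
initial = [1.0 - infected0, infected0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
trajectory = simulate(initial, params, days=30)
```

Because every outflow of one compartment is an inflow of another, the nine fractions always sum to one, which gives a quick sanity check on any implementation of the system.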

The data source used to estimate the model parameters is the GitHub page of the Italian Protezione Civile, where new data on the number of detected quarantined people, detected hospitalized patients, ICU patients, recovered people (confirmed by two consecutive negative swabs) and deaths are uploaded daily. We used data from February 24th, 2020 to June 24th, 2020, for a time period of 122 days.

We defined an ad hoc fitting strategy to manage the policy shifts that occurred in the analyzed time interval by means of parameter updates reflecting those changes. The idea is that model parameters are not constant in time, but rather evolve in response to changes in policies and behaviors (e.g., washing hands more frequently and wearing protective face masks) and to the evolution of the disease. In accordance with Italian national decrees and general measures, March 1st, 11th, 22nd, 28th, April 10th, May 4th and 18th were selected as critical days to mark the parameter updates. In detail, we estimated the model parameters for In this way, we reduced the risk of bias in the computed parameter estimates due to some subintervals being shorter than the COVID-19 incubation period. In addition, prior knowledge of standard epidemic mechanisms as well as of COVID-19 specificities was used to constrain the objective function. In particular, we made assumptions to model the greater probability of being detected if symptoms are visible (parameter θ > parameter ε) and the relationships among infection rates, with undetected asymptomatic infected more likely to spread the epidemic than both undetected symptomatic (parameter α > parameter γ), who are supposed to stay at home to recover, and detected asymptomatic/symptomatic (parameter α > parameter γ > parameters β and δ), who are assumed to be quarantined. We assumed the same contagion rate for the two classes of detected cases (parameter β = parameter δ) and the same healing rate for the asymptomatic classes (parameter λ = parameter ρ) and the symptomatic ones (parameter κ = parameter ξ). We also imposed a maximum value for R0 equal to 6, as reported in the literature 41.
Moreover, we estimated the initial value of those model variables that do not have a direct corresponding value in the available data (I, D, A, R and H2) under the assumption that the number of undetected cases was greater than the detected one.
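The prior-knowledge constraints listed above can be expressed as a simple feasibility check on a candidate parameter set. The sketch below is hypothetical: the Greek-letter names are assumed to follow the original SIDARTHE model 1, the R0 expression is the one recalled from that work and should be checked against it, and the parameter values are placeholders rather than estimates from this study. How such a check enters the objective function (for example as a penalty term) is not specified in the text.

```python
import math


def basic_reproduction_number(p):
    """R0 for the SIDARTHE structure (expression assumed from the original model)."""
    r1 = p["epsilon"] + p["zeta"] + p["lambda"]
    r2 = p["eta"] + p["rho"]
    r3 = p["theta"] + p["mu"] + p["kappa"]
    r4 = p["nu"] + p["xi"]
    return (p["alpha"] / r1
            + p["beta"] * p["epsilon"] / (r1 * r2)
            + p["gamma"] * p["zeta"] / (r1 * r3)
            + p["delta"] * p["eta"] * p["epsilon"] / (r1 * r2 * r4)
            + p["delta"] * p["zeta"] * p["theta"] / (r1 * r3 * r4))


def constraint_violations(p, r0_max=6.0):
    """Return the names of the prior-knowledge constraints that p violates."""
    checks = {
        "theta > epsilon": p["theta"] > p["epsilon"],    # symptomatic detected more easily
        "alpha > gamma": p["alpha"] > p["gamma"],        # asymptomatic undetected spread most
        "gamma > beta": p["gamma"] > p["beta"],          # detected cases are quarantined
        "gamma > delta": p["gamma"] > p["delta"],
        "beta = delta": math.isclose(p["beta"], p["delta"]),
        "lambda = rho": math.isclose(p["lambda"], p["rho"]),
        "kappa = xi": math.isclose(p["kappa"], p["xi"]),
        "R0 <= r0_max": basic_reproduction_number(p) <= r0_max,
    }
    return [name for name, satisfied in checks.items() if not satisfied]


# Placeholder values for illustration only (not estimates from this work).
params = {
    "alpha": 0.570, "beta": 0.011, "gamma": 0.456, "delta": 0.011,
    "epsilon": 0.171, "theta": 0.371, "zeta": 0.125, "eta": 0.125,
    "mu": 0.017, "nu": 0.027, "tau": 0.01,
    "lambda": 0.034, "rho": 0.034, "kappa": 0.017, "xi": 0.017, "sigma": 0.017,
}
violations = constraint_violations(params)
```

A candidate that returns an empty list satisfies all the stated assumptions; any returned name identifies which assumption a proposed parameter set breaks.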

For the calibration procedure, we used a global optimization algorithm, CMA-ES 7, 8, a derivative-free evolutionary strategy for optimization problems. The estimates for the model parameters are computed in the range 10 "# − 1 (except for , and for which the lower bound was fixed to 5 ⋅ 10 "$ ), while for the initial model variables we imposed a range equal to 10 "% − 10 "& (as a fraction of the Italian population). The value of parameters ζ and η was fixed to 0.125, corresponding to a mean of 8 days to develop symptoms. We calibrated the model 100 times and worked with the resulting parameter distributions to compute the long-term predictions.

We estimated the actual day zero of the epidemic outbreak, when the first unrecognized cases occurred, with a forward "shooting" approach, which was preferred to the more common backward integration technique because of its lighter computational burden. In particular, we selected a hypothetical but plausible number of undetected infected as initial conditions at day 0, and we used the estimated model parameters of the first subinterval (February 24th-March 1st) to simulate the model. We reasoned that the parameters of this subinterval best suit our purposes, since they are not affected by any countermeasure and hence better reflect the period in which the epidemic was still unrecognized. The simulations start at day 0 and terminate when all the model variables are close to the values estimated for February 24th. The time interval needed to reach the termination criterion allowed us to locate day zero on the calendar by going back from February 24th by the corresponding number of days. Table 1 shows some results we obtained by varying the initial conditions at day zero. Other combinations of initial conditions, involving at most 10 patients, were considered to check the consistency of the estimates. All of them predicted day 0 to lie between late November and early December.

Table 1: Different combinations of asymptomatic and symptomatic undetected cases were tested to predict the day 0 of the epidemic outbreak. Each parameter set computed with the repeated calibrations has been used to make the prediction, thus we
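The forward "shooting" idea can be illustrated with a minimal sketch: start from an assumed initial state, step the model forward one day at a time until a termination test is met, and then count the elapsed days back from the reference date. The function and argument names are hypothetical, and the one-variable exponential growth used in the example is a toy stand-in for the full model, with an arbitrary growth rate and target.

```python
from datetime import date, timedelta


def estimate_day_zero(step, initial_state, reached_target, reference_date,
                      max_days=365):
    """Simulate forward from day 0 until the target is reached.

    step:            function advancing the state by one day
    reached_target:  predicate telling whether the state matches the
                     values estimated at the reference date
    Returns (calendar date of day zero, number of elapsed days).
    """
    state = initial_state
    for elapsed in range(max_days + 1):
        if reached_target(state):
            return reference_date - timedelta(days=elapsed), elapsed
        state = step(state)
    raise RuntimeError("target not reached within max_days")


# Toy example: one undetected case growing by an assumed 10% per day until
# an assumed 1000 undetected cases are present on February 24th, 2020.
day_zero, elapsed_days = estimate_day_zero(
    step=lambda infected: infected * 1.1,
    initial_state=1.0,
    reached_target=lambda infected: infected >= 1000.0,
    reference_date=date(2020, 2, 24),
)
```

Running the same routine for each calibrated parameter set and each choice of initial conditions would give a distribution of day-zero dates rather than a single value.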


In all the long-term predictions we considered the first, second and third quartiles computed from the parameter distributions to represent an optimistic, median and pessimistic scenario, respectively.
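As a minimal sketch of this step, the quartiles of each parameter over the repeated calibrations can be collected into three parameter sets. Taking the quartiles parameter by parameter and labelling the first quartile as optimistic are assumptions made here for illustration (the labelling is natural for contagion rates, less so for recovery rates); the function names and the sample values are hypothetical.

```python
import statistics


def quartile_scenarios(samples):
    """First, second and third quartiles of one parameter's estimates."""
    q1, q2, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    return {"optimistic": q1, "median": q2, "pessimistic": q3}


def scenario_parameter_sets(calibrations):
    """Build one parameter set per scenario.

    calibrations: list of dicts, one per repeated calibration,
                  all sharing the same parameter names.
    """
    names = list(calibrations[0].keys())
    per_parameter = {
        name: quartile_scenarios([c[name] for c in calibrations])
        for name in names
    }
    return {
        label: {name: per_parameter[name][label] for name in names}
        for label in ("optimistic", "median", "pessimistic")
    }


# Made-up estimates of a single contagion rate from five calibrations.
example = [{"alpha": a} for a in (0.20, 0.22, 0.25, 0.27, 0.31)]
scenarios = scenario_parameter_sets(example)
```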

We imposed a seasonal component for the contagion parameters in the long-term predictions. The four contagion rates (α, β, γ, δ) are defined as smooth functions of time built from two hyperbolic tangents, one for the autumn increase and one for the spring decrease.

For both hyperbolic tangents, the transition duration is equal to two months, while the two transition times correspond to the first of October and the first of April, respectively. With this procedure, we were able to simulate a smooth increase of the contagion parameters that starts on October 1st and requires around two months to grow from the smaller (summer) value to the greater (winter) value. Conversely, we have a smooth decrease that starts on April 1st and requires around two months to fall from the greater value back to the smaller value.
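A seasonality function of this kind can be sketched as follows. The exact functional form used in the paper is not recoverable from the text, so the expression below is an assumption: one rising and one falling hyperbolic tangent, each centred on its transition time. All names and numerical values are hypothetical. Since tanh completes most of its swing over about four widths, a width of 15 days gives a transition of roughly two months.

```python
import math


def seasonal_rate(t, low, high, t_rise, t_fall, width):
    """Contagion rate at day t, moving smoothly between `low` and `high`.

    t_rise / t_fall: centres (in days) of the autumn increase and the
                     spring decrease; width: tanh scale in days.
    Assumed form, not necessarily the one used in the paper.
    """
    rise = math.tanh((t - t_rise) / width)
    fall = math.tanh((t - t_fall) / width)
    return low + 0.5 * (high - low) * (rise - fall)


# Illustrative values: days counted from an arbitrary origin, with the
# autumn transition centred on day 100 and the spring one on day 282.
summer_value = seasonal_rate(0, low=0.2, high=0.5, t_rise=100, t_fall=282, width=15)
winter_value = seasonal_rate(191, low=0.2, high=0.5, t_rise=100, t_fall=282, width=15)
```

Before the first transition and after the second one the two tangents cancel and the rate equals the smaller value; between the two transitions they add up and the rate equals the greater value.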

We implemented an intermittent lockdown 30 taking place during the seasonal outbreak of the epidemic (from October to June). The lockdown starts when the number of daily infected cases is above 500 for at least three consecutive days. During the period of lockdown, we alternate "close days", where a full lockdown is imposed on the population, and "open days", where no lockdown is present. To simulate the alternation between close and open days, the contagion parameters are switched between the smaller and the greater values, respectively, as detailed in the previous section. However, in this case, the smaller values, representing the contagion rates during the "close days" of the lockdown, are set equal to the mean of the contagion estimates during the last two subintervals of the extended lockdown (March 28th-April 14th and April 14th-May 4th).

In our analysis, two different scenarios were tested for the intermittent lockdown. In the first scenario, we imposed a constant number of open and close days and we tested different combinations of the two. In the second scenario, the lockdown is completely driven by the number of daily infected cases. A close policy starts every time there are more than 500 detected cases for at least three days. Conversely, an open policy starts when there are fewer than 250 daily cases for at least one week.
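The two lockdown scenarios can be sketched as simple decision rules. This is a hypothetical illustration: the function names are invented, the ordering of open and close days within each two-week cycle is an assumption, and the case-driven rule is applied here to a fixed series of daily cases, whereas in the actual simulations the policy feeds back into the contagion rates and therefore into the cases themselves.

```python
def fixed_schedule_is_open(day_in_lockdown, open_days=6, period=14):
    """First scenario: `open_days` open days in every `period`-day cycle.

    Placing the open days at the start of the cycle is an assumption.
    """
    return day_in_lockdown % period < open_days


def case_driven_policy(daily_cases, close_threshold=500, close_run=3,
                       open_threshold=250, open_run=7):
    """Second scenario: switch policy according to the daily detected cases.

    Close after `close_run` consecutive days above `close_threshold`;
    reopen after `open_run` consecutive days below `open_threshold`.
    Returns one "open"/"close" label per day.
    """
    closed = False
    above = below = 0
    status = []
    for cases in daily_cases:
        above = above + 1 if cases > close_threshold else 0
        below = below + 1 if cases < open_threshold else 0
        if not closed and above >= close_run:
            closed = True
        elif closed and below >= open_run:
            closed = False
        status.append("close" if closed else "open")
    return status


# Made-up series: two quiet days, three days above 500, then low counts.
example_cases = [100, 100, 600, 600, 600] + [200] * 7 + [100]
example_status = case_driven_policy(example_cases)
```

In this sketch the switch takes effect on the same day the run of consecutive days is completed; delaying it to the following day would be an equally plausible reading of the rule.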


Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy

World Health Organization declares global emergency: A review of the 2019 novel coronavirus (COVID-19)

Feasibility of controlling COVID-19 outbreaks by isolation of cases and contacts

Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period

The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak. Science

Spread and dynamics of the COVID-19 epidemic in Italy: Effects of emergency containment measures

Adapting arbitrary normal mutation distributions in evolution strategies: the covariance matrix adaptation

Completely derandomized self-adaptation in evolution strategies

Inferring the COVID-19 infection curve in Italy

Estimating the undetected infections in the Covid-19 outbreak by harnessing capture-recapture methods

Estimation of COVID-19 outbreak size in Italy

Quantifying undetected COVID-19 cases and effects of containment measures in Italy: Predicting phase 2 dynamics

Universal weekly testing as the UK COVID-19 lockdown exit strategy

COVID-19: extending or relaxing distancing control measures

Covid-19 mass testing facilities could end the epidemic rapidly

The early phase of the COVID-19 outbreak in Lombardy, Italy. arXiv

Tracing DAY-ZERO and Forecasting the COVID-19 Outbreak in Lombardy, Italy: A Compartmental Modelling and Numerical Optimization Approach

Assessment of the SARS-CoV-2 basic reproduction number, R0, based on the early phase of COVID-19 outbreak in Italy

First cases of coronavirus disease 2019 (COVID-19) in the WHO European Region

SARS-CoV-2 has been circulating in northern Italy since December 2019: evidence from environmental monitoring

SARS-CoV-2 seroprevalence trends in healthy blood donors during the COVID-19

Mutated COVID-19 may foretell a great risk for mankind in the future

A virus that has gone viral: amino acid mutation in S protein of Indian isolate of Coronavirus COVID-19 might impact receptor binding, and thus, infectivity

The COVID-19 Pandemic: A Comprehensive Review of Taxonomy, Genetics, Epidemiology, Diagnosis, Treatment, and Control

Spike mutation pipeline reveals the emergence of a more transmissible form of SARS-CoV-2

Outbreak reporting is expected and respected

Human coronavirus circulation in the United States 2014-2017

Epidemiology, Genetic Recombination, and Pathogenesis of Coronaviruses

The socio-economic implications of the coronavirus pandemic (COVID-19): A review

On Fast Multi-Shot Epidemic Interventions for Post Lock-Down Mitigation: Implications for Simple Covid-19 Models

The geography of COVID-19 spread in Italy and implications for the relaxation of confinement measures

COVID-19, an emerging coronavirus infection: advances and prospects in designing and developing vaccines, immunotherapeutics, and therapeutics

COVID-19 treatment by repurposing drugs until the vaccine is in sight

COVID-19: combining antiviral and anti-inflammatory treatments

Intermittent yet coordinated regional strategies can alleviate the COVID-19 epidemic: a network model of the Italian case

Suppression of a SARS-CoV-2 outbreak in the Italian municipality of Vo'

The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study