key: cord-0731913-3b95qvd5
authors: O'Brien, D. A.; Clements, C. F.
title: Early warning signals predict emergence of COVID-19 waves
date: 2021-06-26
journal: nan
DOI: 10.1101/2021.06.24.21259444
sha: 57e0ae87d187913e1dc6dc8283aad0f5b806fcd9
doc_id: 731913
cord_uid: 3b95qvd5

Early warning signals (EWSs) aim to predict changes in complex systems from phenomenological signals in time series data. These signals have recently been shown to precede the initial emergence of disease outbreaks, offering hope that policy makers can make predictive rather than reactive management decisions. Here, using daily COVID-19 case data in combination with a novel, sequential analysis, we show that composite EWSs consisting of variance, autocorrelation, and return rate not only pre-empt the initial emergence of COVID-19 in the UK by 14 to 29 days, but also the following wave six months later. Based on the data available on 9th June 2021, we also predict a high likelihood of a third wave. Our work suggests that in highly monitored disease time series such as COVID-19, EWSs offer the opportunity for policy makers to improve the accuracy of time-critical decisions based solely upon surveillance data.

Unfortunately, the causes of disease emergence and/or re-emergence often appear idiosyncratic [12], requiring the use of context-specific models [13, 14]. These models are powerful tools and have become keystones during the COVID-19 pandemic response, but are restricted by data availability [15, 16], a potential lack of transparency [17] and our mechanistic understanding of the system [18]. Because of these difficulties, it has alternatively been suggested that disease emergence be treated as a critical transition [19], where a forcing pressure, such as host movement or pathogen evolution, drives the system towards a threshold.
If emergence is considered in this manner, then a suite of alternative methods based upon the concept of critical slowing down (CSD) becomes applicable to the identification of transitions in disease and COVID-19 dynamics. Critical slowing down represents the compromised ability of a system to recover from perturbation as it approaches a threshold, at which a small perturbation in the state triggers a positive feedback loop and a system shift [20]. From this phenomenon, various mechanism-independent indicators based on summary statistics have been identified across a range of simulated [21, 22] and empirical [23, 24] studies. In disease systems specifically, CSD was established as tracking a transcritical transition in the effective reproductive number, R_eff [25], or the number of secondary infections an infectious individual causes. Below one, secondary infection is unlikely, whereas above one, transmission is common and an outbreak of sustained disease occurs. The period where R_eff increases towards one is represented by a region of 'stuttering'/cryptic transmission [26], during which CSD also increases [25, 27]. The 'Early Warning Signal' (EWS) summary statistics based upon CSD will therefore precede the rapid increase in cases at the beginning of a disease outbreak, before R_eff exceeds a value of one [28].

medRxiv preprint doi: https://doi.org/10.1101/2021.06.24.21259444; this version posted June 26, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Early warning signals have been shown to predict the emergence of diseases in both empirical and simulation studies [25, 29, 30].
Variance, autocorrelation at lag 1, and decay time all increased prior to malaria resurgence in Kenya [29] and before the initial emergence of COVID-19 in seven of the nine countries assessed [31]. However, a critical use of EWSs is to predict not only initial emergence but also successive re-emergences, and no method has yet been developed to achieve this. Thus, COVID-19 represents a unique opportunity to test the efficacy of EWSs in predicting multiple sequential outbreaks by assessing the wave-like dynamics expressed in most countries [32]. Here we test whether CSD-based EWSs can predict sequential transitions in COVID-19 daily case numbers, developing a novel methodology to detect and subset data into successive waves of infection. Using the suggested framework, we show evidence that EWSs can be identified prior to each of the two previous COVID-19 waves experienced by the United Kingdom and that they predict a third as of the data available on 9th June 2021. Our results provide suggestions on how to use EWSs in a management scenario, where decisions must be made as data are collected, rather than post hoc.

Daily COVID-19 case data were collected from the UK government's coronavirus data portal (https://coronavirus.data.gov.uk) and World Health Organisation's dashboard (https://covid19.who.int/info/), spanning 30th January 2020 to 9th June 2021. Uniquely, we analysed daily positive COVID-19 cases rather than the cumulative cases analysed by other studies [31], thus allowing us to attempt the prediction of sequential COVID-19 outbreaks. Case data have previously been criticised for inaccuracies resulting from incomplete testing
and cryptic cases [7], but we wanted to explore the viability of EWSs using the most universally collected and interpreted data type. The data were consequently analysed in their raw form with no detrending performed.

To assess the instantaneous increase and decline of COVID-19 cases and define 'waves', generalised additive mixed effect models (GAMMs) were fitted to daily cases using the R

Early Warning Signal Calculation

Each indicator was normalised by subtracting its expanding mean from its calculated value at time t before dividing by its expanding standard deviation [24]. A composite metric was then constructed by summing all individual indicator values calculated per time point. An EWS was considered present when the composite metric exceeded its expanding mean by 2σ [38]. The 2σ threshold was chosen based upon its equivalency to a 95% confidence interval and repeatedly favourable performance compared to other threshold levels [24, 39]. As the expanding mean of the indicator is the basis of assessment, a previous wave will often mask the appearance of a second (Supplementary Figure 1). Consequently, once a wave subsided, as assessed by GAMM first derivatives, the data were cut and the EWS assessment restarted, truncating the time series from the point of wave end. This resulted in a series of EWS assessments, each for a specific wave and independent from previous waves. Similarly, the expanding window approach is susceptible to false-positive signals at the start of assessment due to the short time series length and high variability when few data points are supplied to the algorithm.
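The normalisation, composite and 2σ rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of the sample standard deviation, the choice to include the current observation in each expanding window, and the indicator values are all assumptions made for illustration.

```python
from statistics import mean, stdev


def expanding_zscore(values):
    """Z-score each value against all observations up to and including it.

    Assumption: the expanding window includes the current time step and
    uses the sample standard deviation.
    """
    scores = []
    for t, value in enumerate(values):
        window = values[: t + 1]
        # A single point, or a flat window, has no usable spread.
        sd = stdev(window) if len(window) > 1 else 0.0
        scores.append((value - mean(window)) / sd if sd > 0 else 0.0)
    return scores


def composite_ews(indicators, threshold=2.0):
    """Sum the normalised indicators per time step, then flag steps where the
    composite exceeds its own expanding mean by `threshold` expanding SDs."""
    normalised = [expanding_zscore(series) for series in indicators.values()]
    composite = [sum(step) for step in zip(*normalised)]
    flags = [z > threshold for z in expanding_zscore(composite)]
    return composite, flags


# Hypothetical indicator values: a noisy baseline followed by a sharp rise.
baseline = [0.0, 1.0] * 9 + [0.0]
indicators = {
    "variance": baseline + [50.0],
    # Same shape on a different scale; normalisation makes the two comparable.
    "autocorrelation": [3 * v + 5 for v in baseline + [50.0]],
}
composite, flags = composite_ews(indicators)
print([t for t, flagged in enumerate(flags) if flagged])
```

Under these assumptions a window of fewer than six points can never exceed 2σ, because the largest attainable sample z-score for n points is (n - 1)/√n, which is one reason very short windows behave differently from longer ones.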
To mitigate these early false positives, a seven-time-step burn-in period (the upper duration of COVID-19 incubation) was introduced to 'train' the signals before metric strength was assessed.

To compare the robustness of indicators, a 'warning' was considered present under two scenarios: first, when the initial signal was identified, and second (to represent a conservative assessment), when signals were detected for seven consecutive time steps. Evidence suggests that a signal persisting for two time steps is sufficient to reduce the frequency of false positives [43], but as that work focussed on a system's recovery, a stricter persistence was assessed here. From the calculated indicators, we present both the individual indicator strengths over time and the difference between the time-of-first-detection and the onset of each wave estimated from the GAMM derivatives. We also identify the superior indicator or combination of indicators for specifically pre-empting COVID-19 waves.

Generalised additive mixed effect models predicted two significantly increasing regions in daily UK COVID-19 cases. These regions correspond to the onset of Waves 1 and 2, beginning on 17th March 2020 and 9th September 2020 respectively. From this prediction, two restarts of the EWS analysis were performed on 18th June 2020 and 1st April 2021 following the subsidence of each wave. All early warning signal (EWS) indicators, excluding return rate (rr), increased and exceeded the 2σ threshold at least once prior to the appearance of COVID-19 Waves 1 and 2 (Figure 1, Supplementary Table 1a). Time-of-first-detection was consistently two weeks or more ahead of the predicted wave onset, with Wave 2 being pre-empted earlier than Wave 1 (Supplementary Table 1).
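The two warning definitions from the methods (first signal, and seven consecutive signals) and the resulting lead time over wave onset can be sketched as follows. This is an illustrative sketch only: the function names, the treatment of the burn-in as a period in which signals are ignored, the convention that a sustained warning is raised on the day the run is completed, and all values in the example are assumptions, not results from the study.

```python
def first_warning(flags, run_length=1, burn_in=0):
    """Return the index at which `run_length` consecutive signals have been
    observed after the burn-in, or None if that never happens."""
    run = 0
    for t, flagged in enumerate(flags):
        if t < burn_in:
            continue  # assumption: signals inside the burn-in are ignored
        run = run + 1 if flagged else 0
        if run >= run_length:
            return t
    return None


def lead_time(detection, onset):
    """Time steps by which a warning precedes wave onset (negative if late)."""
    return None if detection is None else onset - detection


# Hypothetical daily signal record: isolated signals, then a sustained run.
flags = [False, True] + [False] * 6 + [True, False] + [True] * 8
onset = 40  # hypothetical index of wave onset from the GAMM derivatives

first = first_warning(flags, burn_in=7)                    # liberal definition
sustained = first_warning(flags, run_length=7, burn_in=7)  # conservative one
print(first, sustained, lead_time(first, onset), lead_time(sustained, onset))
# prints: 8 16 32 24
```

The conservative definition necessarily reports a later detection than the liberal one, so it trades some lead time for fewer false alarms.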
No third wave was strongly identified by the GAMM first derivatives, but the majority of indicators suggest the potential for an oncoming third wave (Figure 2a).

'Match' represents identical time-of-first-detection and time-of-wave-onset, 'miss' represents no warning despite the presence of a wave, 'post' is the identification of a warning after the time-of-wave-onset, 'prior' is the identification of a warning before the time-of-wave-onset and 'unknown' represents the identification of a warning with no apparent wave (i.e. potential for an oncoming wave). Indicators include: autocorrelation (acf), return rate (rr) and variance (SD).

Similar results for international COVID-19 cases were observed (Supplementary Figure 2), with variance remaining the most robust predictor of both emergence and re-emergence. Successive waves remain well predicted in many countries, although in nations where daily variation is high and the time between waves is short, EWSs become delayed and do not pre-empt the wave onset.

These results show that Early Warning Signals (EWSs) based on the theory of critical slowing down (CSD) are sufficient to predict the emergence, and subsequent re-emergence, of COVID-19 in the UK and abroad. This is strengthened when time series are subset into individual waves, and the timing of subsets is estimated without bias using a GAMM approach.
As all indicators increased prior to the onset of COVID-19 waves, there is agreement with the prediction that CSD is observable in the disease's dynamics [25], and CSD therefore represents a viable tool for policy makers to inform timely decisions. Because variance responded consistently under the strictest definition of a warning, it appears to be the most reliable indicator for pre-empting disease. This is congruent with previous disease [29, 44] and ecological [24, 41] findings. The weakness of autocorrelation was unexpected, considering those studies also identified autocorrelation as a reliable indicator, yet the robustness of the triple composite EWS regardless highlights the benefit of the composite metric approach developed by [38] over individual indicators. The degree of pre-emption reported here is particularly important because, statistically, the lag between transition and disease emergence is not fully understood [28], and because of the supposition that earlier non-pharmaceutical interventions (NPIs) may have dramatically reduced death rates early in the pandemic [10]. We therefore suggest that composite EWSs are detectable sufficiently prior to COVID-19 waves to be suitable for the current monitoring toolbox for this and potential future pandemics.

Although this study provides evidence for the usefulness of EWS indicators, their application is often hampered by the quality of data available [45]. Epidemiological data can be particularly problematic due to the reporting style of practitioners. Case data can be aggregated into weekly or monthly counts with the exact date of infection/recovery of an individual unknown, or, during the emergence of disease specifically, many cases will be cryptic due to a lack of testing or symptoms [7].
Whilst EWSs based upon disease incidence often display unique behaviour compared to the same indicators based upon disease prevalence, this study suggests EWSs can still detect critical transitions in daily data via an expanding window, a finding supported by other studies implementing the alternative rolling window approach [30, 46]. COVID-19 case data from the UK do, however, provide a best-case example for EWS assessment, due to the frequency of testing and defined periods of stationarity enforced by NPIs, but as many governments alter their approach towards future pandemics [47], this quality of data may become universal. Even with contemporary levels of reporting variation between countries, EWSs do consistently detect the onset of waves (Supplementary Figure 2), though the strength of pre-emption varies.

In conclusion, the results of this study support the use of composite CSD indicators during COVID-19 monitoring, and likely for other diseases with re-emergence dynamics. The pre-emption of wave onset by two weeks or more is encouraging for informing policy, namely as a 'preliminary analysis' indicative of a system at risk and requiring intervention consideration. The detection of EWSs could thus be followed by efforts to specifically identify the underlying variables of the system that are changing. Although we advocate the use of sequential EWS assessment for disease monitoring, it is harder to apply this approach in multi-dimensional systems where a stationary period is likely unidentifiable.
Ecosystems, for example, are constantly fluctuating in response to intrinsic or extrinsic drivers [48], and the minimum length of time series needed to definitively identify the system's trend is unclear [49]. Nonetheless, if periods of stationarity can be identified, we believe sequential assessment is necessary for EWS usage and prevention of bias from historic transitions.

Emerging virus diseases: can we ever expect the unexpected?
The challenge of emerging and re-emerging infectious diseases
2020 How will country-based mitigation measures influence the course of the COVID-19 epidemic?
and leading indicators of elimination
Early warnings of regime shifts: A whole-ecosystem experiment. Science 332, 1079
Including trait-based early warning signals helps predict population collapse
Detecting critical slowing down in high-dimensional epidemiological systems
Inference of R0 and transmission heterogeneity from the size distribution of stuttering chains
Forecasting infectious disease emergence subject to seasonal forcing
Waiting time to infectious disease emergence
2020 Early warning signals of malaria resurgence in Kericho
Prospects for detecting early warning signals in discrete event sequence data: Application to epidemiological incidence data
Anticipating the novel coronavirus disease (COVID-19) pandemic. Front. Public Heal
Conditions for a second wave of COVID-19 due to interactions between disease

We are thankful to the GW4+ FRESH Centre for Doctoral Training in Freshwater Biosciences and Sustainability for their support of this project. This project was funded by a NERC GW4+ FRESH CDT PhD studentship awarded to DOB (NE/R011524/1).