key: cord-247144-crmfwjvf authors: Bodova, Katarina; Kollar, Richard title: Emerging Polynomial Growth Trends in COVID-19 Pandemic Data and Their Reconciliation with Compartment Based Models date: 2020-05-14 journal: nan DOI: nan sha: doc_id: 247144 cord_uid: crmfwjvf We study the reported data from the COVID-19 pandemic outbreak in January - May 2020 in 119 countries. We observe that the time series of active cases in individual countries (the difference of the total number of confirmed infections and the sum of the total number of reported deaths and recovered cases) display a strong agreement with polynomial growth and at a later epidemic stage also with a combined polynomial growth with exponential decay. Our results are also formulated in terms of compartment type mathematical models of epidemics. Within these models the universal scaling characterizing the observed regime in an advanced epidemic stage can be interpreted as an algebraic decay of the relative reproduction number $R_0$ as $T_M/t$, where $T_M$ is a constant and $t$ is the duration of the epidemic outbreak. We show how our findings can be applied to improve predictions of the reported pandemic data and estimate some epidemic parameters. Note that although the model shows a good agreement with the reported data we do not make any claims about the real size of the pandemics as the relation of the observed reported data to the total number of infected in the population is still unknown. The coronavirus disease 2019 (COVID-19) pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is accompanied by an unprecedented challenge for its mathematical modeling. Most of the difficulties stem in an extremely high level of uncertainty in available data: • Methodology for reporting of all types of data (number of confirmed positive cases, hospitalizations, recovered individuals, and even confirmed deaths) is not systematic and varies from one country to another [1] . • Different types of tests and test protocols used for COVID-19 detection have their own limitations both in sensitivity and specificity; testing procedures differ in methodology of a sample selection and in a testing sample size in different countries. In addition, different types of tests detect different phases of individual infections and their results differ based on the clinical stage of the infection [2] . • A relation of the reported data to the real (unobserved) number of infected in a population is not understood and the estimates for the ratio of total cases in population to the number of observed cases vary even in the order of magnitude [3, 4, 5, 6, 7, 8, 9, 10] . The extent of uncertainty in data prohibits designing and validating mathematical models that would provide verifiable accurate description of dynamics of the pandemic. That in turn has serious consequences on pandemic spread control and efficient epidemiological decision making. Particular difficulty for mathematical modeling is that traditional compartment type models, often referred to as "SIR models" with Susceptible -Infected -Recovered compartments, that succeeded in an accurate description of previous epidemics, produce scenarios that do not match closely the observed data [5, 11] and fail to capture significant trends observed in data, see also [12, 13] for other early models. While it is still too early to call whether these models fail to capture the real dynamics of pandemics (as the observed data may not completely correspond to number of infected in the whole population) there is an urgent need to understand the available data and its relation to the SIR models. We consider very important to emphasize the limited scope of our analysis as in the current literature some of these limitations are blurred and may eventually lead to misinterpretations. To overcome the difficulties with mathematical modeling and a high degree of uncertainty in data we propose a simple descriptive model that captures the dynamics of the observed data rather than a detailed mechanistic model of the pandemic dynamics in the whole population. Therefore our results need to be interpreted with high precaution. We only present a systematic mathematical description of the observed reported data. We provide neither an explanation of the pandemics spreading, nor any claim about the real scope of the pandemics. However, if one conjectures that the observed reported data capture the extent of the pandemic size in the real population (e.g., if data systematically report a fixed percentage of the total infected population), then our results can be used to identify the nature of the emerging trends in pandemics spreading in individual countries. Furthermore, we only model the first wave of the pandemic and we do not make any predictions about the extent and timing of next waves, although we discuss how our method can be used to study effect of next epidemic waves. We are aware that we ignore multiple key factors that may influence the relation between the observed data and the real epidemic size (the level of detection of infected by testing, delays in test reporting, variation in clinical aspects of the infection, etc.). Note that in the text we use the term COVID-19 for all reported infections based on a positive PCR test and thus we do not distinguish between the presence of the virus in the respiratory tract of an infected and the disease it causes. Through the whole text we model the time series of the total number of active cases that is the difference of the reported total number of confirmed infections and the sum of the total number of reported deaths and recovered cases. We have identified a universal trend in the data reported by individual countries that helps to improve predictions for the final observed epidemic size and the related time scales. The universal trend is observed despite inhomogeneities and uncertainties in the data time series and various types and levels of mitigation policies applied. There is a transition in time series from an exponential growth (EG) to a polynomial growth (PG) and at a later stage to a combined polynomial growth with an exponential decay (PGED) in the number of active cases across almost all countries world-wide (with a sufficient size of current data). Our choice of the form of the trend in data is motivated by the theoretical results of [14, 15, 16, 17] and by the data analysis performed in [18, 19] that relate the observed transition to structural changes in the population contact networks. We analyze the reported data and estimate the parameters for individual countries. This allows us to categorize the countries into groups according to their advance through the first wave of the pandemics, and also to detect a possible divergence into a possible upcoming second wave in some countries. We rank a selected group of countries according to one of model parameters that captures their ability to identify, test, and isolate infected individuals from the rest of the population and thus prevent further spreading. We also provide a reconciliation of the PGED regime with the SIR model that allows to build PGED directly into existing SIR models despite the fact that PGED is inconsistent with the SIR models (in their traditional form). The PGED regime corresponds to an explicit algebraic decay of the relative reproduction number R 0 in time t in the form R 0 ≈ T M /t with a constant T M . The trend is in general agreement with the current estimates of the evolution of R 0 in many individual countries [20] ; see also [21] for a data supporting the observed dependence. A model based on PGED regime can also aid forecasting of the future dynamics of the observed data including an analysis of eventual next epidemic waves. The keystones of mathematical modeling of epidemic spread are the compartment based models also referred to as SIR models introduced by Kermack and McKendrick [22, 23, 24] , see also [25, 26] for an extensive literature overview. We only review basic properties of the SIR model relevant for our further analysis. The basic SIR model describes the dynamics of susceptible (S = S(t)), infected (I = I(t)), and recovered (R = R(t)) populations at time t by the coupled system of ordinary differential equations Here N is the total population, β > 0 is the infection rate, and γ > 0 is the removal rate of infections. The key assumptions behind this mechanistic model are • the population is well mixed, thus likelihood of a contact and transmission of the infection from any infected to any susceptible is the same, • the populations S, I and R are large enough to be well approximated by the real variable instead of integers. The basic model can be readily extended to account for an incubation period (SEIR model), for age structured and geographically structured populations, etc. Also, the assumption of the constant total population can be removed. However, the well mixed population assumption cannot be completely removed unless one considers SIR models on networks. In that case a good knowledge of the underlying contact network is necessary to calibrate the model to any type of data. The current extent of epidemic spread in the system is typically measured by the relative reproduction number R 0 = R 0 (t) that is related to the instantaneous rate of growth of the infected population. Equation 2 can be written as dI/dt = γ(R 0 − 1)I for Thus R 0 = 1 corresponds to the epidemic peak, the tipping point that characterizes the moment when the population of infected individuals is stationary, dI/dt = 0. 1 Note that dI/dt > 0 for R 0 > 1 and dI/dt < 0 for R 0 < 1 (for I > 0). For practical reasons related to the COVID-19 outbreak it is useful to rewrite the expression for R 0 as Here s = s(t) = S(t)/N is the proportion of the infected in the population at time t. The time scale T inf is associated with the removal of an average individual from the infected population, i.e., the typical length of the period during which the infected individual infects the susceptible population. It is related to γ by T inf = 1/γ. The infection rate β can be expressed as β = p/T sus , where T sus is the typical time scale associated with occurrence of interactions between susceptible and infected in the population where infection transfer may happen. The total number of transmission contacts between susceptible and infected population is given by sI/T sus . Finally, p is the probability of infection of a susceptible individual through a single random meeting with an infected. In the SIR model the number of active cases decreases due to a decrease of R 0 below 1. As β and T inf are constant it is achieved through a reduction of the proportion of the susceptible population s below the threshold 1/(βT inf ). That in general requires a significant decrease of the susceptible population through vaccination, gained natural immunity, infection (herd immunity) or through a long term quarantine of a large fraction of the susceptible population. However, the pandemic size apparently has not reached such a high level yet. Thus R 0 needs to be controlled through a decrease in β or T inf . The parameter β is an obvious candidate as strict public mitigation measures and social distancing decrease p and increase T sus . These measures are typically associated with a large economic cost and also parameters of this structural change on the level of the contact network are still unknown and their direct effect on R 0 cannot be accurately quantified. A decrease of the parameter T inf does not require widespread mitigation measures. Although ability to decrease T inf solely based on the observation of disease symptoms is limited from below by the length of the incubation period, T inf can be significantly reduced by active contact tracing. Until mitigation measures are applied we expect that an epidemic outbreak is governed by the SIR model 1-3. That implies an exponential growth (EG) of I = I(t) during the early stages of epidemic. For completeness we present the asymptotic behavior of (S, I, R) in 1-3 for t → 0 + . Let R 00 = R 0 (0) and (S(0), I(0), R(0)) = (S 0 , I 0 , 0). Then S(t) ≈ S 0 for t 1 and 2 reduces to dI/dt ≈ γ(R 00 − 1)I, i.e., Consequently from 3 and S + I + R = N we obtain that is consistent with 1. Note that EG was not observed in data in multiple countries (e.g., Slovakia, Lithuania) as these countries introduced mitigation measures at very early stages of the epidemic (zero or only a few confirmed cases of infection). After EG we observe a systematic transition to polynomial growth (PG) in data during early epidemic stages. We support our claim here by Fig. 1 that shows the number of active cases in selected countries during the pandemic outbreak (the data source [27] , reported data from May 5, 2020). The figure displays particular illustrative cases, see Section 7 for a systematic survey of all observed countries. For each country displayed we show the time series of active cases both on the semilogarithmic and on the double logarithmic scales. On the semilogarithmic plot we detect a divergence from the initial exponential trend while on the double log plot a polynomial growth (represented as a linear trend) is emerging after the initial EG. The basic SIR model is not consistent with a systematic approximate polynomial growth in the infected population. Otherwise, if I(t) ≈ At p , p > 0, on a interval t ∈ [t 1 , t 2 ], then dI/dt ≈ (p/t)I. Consequently from 2 From 1 it follows that However, the approximation 5 on the interval t ∈ [t 1 , t 2 ] of a significant length is inconsistent with the assumption p > 0. Note that modifications and extensions of SIR model can eventually agree with the observed polynomial growth phase in the infected population, however, we do not explore this question here. 5 Polynomial growth with exponential decay regime (PGED) During late epidemic stages in individual countries we systematically observe a transition to a universal scaling form (ansatz) for the number of active cases in all considered countries. In this phase the epidemic wave reaches its peak after which the number of active cases decays. The scaling has the form Here A, T G and α are the model parameters. The scaling 6 is a combination of a polynomial growth factor (t/T G ) α with an exponential decay exp(−t/T G ). Therefore we refer to 6 as polynomial growth with exponential decay (PGED). It was derived for the size of the infected population by Vazquez [14] who used a branching process to describe the epidemic dynamics in a population interacting on a scale-free contact network. Ziff & Ziff [18] use the PGED scaling on reported COVID-19 data and claim that public measures and social distancing enforced yield a fractal type contact network on which the epidemic transmission is strongly limited by the network topology. We note that the polynomial growth models in the literature discussed in Section 4.1 can be also adopted to account for the exponential decay factor as well by an inclusion of a constant rate of loss from the infected population (similarly as in [16] ). We use 6 to match the observed pandemic data, particularly the number of active infection cases as reported in [27] . Note again that no prefactors are used here so the model only describes the dynamics of the reported data. The function I = I(t) in 6 has a maximum at t = T M = αT G where it reaches the value P = A(α/e) −α T −1 G . Note that the inflection points of the function I = I(t) are located at T ± I = (α ± √ α) T G , particularly the time t = T − I plays an important role in the observed epidemic data as it corresponds to a moment at which the growth of the number of active cases reaches its maximum and starts to decrease. In the SIR model the ratio of the infected population at the inflection point (during the growth phase) and at the point of maximum is equal to 1/2. However, in 6 this ratio is given by α− √ α α α e √ α . This expression is equal to 1/2 for α . = 6.23. For many countries we observe α < 6.23 and thus the slowdown of the epidemiological curve (inflection point) occurs at (often at a significantly) smaller fraction of the population than predicted by the SIR model. An interpretation of the parameters A, α and T G is not completely straightforward and thus we reparametrize the model by a parameter combinations that correspond to naturally observed quantities. The equation 6 rewritten using the parameters P, T M and α has the form The model parameters in 7 were inferred by nonlinear least squares optimization in MATLAB c , see Tables 12 for the values obtained for individual countries. Polynomial and exponential decay factors in 6 motivate to use logarithmically rescaled data (log I(t) or both log I(t) and log t) in optimization. However, we use non-rescaled values instead as the PGED trend is present only in later phases of the epidemic, particularly after the implementation of mitigation measures and social distancing (with a delay for its manifestation in the reported data). Using non-rescaled data allows us to globally fit the whole data set as the early epidemic data has only a small weight in the optimization due to its relative magnitude unlike in the case of the rescaled data. The fit thus does not require any prior (or fitting) for the time of transition to PGED. 2 In general, fitting polynomial growth to data is very sensitive to a choice of the origin of the fitted time series. Therefore we have set the data starting point in all considered countries systematically. To eliminate the effect of stochasticity in small data we have disregarded all the data points in time series before the infected population in country reached a set threshold N 0 . For Italy we set the threshold to 200 while for all other countries the threshold was normalized -proportionally increased or decreased in agreement with the ratio of population size of the studied country to population of Italy (with minimum threshold set to 10 for countries with a small population). To eliminate obvious irregularities in daily reporting we smooth the studied data. The irregularities appear naturally as the testing procedures and protocols impose systematic nonuniformity: populations in large clusters are discovered simultaneously, there is a systematic delay in contacttracing testing, limited testing capacity, batch testing, etc. We use linear smoothing on the increments and decrements of the number of active cases via moving averages through seven days (weights: (1, 3, 6, 7, 6, 3, 1)/27) that corresponds to three iterations of local averaging of three consecutive days. 3 Polynomial growth is not consistent with the SIR model for an extented period. However, we consider very instructive and useful to reconcile the SIR models with the PGED regime as such a reconciliation would allow to build PGED directly into the SIR type models. Therefore we study whether the form (equivalent to 7) can solve a SIR type model. Equation 8 implies We can now compare 9 with 2 and identify Thus, the exponential decay e −t/T G term in 8 can be seen as a consequence of the infected removal rate γ = 1/T G in the SIR model. Furthermore, the term βS/N corresponds to α/t. Finally, we express this dependence in terms of the relative reproduction number R 0 : where T M = αT G = α/γ is a constant, see Section 5 for its interpretation as the time of the epidemic peak. Therefore PGED implies an algebraic decay of R 0 in the SIR model. Note that this result is in agreement with the argument about the reduction of infection transmissibility in [21] and also with the R 0 analysis in study [35] of the impact of non-pharmaceutical interventions in European countries. There the change of the reproduction number R 0 is modeled as an average percentage reduction per specific type of intervention and the analysis is based on a Bayesian approach using data on number of infected and number of deaths, which are more reliable. Also note that as t → 0 + the relative reproduction number diverges to ∞. This is in an agreement with the model of [15] where the fat-tail power-law distribution of R 0 in the population has initially an infinite mean that with the epidemic outbreak reduces to finite values. Average R 0 further reduces during the epidemic by a gradual elimination of the individuals with a high values of R 0 from the susceptible population -the individuals with high values of R 0 are easily infected and removed from the susceptible class as the first. Based on the presented analysis we expect three phases of a single epidemic wave. • Initial Exponential Phase. During the initial phase (with no applied mitigation measures) we expect the exponential growth of the infected population I = I(t) as discussed in Section 3: • Polynomial Growth Phase. After the initial phase we expect a short transient phase during which I = I(t) smoothly transitions from the initial exponential phase to the polynomial growth phase described in Section 4. During this phase • Final Polynomial Growth with Exponential Decay Phase. After an introduction of the mitigation measures and social distancing and a subsequent delay necessary for their appearance in the reported data we expect a transition of I = I(t) from the PG phase to PGED phase described in Section 5. During this phase Note that is some countries the mitigation measures were applied when the number of infected was low or zero (Slovakia, Lithuania) and the initial exponential phase was too short to be identified in the data. Also note that during the PG phase the function I = I(t) is convex (p > 1 for all observed countries). For such a function to reach its local maximum it must first go through an inflection point. Unless the inflection point appears during the transient phase between PG and PGED the PG phase must connect to the PGED phase before the PGED phase reaches its inflection point, i.e., before the time t = T − I . This phenomenon was also observed in data of all surveyed countries and it helps to improve predictions for countries that have not reached the PGED phase yet. We conducted a systematic survey of COVID-19 pandemic data ([27], the last reporting day May 9, 2020) for all countries where the time series are sufficiently long to display a consistent trend (in total 118 countries). In each country (together with all its territories) we consider the number of active cases equal to the total number of reported confirmed infections decreased by the sum of reported total number of recovered and deaths. For a characterization of the epidemic progression we use the following phases: the initial exponential phase (EG), the polynomial growth phase (PG), and the polynomial growth with exponential decay phase (PGED). The final PGED phase has two checkpoints, the inflection point (I) and the epidemic peak-the point of maximum of the active infected population (M), after which the number of infected consistently decreases (D). Our results are presented in Tables 1-3 that summarize the stage of the epidemic in all surveyed countries. If a country reached the PGED regime we report the estimated values of the related PGED parameters. See Section 5.1 for the methodology and remarks on exceptions made for some individual countries. We support our results in Fig. 1 -8 that display data from selected countries. The figures show total active cases data time series for countries that are organized by the epidemic phase. For each country presented we show the data on both linear and semilogarithmic plot (countries in EG and PG phase) and also on double logarithmic plot (countries in PGED phase). For the countries in the PGED regime we also show the best PGED fit to the data. 4 Country In Figure 1 we present two groups of countries in the early stages of the epidemic: the countries in the EG phase (Afghanistan, Bolivia, Colombia, El Salvador, India) and countries in the PG phase (Argentina, Indonesia, Qatar, Russia, Somalia). To demonstrate an evidence of EG and PG, respectively, we compare the recent data with a straight line in the respective plot. 7.4 Countries in the PGED phase close to and past the epidemic peak (05-May-2020) Selected countries close to and past the epidemic peak are displayed in Figure 4 and Figures 5-6, respectively. Consistent approximate exponential decay in the number of reported active cases in many countries may serve as a sign of a successful strategy against the further spread of the coronavirus. However, due to factors as abatement of the strict mitigation measures, fatigue of following social distancing, or simply due to reintroduction of the virus into the community a further spreading may occur. Such a trend that is demonstrated by a sudden slow down of the exponential decay (Austria, Australia, Vietnam) or even a sign of the next epidemic wave (Azerbaijan, Burkina Faso, Chile, Djibouti, Iran, Iraq, Jordan, Kyrgyzstan, Lebanon, Madagascar) also appears in the reported data, see selected countries in Figure 7 . For Singapore we only model its second epidemic wave. The parameter T G of PGED characterizes the typical time scale of removal of infected individuals, particularly those who would be eventually tested positive for SARS-CoV-2 infection by a PCR test, i.e., the time period during which an average infected individual can infect others (in susceptible population). A smaller value of T G correspond to a fast decay after a country reaches its epidemic peak while large values of T G indicate very flat epidemic peaks and thus a very slow gradual decay of the active cases. In practice, T G is influenced by multiple factors, among them ability to identify, test and quarantine positive cases in the population and contact-tracing procedures play a prominent role. While a larger and better selected infection testing sample can significantly decrease T G in countries with a higher degree of epidemic, in countries with a small number of active cases finding and testing a small number of infected in the whole population can be very difficult. Therefore contact-tracing and a prevention of an import of infections from other countries can be the key measures to lower the value of T G . See Fig. 10 for the graphical display of the sorted values of T G for countries that are at or beyond the epidemic peak. Countries classified as undergoing the second wave are not included in the plot as a presence of apparent second wave may be eventually a sign of spurious data. Note that the countries that are currently close to their epidemic peak have higher values of T G than countries that are already further in the decay phase (with the exception of Jamaica). The data suggest that countries with a small values of T G (Mauritius * , Iceland, Ireland, Australia, Austria, New Zealand) are very efficient in testing and isolation of the individuals who will be tested positive thus preventing them for further spreading of the infection. On the other hand, the data for countries with large values of T G (Slovenia, Norway, Uruguay) do not show an indication of an efficient testing and isolation of infected (or they may not properly report recovered cases data). However, this interpretation needs to be taken with a caution with regard to the note of precaution in Section 1.1. Particularly, the interpretation of T G in countries marked with * in Fig. 10 and Tab. 1-3 can be influenced by the special adjustment of the time series mentioned in previous sections. The simple PGED model, i.e., the universal scaling 7 and nonlinear fitting of the parameters from the data, can be used for as a predictive tool for the number of the reported active cases, particularly in countries in the growth phase. Once again keep in mind the note of precaution we formulated in Section 1.1. No verifiable connection of the number of active cases to the total number of infected in the population was established so far. Therefore all the predictions only concern the reported data. We present a performance of the predictive capabilities of the PGED model using the available data for eight selected countries (Belgium, Belarus, Czechia, Israel, Italy, Portugal, Switzerland, and US) that are in different epidemic phases and also display a variable accuracy of predictions. Testing was performed by comparing the values predicted by the PGED model (based on the incomplete data in which we have removed up to 15 data points from the end of the time series) to the withheld data. For each choice of the number of withheld days we have calculated the 95% confidence interval for the inferred parameters by MATLAB c function nlpredci.m. Particularly, we were interested in the confidence intervals for the location of the epidemic peak and the number of active cases at the peak. Predictive power of the model can be visualized by plotting the bounding boxes corresponding to the confidence interval around the inferred location of the epidemic peak. A good model should provide a consistent position of the confidence intervals with smaller boxes indicating a large degree of certainty in the predictions. Note that the analysis using the bounding boxes can be considered also a study of a sensitivity of the fit to the data. The countries in Fig. 8 are among those past the inflection point, at the epidemic peak or past the peak. In the latter case we withheld sufficiently many data points so that the estimates would be nontrivial. Overall, we have found that the short-term prediction (up to 1-2 weeks prior to the peak) tends to predict the location and value of the peak relatively well (Belgium, Portugal, Israel, Switzerland), however, the confidence intervals may be quite large in case not enough data beyond the inflection point is available (Belarus). When the inflection point (and the PGED regime) has not been reached yet, the information about an eventual exponential decay is not directly detectable in the data and the location and the value at the peak thus cannot be estimated, see the remark at the end of Section 6 for a discussion of a possible prediction improvement for countries in the PG phase, i.e., before they reach the PGED regime. We also illustrate two common situations: while for US, Italy and Portugal the prediction with less information underestimates the severity of the infection, for Czechia a forecast overestimates with less data. However, in both cases the fits were changing monotonically with the number of data points included in the analysis. Note that some of these trends may be due to variation in testing procedures and protocols. In Fig. 11 we also show how 95% confidence intervals can be calculated for the whole future data trajectory. This is not straightforward as the 95% confidence intervals for the estimated parameters are not independent. However, using the covariance structure of the inferred parameters it is possible to sample the parameters from the multivariate normal distribution and display the confidence intervals systematically for all times, as shown in the figure. The presented model can be also visualized using the web based tool [37]. It allows a general public to explore the data for various countries, including validation of the model predictions. Using the PGED model we have also successfully constructed a prediction on March 30, 2020, for the reported COVID-19 data in Slovakia that estimated a epidemic peak of about 1000 active cases in early May and that very closely matched the observed data, see the reference in the national media [38] . At that point in time the prediction differed by orders of magnitude from the predictions of compartment based models. Subsequently, the PGED model was incorporated into the main epidemic (SIR type) model in Slovakia maintained by the analytic unit of the Ministry of Health and that serves as a reference tool for the government crisis management team decision making during the COVID-19 outbreak in Slovakia [39]. The state of exponential decay of the infected population is often viewed by policy makers as the ultimate goal. However, without reaching the state of herd immunity the epidemiological situation is unstable with respect to secondary infections caused by rare infected individuals, new imported cases, and related superspreading events. We illustrate such a situation in the numerical example in Figure 9 . As an example we consider the reported data in Austria (over the period March 1 -April 15). For simplicity we match the data using the EG regime first (using the SIR model with inferred parameters β and γ). The simulation is initialized on March 1 (14 infected, 0 recovered, total population approx. 8.9 mil.). SIR dynamics is applied for the first 16 days reflecting the lack of measures in the early stages of the infection spread (note that the measures reflect in the reported data with a delay). After 16 days we match the rest of the data with PGED model (with inferred parameters P , α and T M and a continuous R 0 ; the values are similar to the parameters for Austria reported in Table 1 ). We then continue PGED model until the May 31 (90 days after the considered initial date). The number of recovered during PGED regime period is calculated from the 3 and the number of susceptible as the complement of infected and recovered in the population. The remaining population of infected individuals is estimated to be 20 on May 31. In the studied scenario we lift the mitigation measures completely on May 31. The dynamics then returns back to the standard SIR model and undergoes an EG phase. We study the impact of an early detection of the emerging situation (upcoming second wave) and consequent implementation of mitigation measures. We considered three alternatives: mitigation measures fully implemented after 1, 2, and 3 weeks (see shaded regions in Fig. 9 ). We observe that an early implementation of the mitigation measures dramatically reduces the next epidemic peak. A qualitatively similar progress can be seen in case of imported infections (we add 30 new infected cases at the time of released mitigation measures). The numerical results indicate how essential is to implement mitigation measures as early as possible, which requires efficient tools for an early detection of infected individuals. This example shows how the very simple PGED model can be used for analytics of the COVID-19 pandemic in individual countries. Reported data on COVID-19 display systematically identifiable regimes -exponential growth, polynomial growth, and polynomial growth with exponential decay. The observed universal scaling is a bit surprising as the pandemic mitigation and social distancing measures, the testing procedures and protocols, and many other aspects, vary significantly from one country to another. Nevertheless, the scaling appears to be a strong attractor of the reported active cases dynamics globally. An important feature of PG and PGED regime is that they both contribute to a slowdown of the epidemic growth in the reported data compared to expected dynamics driven by the SIR model. Note that we have only considered the active cases data but our preliminary data analysis confirms that the PG trends are present in the reported deaths and where available also in reported hospitalizations. Therefore we conjecture that the observed transition between the different regimes is comparable to phase transitions in physics, thus one expects that the universal scalings in data are a consequence of some unidentified fundamental properties related to the pandemic. Lack of the reliable and detailed data that would allow to discriminate between their eventual sources and a high complexity of the studied system that involves the virus/disease (its medical, chemical, and physical properties), behavior of individuals in population, and enforcement of the mitigation measures (see [2] for a summary of some related questions) do not allow to identify underlying factors for the observed PG and PGED regimes. Here we only list eventual candidates (or their combinations): • Significant changes in the effective contact network (social distancing and other mitigation measures) including low infection transmission probability in majority of contacts due to imposed safety measures (personal protection items as face masks, gloves, disinfectants, etc.). • Limitations of testing procedures and selection of the sample used for testing, including high level of uncertainty in test sensitivity (related to limit of detection and difficulties with sample collection) and specificity of all types of tests, over-and undersampling of various groups in testing, and failure to identify and test asymptomatic carriers; delays in test reporting. • Limited understanding of details of infection spread mechanisms, particularly the role of individual and temporal variation of viral load in infected individuals and their ability of infect others, related to various clinical stages of the disease; a lack of understanding of mechanisms of superspreading events. Note that the observed PG a PGED regimes are not in agreement with the traditional SIR type models that typically form a base for pandemic spread predictions published in the media unless their parameters are modified from their expected values, particularly a total population is decreased to a significantly lower effective total population. Therefore we conclude that although this work does not provide understanding of the full extent of the pandemic as it only models the reported data, it still may provide a useful source for decision making, for a comparison of different countries, or for economical predictions by governments, epidemiologists, and economists. Table 2 . The data is shown from the epidemic onset until May 5, 2020 [27]. Table 1 . The data is shown from the epidemic onset until May 5, 2020 [27]. Active cases Czechia (withheld data: 10) PGED fit Max. point Data Figure 8 : Predictions based on PGED model. For each country we remove the last n data points or last n data points before the epidemic peak (if already reached), 0 ≤ n ≤ D (see plot labels for the values of D) while always keeping the data points in black. PGED model parameters are inferred for each n. 95% confidence intervals for two PGED parameters (time and population of the epidemic peak) inferred by nonlinear regression are shown as bounding boxes around the mean in red. The presented data display: small uncertainty, small overlapping confidence intervals (A), large uncertainty, not enough data (B), monotonicity, additional data shifts the peak earlier (C), well predicted location of the peak and the data past (D), monotonicity, additional data shifts peak later (E, F, G), well predicted location of the peak but data past the peak not well captured (H).The data is shown from the epidemic onset to May 9, 2020 [27]. The inferred values of the parameter T G of PGED regime for countries close or beyond their epidemic peak (except those observing an apparent second epidemic wave). Lower T G corresponds to efficient identification, testing and isolation/removal of infected. Stars indicate modification of the data to account for data reporting irregularities, see the text. Figure 11 : Confidence regions for the all future times for selected countries. The data (black) are used for inference of the PGED parameters P , α and T M and their covariance matrix. The best fit is displayed (solid green line). Symmetric 95% confidence intervals obtained by sampling parameters from the mutivariate normal distribution with the same covariance structure are displayed as shaded regions. Methodological challenges of analysing COVID-19 daa during the pandemic The British Society for Immunology and The Academy of Medical Sciences (Akbar A ed.) 2020 COVID-19 Estimating the COVID-19 infection rate: Anatomy of an inference problem The Signicance of the Detection Ratio for Predictions on the Outcome of an Epidemic -A Message from Mathematical Modelers Preprints Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand Online report Coronavirus: Up to 70% of Germany COVID-19 Antibody Seroprevalence Estimating COVID-19 Antibody Seroprevalence in Suppression of COVID-19 outbreak in the municipality of Vo COVID-19 prevalence SORA institut IHME COVID-19 health service utilization forecasting team, Murray CJL 2020 Forecasting the impact of the first wave of the COVID-19 pandemic on hospital demand and deaths for the USA and European Economic Area countries medRxiv Early dynamics of transmission and control of COVID-19: a mathematical modelling study The Lancet Infectious Diseases This work has been supported by the Slovak Research and Development Agency under the contract no. APVV-18-0308 (RK) and by the Scientific Grant Agency of the Slovak Republic under the grants no. 1/0755/19 and 1/0521/20. The authors would like to thank Vlado Boža, Lukáš Poláček, Michal Burger and the modelling team of Institute of Health Policy for their useful comments and help. Particular thanks goes to Robert Ziff for an inspiration and Charlie Doering for pointing us in the right direction.