key: cord-284945-837qlk8y authors: Rahmandad, H.; Lim, T. Y.; Sterman, J. title: Estimating the global spread of COVID-19 date: 2020-06-26 journal: nan DOI: 10.1101/2020.06.24.20139451 sha: doc_id: 284945 cord_uid: 837qlk8y Limited and inconsistent testing and differences in age distribution, health care resources, social distancing, and policies have caused large variations in the extent and dynamics of the COVID-19 pandemic across nations, complicating the estimation of prevalence, the infection fatality rate (IFR), and other factors important to care providers and policymakers. Using data for all 84 countries with reliable testing data (spanning 4.75 billion people) we develop a dynamic epidemiological model integrating data on cases, deaths, excess mortality and other factors to estimate how asymptomatic transmission, disease acuity, hospitalization, and behavioral and policy responses to risk condition prevalence and IFR across nations and over time. For these nations we estimate IFR averages 0.68% (0.64%-0.7%). Cases and deaths through June 18, 2020 are estimated to be 11.8 and 1.48 times official reports, respectively, at 88.5 (85-95.3) million and 600 (586-622) thousand. Prevalence and IFR vary substantially, e.g., Ecuador (18%; 0.61%), Chile (15.5%; 0.57%), Mexico (8.8%; 0.69%), Iran (7.9%; 0.44%), USA (5.3%; 0.99%), UK (5.2%; 1.59%), Iceland (1.65%, 0.56%), New Zealand (0.1%, 0.64%), but all nations remain well below the level needed for herd immunity. By alerting the public earlier and reducing contacts, extensive testing when the pandemic was declared could have averted 35.3 (32.7-42.7) million cases and 197 (171-232) thousand deaths. However, future outcomes are less dependent on testing and more contingent on the willingness of communities and governments to reduce transmission. Absent breakthroughs in treatment or vaccination and with mildly improved responses we project 249 (186-586) million cases and 1.75 (1.40-3.67) million deaths in the 84 countries by Spring 2021. explanatory mechanisms: A) Differences in population density, lifestyle, and interaction patterns create variations in reproduction number. B) Risk perception, behavioral change and policy responses alter transmission rates endogenously. C) Testing is prioritized based on symptoms and risk factors, so detection rates depend on both testing rates and current prevalence. D) Limited hospital capacity is allocated based on symptoms, influencing fatality rates. E) Weather may have a role in transmission (7) while age, socio-economic status, and pre-existing conditions impact severity (8, 9) , with the poor and minorities suffering disproportionately (10) . Prior research has addressed different parts of this puzzle, including estimating the infection fatality rate (IFR) in well-controlled settings (11) or using statistical extrapolation (12) , assessing the asymptomatic fraction in smaller populations or through random testing (13, 14) , and estimating prevalence using random antibody testing in highly affected cities and regions (15) . Yet we lack a global view of the pandemic that is both consistent with these more focused findings and simultaneously explains the large variance in official cases and fatality across different countries. In this paper we build and estimate a multi-country model of the COVID-19 pandemic at global scale. Our model captures transmission dynamics for the disease, as well as how, at the country level, transmission rates vary in response to risk perception and weather, testing rates condition infection and death data, and fatality rates depend on demographics and hospitalization. Estimating this model using data from every country with more than 1000 cases for which testing data is available, we infer some of the key characteristics of COVID-19 pandemic globally and inform how alternative control policies may have played out across the globe. We use a multi-country modified SEIR model to simultaneously estimate the transmission of COVID-19 in 84 countries. The model tracks community transmission, excluding the global travel network and instead separately estimating the date of introduction of patient zero for each country. Within each country, the core of the model tracks the population through susceptible, pre-symptomatic, infected pre-testing, infected post-testing, and recovered states. Basic transmission dynamics reflect a classical SEIR process, with a baseline transmission rate, incubation period, and duration of infection. Building on this basic structure the model explicitly represents the dynamics of COVID testing, including demand for tests, sensitivity of PCR based testing, and allocation of test capacity between positive and negative cases, as well as the dynamics of hospital system capacity, both with concomitant effects on transmission and mortality. It also endogenously generates population-level behavioral and policy responses due to the perceived risks and government intervention. infected people and those with other risk factors (e.g. hospital workers) or similar symptoms due to other illnesses. We assume, based on prior research (16) , that infected individuals are potentially tested on average 5 days after the onset of symptoms. On any given day, individuals with more severe symptoms get priority for testing. Following (17) , this prioritization exerts a selection effect that results in different average severity of COVID in the tested vs. untested infected populations. We also assume a 30% false negative rate for PCR-based COVID tests (4, 5) . A positive test results in reduced contact rates for tested individuals, changing the transmission rates endogenously. Thus the model tracks various mechanisms generating the observable measures: a daily reported infection rate that combines the total positive results from each day's tests with daily post-mortem infection confirmations, and a daily reported death rate of confirmed cases. Unobserved parameters, such as IFR and asymptomatic fraction, can then be estimated by fitting the model against data for observables. We estimate the effective hospital capacity for each country as a function of total hospital beds available, modified for population density. This modification approximates the effect of geographic heterogeneity in demand for hospital capacity -in larger and less densely populated countries, geographic heterogeneity in the progression of the epidemic means that some areas (e.g. New York city) may be overwhelmed with demand even when other parts of the country have unused hospital capacity. As with testing, hospital capacity is allocated between more severe COVID cases (both those tested and those not tested) and demand from non-COVID patients. Hospitalization can reduce the mortality rate for COVID patients, while other determinants of fatality rates are the severity of symptoms and the age distribution of the population (8, 9) . Each country also exhibits a response to the perceived risk of COVID. The response is endogenously driven by a weighted combination of reported and actual hazard of infection. Perceived risk drives reduction in contact rate and hence transmission rates through non-pharmaceutical interventions such as social distancing measures. Specific government policies are not explicitly represented, but treated as part of this endogenous risk response. Different countries could vary in their risk perception, response time, as well as the magnitude of their response based on the perceived risk. Overall each country is represented by a system of ordinary differential equations tracking seven population stocks (Susceptible, Pre-symptomatic Infected, Infected Pre-testing, and four groups of Infected post-testing -positively tested or not × hospitalized or not) and a state variable for perceived risk. The model also includes multiple additional state variables for measurement and accounting of e.g. deaths and recoveries. Model parameters are specified based on prior literature and formal estimation. Main parameters adopted from the literature include incubation period (5 days (8, 16) ), onset-to-detection delay (5 days (16) ), sensitivity of PCR-based testing (70% (4, 5) ), and post-detection illness duration (10 days (8) ). We estimate the remaining parameters using a maximum likelihood approach. Using testing rate time series and various country-level data points (e.g. population, hospital capacity, comorbidities, age distribution), the model endogenously simulates confirmed new daily cases and deaths over time and matches them against observed data by maximizing the likelihood of observing those data given the model parameters. We use a negative binomial likelihood function to accommodate excess dispersion in infection and death flows compared to more common Gaussian specifications. Where data on mean) are 2.2% (9.4% for median across countries, Med) and 8.4% (Med: 10.2%) for cumulative infections and deaths, respectively. R-square values exceed 0.9 for 56% of 336 country-specific cumulative and incidence time series. Our projections are also consistent with excess mortality data, where available (see appendix S4). Main findings include estimated parameters of COVID-19 epidemics across countries, as well as trajectories of key measures. For estimated parameters we report a) the mean of most likely values across countries (Mean); b) the standard deviation of those values (Std), which quantifies variability of the underlying concept across countries; and c) the mean of country-level interquartile ranges (MIQR), which captures the inherent uncertainty in parameter estimates for each country. The model brings together infection, testing, risk perception, behavioral and policy responses, hospitalization, and fatality using data from across the globe. Here we focus on three main results. First, the magnitude of epidemic is widely under-reported with much variation globally. We estimate the asymptomatic fraction to be 55% (Mean; Std: 0.6%; MIQR: 3.7%). Combined with a sensitivity of 70% (4, 5) for PCR tests, the fraction of infections that could potentially be identified without mass testing is limited to 32%. Most countries do not reach this upper bound, however, and testing is the binding constraint. Figure 2 -A reports the ratio of estimated to reported infections and deaths, ranging between 2.4 and 157 for infections and 0.65 to 27 for deaths. In our sample of 84 countries (total population of 4.675 billion), we estimate total infections at 88.5 (85-95.3, 95% Credible Interval (CI)) million and 600 (CI: 586-622) thousands by June 18 2020, 11.8 and 1.48 times larger than reported numbers respectively. Under-counting has been larger earlier in the epidemic. For example, USA faced peak infection rate (of 395 thousands per day) when detection rate was less than 10% of daily new cases. The most affected countries (based on percent infected) to date include Ecuador (18%), Peru (16.6%), Chile (15.5%), Mexico (8.8%), Iran (7.9%), Qatar (7.3%), Spain (7.1%), USA (5.3%), UK (5.2%), and Netherlands (4.8%) (See Figure 1-B) . Despite the order-of-magnitude under-reporting of cases, however, herd immunity due to infection remains far from reach (Figure 2-B) . The closest a A B . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint country has come to herd immunity is Chile, which at its peak infection rate was 194 (CI: days away from infection of 80% of its total population. Second, we quantify both significant heterogeneity, and notable consistency, across countries in terms of basic parameters of the epidemic. The estimated initial reproduction number RE ranges between 1.24 (Indonesia) and 10.28 (Iran), with the median of 4.5 (Iceland) (RE estimates include significant uncertainty; See Figure S6 for details). Even absent endogenous reductions in contact due to realized risk, these initial rates may be higher than longer-term levels for most countries because of early exposures happening in denser social networks. Cross-country variations also reflect differences in population density, cultural practices, hygiene, as well as the timing of infection and thus early preparedness. We find support for the impact of weather on reproduction number, with the multiplier (RW) developed by Xu and colleagues (7) explaining transmission rates closely (RW 0.76 ). We estimate the global infection fatality rate (IFR) to have been 0.68% (CI: 0.64%-0.70%), with a wide range (Figure 2 -C) between 0.14% (CI: 0.12-0.16; Qatar) and 2.08% (CI: 1.73-2.28; Italy). The variations in IFR are primarily due to demographics with elderly having much higher risks (9) . Hospital availability and treatment effectiveness provide a second mechanism affecting fatality rates. We find that hospitalization can bring down risk of death to 39.5% (Std: 17.8%; MIQR: 7.9%) of baseline. The model explains much variability in IFR with minimum country-level variability in basic fatality parameters. Nevertheless, the quality of fit deteriorates in a handful of countries. For example, a ratio of simulated to reported deaths below one in a handful of countries signals that the model expected fewer deaths than observed. Notably, we find it hard to reconcile, using only the model's mechanisms, the low fatality rates in Qatar, Singapore, and Thailand relative to infection statistics, and high fatality rates in Belgium and France. Third, testing regulates transmission dynamics though a few mechanisms. By providing early warning, testing sets in motion the behavioral and policy responses essential to reducing transmission rate and controlling the epidemic. The exponential nature of contagion amplifies small early differences, such that a few days' difference in response time can lead to large differences in peak infection and final epidemic size. Behavioral and policy responses to changes in infection risk are adopted rapidly, on average reducing effective contact (and thus transmission) rates in 13.5 (std: 7.1; MIQR: 3.8) days. Those responses are relaxed when risks attenuate, though down-regulating risk perception is slower (mean: 70.0; std: 31.2; MIQR: 49.2 days). Given limited data on rebounds in infections to date, how quickly perceived risk is downgraded is a major source of uncertainty in projecting the future of the pandemic. We also find that those with a positive test reduce their infectious contacts to a fraction of the original level (Mean: 11.8% of original contact rate; Std: 2.3%; MIQR: 11.0%). These reductions are especially important because asymptomatic individuals are estimated to be less infectious than symptomatic ones (Mean: 28%; Std: 0.68%; MIQR: 2.4%). Finally, by regulating infection rates through the previous two channels, testing also 'flattens the curve', allowing a larger fraction of those in need of medical care to be hospitalized. Together, these mechanisms create an early race between testing and the spread of infection ( Figure 3 ). When infection is ahead, detection is compromised (moving up into the darker shades in figure) , responses lag further, and exponential infection growth can overwhelm hospital capacity. High early test capacity and its rapid expansion (moving to the right in the figure), brings the epidemic to light, activates policy responses, and allow for control. 4 (Panels A and B) shows one alternative testing scenario and counter-factual trajectories of pandemic to date. In this scenario, all countries shift their testing rates from baseline to 0.1% of population per day (a rate currently achieved by multiple countries, e.g. USA, Australia, in Figure 3 ). The shift happens on March 11, 2020 (the date COVID-19 was officially designated as a pandemic). Such enhanced testing would have reduced total cases from 88.5 millions to 53.2 (CI: 50.5-59.6) millions. Corresponding reduction in deaths would have been from 600k to 403k(CI: 377k-438k). By making additional assumptions on future testing and responses, the model can inform future trajectories. We explore a few projections out to spring 2021 that exclude vaccine and treatment availability. Figure 4 (panels C and D) shows projections under three scenarios: I) using the current country-specific testing rates and response functions moving forward; II) if enhanced testing (of 0.1% per day) is adopted on July 1 st 2020p; and III) if sensitivity of contact rate to perceived risk is set to 8 (approx. 70 th percentile of estimated value across countries), leaving testing at current levels. Contrary to the impacts of early differences in testing ( Figure 4-A) , the reductions in future cases from additional testing (II) are rather modest (from 1.55 in I to 1.37 billions in II), because recognition of epidemic is no longer the bottleneck to response. Both these scenarios project a very large burden of new cases in the fall 2020, with hundreds of millions of cases concentrated in a few countries estimated A B C D . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06. 24.20139451 doi: medRxiv preprint to have insufficient responses given perceived risks (primarily India, but also Bangladesh, Pakistan, and USA). In contrast, changes in response policies would make a major difference. Scenario III brings down future cases sharply, to as low as 153 (CI: 147-170) millions (cumulatively) by the end of winter. Across scenarios, we find that the infection rate in each country usually peaks and then stabilizes at a lower and slowly declining linear rate, in some cases after damped oscillations (i.e. second waves). These (approximately) steady-state infection rates are at levels that motivate just enough policy and behavioral response to bring effective reproduction number RE to 1. Faster rates lead to exponential growth, raise alarms and bring down contacts; slower ones lead to relaxation of policies bringing the reproduction number back up to 1. Those post-peak infection rates vary widely across countries and depend on risk perception parameters, time constants, and contact sensitivity to risk, as well as weather and susceptible population size. In this scenario we shift both testing and contact sensitivity to perceived risks to arguably more realistic values between current ones and those in scenarios II and III in Figure 4 -C. Specifically, on July 1 st 2020 we shift test rates to a value 20% between the current rates and 0.1% per day (which increases testing in most countries and reduces it in a few). Similarly, sensitivity of contacts to perceived risk is shifted to 20% of the way between estimated country-level values and a high value of 8. Resulting infection and death rates at the end of this period reflect the post-peak burden of the disease discussed above and are estimated at 0.01% (CI: 0.005-0.045) and 7.9e-5% (CI: 4.4e . Appendix S4 provides projections for all countries. Note that projections are highly sensitive to assumed testing, behavioral, and policy responses in the scenario. As such they should be interpreted as indicators of potential risk and not precise predictions of future cases; more rigorous testing and reductions in contacts in response to risk perception will significantly reduce future cases while laxer response and normalization of risks can lead to overwhelming breakouts. To enable empirical estimation and considering limits to data availability at a global scale, the model includes important simplifications and uncertainties that should temper the interpretation of the results. First, by analyzing an aggregate, country-level model, we abstract away much heterogeneity, from social networks to super-spreaders and local events. Moreover, we offer no explanation for the underlying heterogeneity in reproduction numbers across countries. Second, lacking hospitalization data across the world, the model's hospital sector includes additional uncertainty. Similarly, our estimated behavioral and policy response functions do not tease apart the impact of various determinants such as national and local policies, business and event closures, use of personal protective equipment, reductions in physical interactions, and a host of other contributors. Thus, results are not informative about the effectiveness of specific policies, and extrapolation of response function in future scenarios includes unknown uncertainties. For example, if normalization of risk reduced response magnitude in the second wave, our projections would miss that and prove optimistic. Finally, absent explicit travel network our results would under-estimate the risk of the reintroduction of the disease in locations that have contained the epidemic. This paper provides a systemic view of the COVID-19 pandemic globally. By incorporating testing capacity and prioritization, hospitalization, and risk perception and responses into an epidemiological model, we are able to closely match widely varying country-specific trajectories of the epidemic. Our model explains the observed heterogeneities primarily through three pathways. First, estimates point to much variance in initial reproduction numbers across countries. This heterogeneity is not surprising given that reproduction number is very much a function of human behaviors and interaction is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 26, 2020. . environments. Nevertheless, our study was not designed to tease out the sources of that heterogeneity, an important task that requires other types of data and analysis. Second, the course of the epidemic is also significantly impacted by risk perception and behavioral and policy responses that vary notably across different countries. Finally, we show that differences in testing rates play an important role in shaping those trajectories. Despite those variations, we also find much that is consistent across the globe: a) We estimate that just over half of the infections are asymptomatic, consistent with detailed estimates from smaller samples (1, 13, 14) . b) We estimate that asymptomatic individuals are about a third as infective as symptomatic patients. c) In our model, hospitalization brings down fatality substantially, highlighting the crucial importance of keeping cases below hospital capacity. d) We find an average global infection fatality rate close to 0.68%. This is consistent with growing evidence on IFR (11, 28) . Our estimates for IFR change predictably with age, and require few other predictors and no country-specific factors to stay consistent with the data. e) An order of magnitude under-reporting of cases is the norm across most countries, with a few percent of population already infected across many nations; nevertheless, herd immunity remains distant. f) Finally, absent notable improvements in country level responses or breakthroughs in vaccination or treatment, the outlook for the epidemic remains grim, with most nations settling into a steady state of cases and deaths that, while below their peaks, are troublingly large. 20 . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 26, 2020. The model simulates the evolution of COVID-19 epidemic, risk perception and response, testing, hospitalization, and fatality at the level of a country. Here we explain key equations and structures in each sector, followed by complete listing of model equations and parameters in S5. The model is a derivative of the well-known SEIR (Susceptible, Exposed, Infectious, Recovered) framework for simulating infection dynamics. Figure S5 provides an overview of key population groups and the population movements among them 1 . . We assume demand for testing and hospitalization are driven by symptoms, so all asymptomatic patients will be in the latter category. 1 In the equations below we use short-hands to simplify mathematical notations. The full model documentation uses full variable names. Table S1 provides the mapping between the short-hands and the full names, as well as the sources and equations for those variables and parameters discussed below. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint From these Infectious categories, resolution flows (r…) take individuals to either Recovered (R…) or Dead (D…) states, with corresponding subscripts U, C, CH, and UH for stocks and UU, UHCH etc. for flows. Given the differences in severity and potential survival extension due to hospitalization, we distinguish between resolution delay for those in hospital (Hospitalized Resolution Time; τH) and those not hospitalized (Post-Detection Phase Resolution Time; τR). We use first order exponential delays for all lags, though sensitivity analyses showed very little impact of using higher order delays. The Infection Rate (rSP) controls the flow from S to P and depends on Infectious Contacts (CI), fraction of total Population (N) that is susceptible, and Weather Effect on Transmission (W). The latter is a function of RW, the country-level projections for impact of weather on COVID-19 transmission risk year-round developed by Xu and colleagues (7) and a parameter, Sensitivity to Weather (sW), to be estimated: Infectious contacts depend on the Reference Force of Infection (β), various infectious sub-populations (and their relative transmission rates; ma for asymptomatic and mT for confirmed), and Contacts Relative to Normal (FC), which captures behavioral and policy responses as a fractional multiplier to baseline infectious contacts: In this equation we separate various stocks (of I and P) into asymptomatic (a superscript) and symptomatic (s superscript). That distinction is treated analytically using a zero-inflated Poisson distribution that is discussed in the next section. In light of evidence on the short serial interval for COVID-19, likely below the incubation period (29, 30), we do not distinguish the infectivity of pre-symptomatic individuals from those post onset. Contagion dynamics start from Patient Zero Arrival Time, T0, another estimated parameter. The key mechanisms regulating the population flows among these stocks are discussed below, and a schematic of important relationships is provided in Figure S6 . Five parameters are estimated in the equations discussed above. One of them (sW) is global (i.e. assumed identical across countries; see the estimation section below for details on the distinction between global and country-specific parameters) and the remaining four are country-specific: β, mT, ma, and T0. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. COVID-19 infection varies in acuity, from asymptomatic to life-threatening. Acuity of disease affects notonly baseline fatality risk, but also testing and hospitalization decisions which impact official records of infection and fatality rates. Since movement between population groups via testing or hospitalization is itself a function of acuity, to allow for consistent inference of mean acuity across different population groups, we use an analytical framework to track acuity levels. This framework, which we adapted from prior research (17) , obviates the need to disaggregate the population by different acuity levels (which would prohibitively raise the computational costs for estimation). Specifically, we represent acuity using a zero-inflated Poisson distribution. This distribution combines two subpopulations -one with Poisson-distributed acuity levels with mean Covid Acuity (αC), and another Additional Asymptomatic Fraction with zero acuity, which is the zero-inflated component. The sum of those with zero acuity from the Poisson part of the population and the second group is the Total Asymptomatic Fraction (a). We assume this asymptomatic group is not given priority in testing or hospitalization, and is not at risk of death. Thus they will always follow the → → → → pathway. The pathways for the remaining population depend on acuity and its impacts on testing, hospitalization, and death. Note that the concept of acuity defined here only needs to have a monotonic relationship with tangible symptoms and risk factors and it does not have a one-to-one relationship with any real world measure of acuity, and as such is better seen as a mathematical construct that informs modeling rather than a real-world variable with clinical definition. From this framework two parameters, a and αC, are estimated as country specific parameters with limited variability across countries. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint The testing sector reads the Active Test Rate (Tt) for each country as exogenous input data (see appendix S3 for pre-processing details for this data). A fraction of this total test rate, typically small, is allocated to postmortem testing of COVID-19 victims who have not been previously confirmed (Post Mortem Tests Total, TPM). Specifically, of the deaths of unconfirmed infectious individuals (whether hospitalized or not), a certain Fraction of Fatalities Screened Post Mortem (nPM) will be identified true post-mortem tests. We anchor the nPM to Fraction Covid Death In Hospitals Previously Tested (nDCH). The rationale for this anchoring is that on the margin if there are many unidentified COVID patients in hospitals, the chances are that the system lacks enough testing capacity and thus post-mortem testing should also be less thorough: We experimented other functional forms with a free parameter connecting the two constructs, but following our conservative estimation principle decided against including that free parameter in the final model. We feared that absent clear observables to identify this additional parameter (e.g. on country-specific policies regulating post-mortem testing) the degree of freedom would improve the fit but potentially for the wrong reason. The Where the Multiplier Recent Infections to Test (mIT), captures the sensitivity of negative test demand to recent infection reports. To allocate the available tests (TNet) between these two sources of demand, we use an analytical logic that allocates testing based on acuity of symptoms. The basic idea is that through self selection and screening by testing centers, people who have more symptoms or other signals that correlate with COVID infection (e.g. high exposure risk) are more likely to be tested. We assume each unit of acuity increases the likelihood that an individual gets tested, based on a variable Prob Missing Symptom, pMS. This variable represents the probability that each acuity unit fails to convince the testing decision process to test a given individual, i.e. how selectively and sparingly tests are conducted. Specifically, in this model an individual with k acuity units is tested with probability: We assume the negative test demand is coming from a population with a Poisson-distributed, unit average acuity level (αN=1) for symptoms of non-COVID influenza-like illnesses. The test demand from COVID . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint patients also comes from a Poisson distribution of acuity, but with mean αC. With the Poisson distribution and given a level of α and pMS, one can calculate the fraction of each demand source that would be tested: We therefore need to find the pMS that allows test supply to match demand that is satisfied, specifically, by solving the following equation for pMS*: Having solved for p*MS (numerically), we analytically calculate the average acuity level for those positively tested (αCP: Average Acuity of Positively Tested ) and those either not tested or having received a false negative result (αCN). Specifically, if test sensitivity was 100%, the average acuity for those not tested would be: The acuity level for those tested could then be found based on the conservation of total acuity across those positively tested and those not. Starting with this basic specification we further account for the Sensitivity of Covid Test (sT) to calculate the values of αCP and αCN. We parametrize sensitivity at 70%, which is the estimated sensitivity for the PCR-based tests used as the primary diagnosis method of current infections of COVID-19 (4, 5) . Overall, the testing rates that are determined by solving for * , combined with sensitivity of tests, inform the fraction of COVID positive individuals transitioning from pre-detection (IP) to confirmed vs. unconfirmed states (IC or ICH vs. IU or IUH), while the calculated α values inform the likelihood of hospitalization and fatality rates, as discussed next. The testing sector includes the following two country level parameters that are estimated: nST, mIT. The hospitalization sector of the model starts with each country's Nominal Hospital Capacity (hN) in total hospital beds. In practice, geographic variation in hospital density and demand creates imperfect matching of available beds with cases of COVID-19 at any point in time, e.g. because some potential capacity is physically distant from current COVID hotspots. This imperfect matching means some of the nominal hospital capacity is effectively unavailable at any time, especially in larger, less densely populated countries. We therefore calculate Effective Hospital Capacity (hE) by considering geographic density of hospital beds (Bed per Square Kilometer; dH): Where the * represents a large Reference Hospital Density of 6.06 beds per km 2 (which is the value of for South Korea). The parameter sDH (Impact of Population Density on Hospital Availability) is estimated. Effective capacity is allocated between Potential Hospital Demand (HCD) from COVID-19 cases and the regular demand for hospital beds from all other conditions (which we assume equals pre-pandemic effective hospital capacity). We assume that COVID-19 patients will have higher priority for hospitalization compared to regular demand. Specifically, we assume that fraction of regular demand allocated (mHR) would be the . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint square of that for COVID demand (mHC), = 2 , and solve the resulting hospital capacity allocation problem analytically: We determine the COVID demand for hospitalization based on a screening process similar to that for testing. Two types of COVID patients may seek hospitalization: those with confirmed test results and those without. The former are more likely to seek hospital treatment. We first calculate a parameter analogous to pMS in the testing sector that informs the demand from confirmed COVID patients for hospitalization. This parameter, the PMAS Confirmed for Hospital Demand (pMHC) is determined based on acuity level of confirmed (αCT) and Reference COVID Hospitalization Fraction Confirmed (rH), an estimated parameter capturing the overall need for hospitalization among COVID patients: For unconfirmed COVID patients we scale the analogue of this parameter (pMHU) based on how much priority non-COVID patients generally receive: This formulation ensures that: 1) Confirmed COVID patients are more likely to be hospitalized, but also that 2) if there is ample hospital capacity (mHR~1), then confirmed and unconfirmed COVID patients will receive similar priority for the same level of acuity. In short, the pM. values determine hospital demand by confirmed and unconfirmed COVID patients, which add up to HCD. The latter determines the fraction of hospital demand that is met. Analogous to the testing sector, this fraction along with demand determines the flow of individuals from the pre-detection (IP) state to hospitalized vs. non-hospitalized states (ICH or IUH vs. IC or IU). Matching demand to allocated capacity also allows us to calculate the realized Probability of Missing Acuity Signal at Hospitals (p*M) for confirmed and unconfirmed patients. As in the testing sector, those probabilities let us approximate for the expected acuity levels for COVID patients in and out of hospital, as well as tested vs. not-tested, i.e. αCT, αCH, αU, and αUH. These average acuity levels in turn inform fatality rates for each group. The hospital sector includes two country level estimated parameter with limited variation across countries: sDH and rH. For patients in each of the U, C, CH, and UH groups we specify the Infection Fatality Rate (f), as: The parameter Base Fatality Rate for Unit Acuity (fb) sets the baseline for fatality rate. Sensitivity of Fatality Rate to Acuity (sf) determines how fatality changes with estimated acuity levels; more severe cases are expected to have higher fatality rates. Hospitalization reduces fatality rates, expressed as the relative Impact of Treatment on Fatality Rate (sHF). The function incorporates the impact of age distribution on fatality rates. For age effect, we calculate a risk factor for each country based on its age distribution and the relative fatality risks for COVID patients in different ages documented in prior work (9) . We normalize this factor . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint against its value for China where the original study was conducted. Given the well-established impact of age on fatality, this factor is directly multiplied into the infection fatality equations. Overall, the fatality sector includes three parameters that are estimated at the country level, with limited variance across countries, those are: , , and . Note on comorbidities and fatality: We also explored including three comorbidities but found the estimates unreliable and therefore they are not included in the main specification of the model. Those comorbidities include obesity, chronic disease, and liver disease. The effects we explored for each were: (.) = (.) (.) , where we used the following country-level indicators from the World Health Organization (23), normalized by the average across all countries (d(.)): For obesity: Prevalence of obesity among adults, BMI ≥30 (age-standardized estimate) (%) For chronic health issues: Probability (%) of dying between age 30 and exact age 70 from any of cardiovascular disease, cancer, diabetes, or chronic respiratory disease For liver disease: Liver cirrhosis, age-standardized death rates (15+), per 100,000 population In equation 3 we noted that Contacts Relative to Normal (FC) regulates infection rates. This factor ranges between a minimum (Min Contact Fraction; cMin) and 1 as a function of the relative Utility from Normal Activities (UN) compared to Utility from Limited Activities (UL) in light of additional risks associated with normal activity patterns: The utilities from normal and limited activities are specified as . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint Overall, the risk perception and response sector includes the following five country-specific parameters that are estimated: cMin, τRU, τRD, λ, and wR. Table S1 summarizes the main equations discussed in appendix S1, providing the mapping between full variable names and the short forms. It also includes all estimated model parameters, as well as those specified based on prior research. Table S1 -Mapping between full variable names and their short form for the subset of variables and parameters discussed in S1. Also included are equations explained above and sources for other variables. is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. . . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint The model we estimate is nonlinear and complex, and any estimation framework is unlikely to have clean analytical solutions or provable bounds on errors and biases. Therefore, in designing our estimation procedure, we set out to be conservative. Specifically: we use a likelihood function that accommodates overdispersion and autocorrelation (negative binomial); we utilize a hierarchical Bayesian framework to couple parameter estimates across different countries which reduces the risk of over-fitting the data; and we use the conceptual definitions of parameters and their expected similarity across countries to inform the magnitude of that coupling across countries. Compared to more common choices in similar estimation settings, these choices tend to widen the credible regions for our estimates and reduce the quality of the fit between model and data. In return, we think the results may be more reliable for projection, more informative about the underlying processes, and better reflective of uncertainties in such complex estimation settings. The model is a deterministic system of ordinary differential equations with a set of known and unknown parameters. The known parameters are those specified based on the existing literature and do not play an active role in estimation. The unknown parameters can be categorized into those that vary across different countries and those that are the same across all countries (i.e. general parameters). The estimation method is designed to identify both the most likely value and the credible regions for the unknown parameters, given the data on reported cases and deaths (and for a subset of countries, the excess deaths). This is done through a combination of estimating the most likely parameter values in a likelihood based framework, and using Markov Chain Monte Carlo simulations to quantify the uncertainties in parameters and projections. We first introduce the 3 different components of the likelihood function we use: the fit to time series data, the random effects component coupling country-level parameters, and the penalty for excess mortality. Then we explain the implementation details. Define model calculations for expected reported cases and deaths for country i as μij(t) (with index j specifying cases and deaths) and the observed data for those variables as yij(t); the country-level vector of unknown parameters as and the general unknown parameters as ϕ. Note that vector includes several parameters, each specifying an unknown model parameter, such as Impact of Treatment on Fatality, or Total Asymptomatic Fraction, for country i. The model can be summarized as a function f that produces predictions for expected cases and deaths for each country given the general and country-specific parameters: We use a negative binomial distribution to specify the likelihood of observing the y values given θ and ϕ. Specifically, the logarithm of likelihood for observing the data series y given model predictions μ(θ ,ϕ) is: where (dropping time index for clarity): . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. Summing the LT function over time provides the full (log) likelihood for the observed data given a parameterization of the model. The negative binomial likelihood function includes two parameters, μ and ε which determine the mean and the scaling/shape of the observed outcomes. The second parameter, ε, provides the flexibility needed fit outcomes with fat tails and auto-correlation. This parameter could itself be subject to search in the optimization process. Specifically, we assume that: Thus we create a (set of) country specific parameter(s) ( ) and two general parameters ( ) which should be estimated along with the conceptual model parameters. The country level scale ( ) implicitly assesses the reliability and inherent variability in country level reports, and the general ones inform the variability in case data vs. deaths. We augment the vectors ϕ and θ to include these scaling parameters as well. Up to this point we have not included any relationship among country specific parameters, . This independence assumption would allow parameters representing the same underlying concept to vary widely across different countries. Such treatment, by providing more flexibility, enhances the model's fit to historical data. However, it ignores the conceptual link that exists for a given parameter across countries, potentially allowing the model to fit the data for the wrong reasons (i.e. using parameter values that do not correspond to meaningful real world concepts). The result would likely be less reliable and also not robust for future projections. We therefore define a Hierarchical Bayesian framework to account for the potential dependencies among model parameters. Specifically, we assume the same conceptual parameters (e.g. Impact of Treatment on Fatality), across different countries, are coming from an underlying normal distribution with an unknown mean (to be estimated) and a pre-specified standard deviation. This assumption is similar to the use of "Random Effect" models common in regression frameworks, though we deviate from canonical random effect models by pre-specifying the standard deviation. In fact it is possible to estimate the standard deviation across countries as well (and to obtain better fits to data), but adding those degrees of freedom ignores qualitatively relevant insights about the level of coupling across different countries for each parameter, and thus results may fit the data better but for the wrong reasons. For example, some parameters, such as Patient Zero Arrival Time, could be very different across countries, whereas parameters reflecting innate properties of the SARS-CoV-2 virus itself (e.g. Total Asymptomatic Fraction (a)) or those determining fatality (e.g. Base Fatality Rate for Unit Acuity (fb)) should be very similar across different countries. Allowing the model to determine the variance for the latter will lead to better fits: the model can find baseline fatality rates that easily match fatality variations across countries, and would expand the corresponding variance parameter accordingly. However, as a result the estimation algorithm will have too easy a job: it will not require a precise balancing between hospitalization, impact of acuity on fatality, and post-mortem testing decisions to fit fatality data. Thus, the estimates may well be less informative, or further from true underlying processes and the general characteristics of the disease which we care about. The implementation of this random effect introduces another element to the overall likelihood function: Here θik represents the k th parameter for country i, and ̅ is the (estimated) average across countries for the k th parameter. σk is the pre-specified allowable variability for the k th parameter across different countries. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint Specifying these standard deviations adds a subjective element to the estimation process. However, we note that subjective elements are ultimately indispensable in any modeling activity: from specifying the model boundary to the level of aggregation, use of various functional forms, and choice of likelihood functions, these choices are built on subjective assessments that experts bring to a modeling project. Absent our conceptually informed variability factors, we would need to make the assumption that country-level parameters are independent, or that our complex estimation process would correctly identify the true dependencies among those parameters. We think both those alternatives are inferior in the chosen method. So here we focus on transparently documenting and explaining those assumptions. Table S2 summarizes the estimated model parameters, their estimated values (mean across countries and mean of Inter-Quartile Range) and the assumed variability factor (σk) for each. sW Sensitivity to Weather 0.765 (global parameter so the other metrics do not apply) *Given the wide range and potential long tail for these parameters the Log10 transformation is used in specifying the dispersion penalty (equation 27) and variability factors are reported as 10 σ , where σ is used in equation 27. ** These parameters are expected to be less variable across countries and thus are assigned small variability allowances compared to their mean. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint Finally, we include a likelihood-based penalty term to allow model predictions be informed by excess mortality data collected by various news agencies and researchers for a subset of countries in our sample. These data provide snapshots of excess mortality (compared to a historical baseline) for a window of time in each country. Subtracting from total excess mortality the COVID-19 deaths officially recorded in that window offers a data point for excess mortality not accounted for in official data (ei). We can calculate in the model the counter-part for this construct: the simulated mortality that is not included in the simulated reported COVID-19 deaths ( ̅ ). There is uncertainty in these excess mortality data: the historical baselines used by various sources do not adjust for demographic change, excess mortality may be due to factors other than COVID-19, and some of it may be due to changes in healthcare availability and utilization motivated by COVID-19 but not directly attributable to the disease (for example when surgeries are delayed, hospitalization is avoided, or heart conditions are ignored). Excess mortality may also be reduced due to reduced traffic accidents (in light of physical distancing policies) and pollution related deaths. Given these uncertainties, we use the following penalty function to keep the simulated unaccounted excess mortality close to data: This penalty could be seen as a likelihood coming from the probability distribution ( ) = should be attributed to COVID-19 deaths, but that there is significant uncertainty around this, so some 20% variation across this figure is quite plausible (70%-110% of data). However, numbers outside of this range start to impose increasingly large penalties, so that very large deviation becomes unlikely. Combining these three components, we obtain the full likelihood function used in the analysis: For each country we include the LT component from the first day they have reached 0.1% of their cumulative cases to-date, or a minimum of 50 cumulative cases. This excludes very early rates that are both unreliable and which, given very small estimated model predictions for infection, could lead to unreasonably large likelihood contributions. The model includes a large number of parameters to be estimated: a general parameter for the impact of weather, 2 general parameters for , and 21 parameters for each country that are coupled together based on the random effects framework described above. Out of those 1 parameter is for and the rest are informing various features of disease transmission, testing, hospitalization, and risk perception and response. With a sample of 84 countries, this would lead to about 1800 parameters to be estimated. A direct optimization approach to this problem suffers from potential risk of getting stuck in local optima, and direct use of MCMC methods to find the promising region of parameter space suffers from the curse of dimensionality. We therefore designed the following 4-step procedure to find more reliable solutions to both problems. 1) We estimate the model with the full parameter vector for a smaller number of countries with larger outbreaks (3-5 countries). We use the Powell direction search method implemented in Vensim™ simulation software for this step. The method is a local search approach though it has features that allows it to escape local optima in some cases. We restart the optimization from various random points in the feasible parameter space and track the convergence of those restarts to unique local peaks. We stop this process when we are repeatedly landing on the same local peaks in the parameter . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint space. This procedure showed that local peaks do exist, but they are not many; for example within 100 restarts we may find 2-4 distinct peaks. This provides a coherent set of starting points for ϕ and ̅ for next steps. 2) We go through iterations of the following two steps: A) Conduct country-specific optimizations with 50 restarts to find the vector of θi given the ϕ and ̅ from first optimization or from the step B. B) Conduct a global optimization, including all countries but fixing θi and optimizing on ϕ (and ̅ ; though that is simply the mean across country level parameters from previous round). We stop when iterations offer little improvement from one round to the next (less than 0.05% improvement in loglikelihood). 3) We conduct a full optimization allowing all parameters (θi, ϕ and ̅ ) to change, starting from the point found in the last iteration of step 2. This step finds the exact peak on the likelihood landscape which is the best-fitting parameter set for the model. 4) For the MCMC, theoretically one should conduct the sampling from all model parameters in the full model. However, our experiments showed that the large dimensionality of the parameter space requires an infeasible number of samples to achieve adequate mixing and ensure reliable credible regions for parameters and projections. To overcome this challenge we note that the parameters of different countries are connected to each other only through ϕ and ̅ , and these general parameters are rather insensitive to dynamics in each country. The insensitivity is due to the fact that a single country only contributes about 1% to the general parameters' values, and within a typical MCMC the country-level parameters often can't change more than 10% before the resulting samples become highly unlikely. Therefore, one can conduct an approximate country-level MCMC by fixing the general parameters at those from step 3, and only sampling from the θi for each country. The MCMC algorithm used is one designed for exploring high dimensional parameter spaces using differential evolution and self-adaptive randomized subspace sampling (19) . Using this method we obtain good mixing and stable outcomes (Robin-Brooks-Gelman PSFR convergence statistic remaining under 1.1) after about 600,000 samples (the burn-in period). We continue the MCMC for each country for 1 million samples and then randomly take a subsample of those points after the burn-in period for the next step. 5) The resulting subsamples for different countries from step 4 are assembled together to create a final sample of parameters for the full model to conduct projections and sensitivity analysis at the global scale. Uncertainties in the handful of global parameters is not identified in this procedure, but can be quantified by assessing the sensitivity of the global likelihood surface to changes in those parameters. The process above is automated using a Python script that controls the simulation software (Vensim). We conduct the analysis using a parallel computing feature of Vensim on a Windows server with 48 cores. After compiling the simulation model into C++ code (which speeds up calculations significantly), and using a simulation time step of 0.25 days, it takes about 20 hours to complete the estimation for 84 countries. Full analysis code is available online at https://github.com/tseyanglim/CovidGlobal. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint S3-DATA PRE-PROCESSING Getting contemporaneous, comprehensive, national-level data on Covid-19 is a challenge. The most widely-cited data aggregators, such as the Johns Hopkins Center for Systems Science and Engineering's COVID-19 database (31) , OurWorldInData portal (3), and the US-focused COVID Tracking Project (21), get their data from the same few official sources, such as the US Centers for Disease Control and Prevention (CDC), the European CDC (ECDC), and the World Health Organization (WHO). These official agencies in turn get their data from national and subnational public health authorities, which ultimately rely on reports from hospitals, clinics, and private and public health labs. As a result, idiosyncrasies in the ground-level data collection processes permeate virtually all sources of aggregate data. Most notably, data collection involves time lags, which can differ from source to source. Daily death counts could reflect the date of actual death or the date a death is registered or reported; different UK government sources, for instance, use each of these metrics. 2 Daily infection or case counts could include the total new cases reported on a given date, or the total cases confirmed from that date; the latter would result in some 'backfill' whereby case counts for previous days can continue to increase for some time as delayed confirmations come in. Daily counts of tests conducted could report samples collected, samples processed, results reported, or a mix of these; the US CDC, for instance, reports a mix of testing by date of sample collection and date of sample delivery to the CDC. 3 Aside from differences in unit of measure (people vs. tests vs. samples), there may be different time lags involved as well. In addition to these idiosyncrasies, testing data in particular is also patchy for many countries, even as testing has become more widespread. The WHO does not report country-by-country testing, nor does the JHU Covid map outside the US. Furthermore, there are sometimes irregular delays in the reporting of test results, which can create occasional unexpected spikes in reported numbers of tests, infections, or both. 4 Depending on the specifics of how daily infection and test counts are reported, there can in some cases be a disjunction between the two. Because confirmed case counts largely depend on positive test results, test and infection counts should be correlated -ceteris paribus, a day with a lot of samples collected for testing should see more confirmed cases attributed to it, while a day with no sample collection should see no cases. But since cases may not be reported by the date of the test, and tests may not be reported by the date of sample collection, officially reported numbers can get out of sync in either direction. This problem is most salient when there are clear weekly cycles in daily rates. In most of the world, particularly western countries, daily test rates are far lower on weekends than during the week. As a result, infection numbers show a clear weekly cyclical component as well. But the weekly cycles in testing and infection numbers for a given country do not always line up. Our model explicitly accounts for the effect of testing on reported infections, but we do not explicitly model the country-level idiosyncrasies of reporting and how they vary between test data and infections. Instead we account for any such lags in pre-processing of the data to align testing and case data. The weekly cycle occurs in many countries' death rate data as well, where it presents a different problem. A weekly cycle in testing is a behaviourally realistic part of the data-generation process, as many labs, clinics, or other testing sites for instance may be closed on weekends. As testing provides the window on the state of confirmed infections, a comparable cycle in confirmed cases is to be expected as well. By linking case confirmations to testing, our model explicitly accounts for this limited visibility on the true state of the epidemic. However, a weekly cycle in death rates almost certainly reflects different limitations of the data-2 https://blog.ons.gov.uk/2020/03/31/counting-deaths-involving-the-coronavirus-covid-19/ 3 https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/previous-testing-in-us.html 4 See e.g. https://www.wcvb.com/article/massachusetts-coronavirus-reporting-delay-due-to-quest-lab-itglitch/32288903# . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint generation process, typically to do with hospital staffing, 5 which we do not explicitly model. As such we need to address any weekly cycle in death rates through data pre-processing as well. To deal with these challenges, we developed a multi-step algorithm to pre-process our data before feeding it into the model for calibration. The algorithm is described below. It was implemented in Python, largely using the Pandas and NumPy packages, and the code is available in full at: https://github.com/tseyanglim/CovidGlobal. The algorithm proceeds country-by-country, following these steps on each country. 1) Examine daily cumulative test data; if data are insufficient (6 or fewer data points), drop country from the dataset. 2) Interpolate any missing daily cumulative test data points using a piecewise cubic Hermite interpolating polynomial (PCHIP) spline. If the first reported infection is before the first reported cumulative test, also extrapolate cumulative tests back to the date of first reported infection. a. Extrapolation to the date of first reported infection is necessary since both in the model and, to a large extent, in reality, reported infections require testing for confirmation. b. PCHIP spline interpolation yields a continuous monotonic function with a continuous first derivative, thus avoiding generating any anomalous rapid change in daily test rate. c. We used the implementation of PCHIP interpolation from the widely used SciPy package for Python. 6 3) Calculate daily test rate as daily cumulative tests less the preceding day's cumulative test total: 4) Examine the original daily cumulative test data to estimate how much of the calculated daily test rate is based on interpolated vs. original data. a. Daily test rates calculated based on mostly original data should be expected to include any weekly cycles or occasional irregularities that would also be reflected in daily infection counts. Conversely, daily test rates calculated from cumulative test counts that are largely interpolated would not be expected to fully reproduce any such cycles or irregularities, since the interpolation produces a relatively smooth function. b. As a rule of thumb, we examine the cumulative test data for the second half of the time from the first test to the latest test. If fewer than half the days in that window have original cumulative test data, we consider the test data to be 'sparse', requiring further processing. 5 It may be argued that there are weekly cycles in large-scale human behaviour that may drive some true weekly cyclicality in the true rates of infection and death, and as such it may be wrong to consider such cycles to be artefacts of the datageneration process. However, we find this unlikely for a few reasons. First, weekly cycles in human interactions, largely driven by the work and school week and weekend, will have been significantly attenuated by widespread adoption of social distancing measures around the world. Second and more importantly, variation in incubation period and time before development of symptoms means that any true cyclicality in the timing of initial infection will be further attenuated in the timing of symptom development. By the same logic, wide variability in the delay from symptom development to death means there should be minimal cyclicality, if any, in the timing of deaths, meaning any such cycles visible in the data are due to measurement and reporting lags. 6 https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.PchipInterpolator.html . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint 5) If the test data are not sparse, account for any potential lag or other reporting delay differences between daily test rate and daily infection rate using a time-shift algorithm to estimate any such lags or delays from the data and shift the test rate time series accordingly. The time-shift algorithm ensures that any weekly cycles present in the daily infection rate data are reflected in the daily test rate data and aligned as best as possible on date, thereby accounting for the fact that model-generated reported infections depends on testing but with no time lag between test and result. a. First, identify the weekly component of the time series of daily infection rate and daily test rate using a seasonal-trend decomposition based on LOESS (STL) procedure. 7 i. STL deconstructs time series data into several components, including a trend and a seasonal component over a specified period (weekly, in this case) as well as a residual. STL is an additive decomposition, and has the advantage of allowing the seasonal component to change over time (rather than being a fixed pattern repeated exactly across the whole time series). ii. We used the STL implementation from the Statsmodels package for Python. 8 b. Shift the time series over a one-week range (from -2 to +4 days of lag between test and infection reporting), calculating the cross-correlation between the weekly seasonal component of the daily infection rate data and the daily test rate data for each time shift. c. Identify the time shift within this range that maximizes the cross-correlation between the infection rate and test rate data, and shift the test rate data accordingly. 6) If the test data are sparse, the spline interpolation will generally cut out some of any weekly cyclicality that may be present. Visual inspection of daily test rates for countries with sparse test data also shows large, irregular spikes in reported tests are not uncommon, without necessarily having concomitant irregular spikes in reported daily infection rates. As such, rather than attempting to eliminate differences in reporting lags through the time-shift algorithm described above, we instead apply a data-smoothing algorithm to both daily test rate and daily infection rate, in order to reduce any cyclicality and irregular spikes. This smoothing allows the calibration of the main model to focus on matching the underlying trends in the data. In all cases, whether daily cumulative test data are sparse or not and whether infection and test rate data are smoothed or not, since weekly cycles in death data are reflective of reporting lags not captured in the model, daily death rate data is smoothed using the same algorithm. 8) The smoothing algorithm used is designed first to conserve the total number of reported cases (tests, infections, or deaths), and second to preserve some degree of variation in the time series, as some noise may be informative and retaining some is important to the calibration of the model. a. Starting from when the time series of daily rate (test or infection) exceeds a specified minimum value (5/day), calculate the rolling mean of the daily rate, using a centred moving window of 11 days. b. Calculate the residual between each day's data point and the rolling mean for that day, and divide by the square root of the rolling mean, to get an adjusted deviation value: i. Dividing by the square root of the rolling mean reflects a heuristic assumption that each daily rate (of infections, deaths, or tests) behaves as a Poisson process (StDev of Pois() =  0.5 ). 7 is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06. 24.20139451 doi: medRxiv preprint ii. The functional result of this adjustment is that both absolute and relative magnitudes of deviations from the rolling mean are given some weight -large relative deviations when absolute values are small (and data are noisier) are not ignored, but neither do they outweigh larger absolute (but smaller relative) deviations that occur when the mean is large, which is important since most of the time series data are growing significantly over the time horizon of the model. c. Calculate thresholds for identifying dips and peaks in the data based on the median of the adjusted deviations, ± one median absolute deviation (MAD) of the adjusted deviations: i. Using the median absolute deviation to determine thresholds for peaks and dips is robust to outliers in the deviations, which do arise occasionally in the data. ii. A threshold width of one MAD is relatively narrow for outlier detection, but by inspection of the data, is about right for identifying most of the peaks and dips caused by weekly cycles in test, infection, and death rates, as well as larger outliers. d. Once thresholds are calculated, iterate through the data points in the time series first forward in time from oldest to newest, filling in any 'dips' (data points with adjusted deviations below the lower threshold), then backward in time from newest to oldest, smoothing out any 'peaks' (data points with adjusted deviations above the upper threshold) that remain. Repeat the process until all data points' adjusted deviations are within the originally calculated thresholds for the time series. i. We infer that the underlying processes generating dips and peaks are somewhat different. Dips are generally the result of weekly cycles in the data, e.g. lower rates of testing or longer lags in death reporting that occur on weekends. Peaks arise to some extent due to the same weekly processes, e.g. some deaths that occur on weekends only being recorded at the start of the next week. However, some peaks, especially larger ones, may result from irregular random delays in reporting, such as large batches of tests being held up due to logistical issues and then getting processed all at once. As such the smoothing procedure for dips vs. peaks is slightly different. e. The dip-filling step fills a fraction of each dip (specified as a smoothing factor) by redistributing data counts based on a multinomial draw from the subsequent few days following each dip. i. First, calculate the amount to fill based on the deviation and the smoothing factor specified, in this case 0.67: ii. Calculate the amount redistributed from each of the following few (7) days +1 , +2 , … + , = 7, based on a multinomial distribution as follows: Where +1 , +2 , … +7 are calculated as: iii. This formulation allows some redistribution from any of the subsequent few days whose adjusted deviations exceed the focal day's adjusted deviation, but with more redistribution from days with higher adjusted deviations. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint f. The peak-smoothing step similarly redistributes a fraction of each peak, specified by the smoothing factor, to the preceding several days based on another multinomial draw. i. First, calculate the amount to redistribute similarly to the dip-filling step: ii. Calculate the amount redistributed to each of the preceding several (14) iii. This formulation redistributes peaks to preceding days based on the calculated rolling mean counts of those days, on the assumption that the irregular delays that generate random spikes in counts are essentially random and equally likely to affect any given unit of data over a several-day span. As such, the probability that a unit showing up in a spike due to such delays comes from a given preceding day is simply proportional to the expected count for that day, as approximated by the rolling mean. g. By filling dips first before smoothing peaks, the combined algorithm largely addresses any peaks that are due primarily to weekly cycles during the dip-filling stage, such that remaining peaks that get smoothed tend to be the larger, irregular ones. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint Table S3 reports two quality of fit metrics for different countries and different time series. The first four columns report Mean Absolute Error Normalized by Mean (MAEN) and the last four report the R-Squared measures. Errors for cumulative infection and deaths are followed by those for the new cases and deaths (flow variables). . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint Figure S7 shows the visualization of fit between data and simulations for all the countries in our sample. These graphs include data and model outputs for reported new cases (blue; left axis in thousands per day) and deaths (red; right axis in thousands per day) starting from the beginning of the epidemic in each country until June 20. Figure S7 -Comparison of data and simulation. New cases in blue (left axis, in thousands per day) and new deaths (red, right axis). . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint Estimates for true cumulative cases (blue; left axis in millions) and deaths (red; right axis in thousands) across different countries up to June 20 are reported in Figure S8 . Figure S9 shows the ratio of estimated excess deaths, i.e. COVID-19 fatalities not reported as such, to reported excess deaths, i.e. deaths over historical baseline not accounted for by reported COVID-19 deaths, for the countries for which such data are available. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint Figure S9 -Ratio of estimated excess deaths to reported excess deaths. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint Figure S10 shows the initial reproduction number (RE) occurring in each country. Reproduction numbers are changing dynamically and transient dynamics may lead to larger than equilibrium numbers if maximum RE values were used. We therefore use the 90 th percentile of simulated reproduction number in this graph. Also note the large credible intervals for these estimates, partly coming from changes in what is included in the 90 th percentile (sometimes it includes the highest values and sometimes it does not), as well as the inherent uncertainty when both reproduction number and behavioral and policy responses are estimated: one can have smaller initial RE and smaller response functions, or larger values for both, and stay consistent with the data. Figure S10 -Maximum reproduction number RE for each country's outbreak . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint Parameter estimates Figure S11 reports most likely estimates for the vector of country-specific parameters (θi). The figure also includes 95% credible intervals for these parameter estimates. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint Figure S11 -Parameter estimates and 95% credible regions for country-specific parameters. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint Figure S12 reports country-level projections for new cases and deaths based on the following assumptions (consistent with those reported in Figure 2 -D in the main paper). We shift both testing and contact sensitivity to perceived risks to arguably more realistic values between current ones that capture some improvements but not a completely different behavior. Specifically, on July 1 st 2020 we shift test rate to a value 20% between the current rates and 0.1% per day (which increases testing in most countries and reduces it in a few). Similarly, sensitivity of contacts to perceived risk is shifted to 20% of the way between estimated country-level values and a high value of 8 that is. Figure S12 -Country level projections until Spring 2021. Daily cases (in thousands, blue, left axis) and daily deaths (red, right axis) are graphed. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted June 26, 2020. . https://doi.org/10.1101/2020.06.24.20139451 doi: medRxiv preprint Spread of SARS-CoV-2 in the Icelandic population Clinical characteristics and imaging manifestations of the 2019 novel coronavirus disease (COVID-19): A multi-center study in Wenzhou city Detection of SARS-CoV-2 in different types of clinical specimens Sensitivity of chest CT for COVID-19: comparison to RT-PCR Stability issues of RT-PCR testing of SARS-CoV-2 for hospitalized patients clinically diagnosed with COVID-19 Weather Conditions and COVID-19 Transmission: Estimates and Projections. medRxiv Clinical Characteristics of Coronavirus Disease 2019 in China Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in China: summary of a report of 72 314 cases from the Chinese Center for Disease Control and Prevention COVID-19 exacerbating inequalities in the US Suppression of COVID-19 outbreak in the municipality of Vo Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship Universal Screening for SARS-CoV-2 in Women Admitted for Delivery Incubation Period and Other Epidemiological Characteristics of 2019 Novel Coronavirus Infections with Right Truncation: A Statistical Analysis of Publicly Available Case Data Modeling the rework cycle: capturing multiple defects per task Data analysis using regression and multilevel/hierarchical models Accelerating Markov chain Monte Carlo simulation by differential evolution with self-adaptive randomized subspace sampling Most equations are self-explanatory. The "[…]" notation is used to subscript variables over a set of members. For example, the subscript "Rgn" is used to identify different countries. Therefore [Rgn] indicates that a variable is defined separately for each member of the set "Rgn". Other subscript ranges used in the equations are: expnt: Used for numerically solving the probability of missing symptoms equation. pdim: Used for setting policy levels for a few variables. Priors: Used for implementing the random effects estimation components. Each estimated parameter is mapped into an element of this subscript to simplify vector-based calculations TstSts: The test status including those confirmed ('Tested') and those unconfirmed Complete equations and units Potential Test Demand from Susceptible Population[Rgn] , Positive Candidates Interested in Testing Poisson Subset Adj Active Test Rate[Rgn] = if then else ( Time < New Testing Time , DataTestRate[Rgn] , External Test Rate INITIAL( InputAve[PriorEndoAve] * ( 1 -SW EndoAve ) + SW EndoAve * CalcAve Activities Allowed by Government Additional Asymptomatic Post Detection[Rgn] = Weighted Infected Post Detection Gate[Rgn] * Additional Asymptomatic Relative to Symptomatic[Rgn] / ( 1 + Additional Asymptomatic Relative to Symptomatic Additional Asymptomatic Relative to Symptomatic[Rgn] = ZIDZ ( Total Asymptomatic Fraction Net[Rgn] -exp ( -Covid Acuity[Rgn] ) , 1 -Total Asymptomatic Fraction Net All Recovery[Rgn] = Recovery of Confirmed[Rgn] + Recovery of Untested[Rgn] + sum ( Hospital Discharges Expected Positive Poisson Covid Patients[Rgn] + Sqrt ( Expected Positive Poisson Covid Patients[Rgn] ^ 2 + 4 * Effective Hospital Capacity[Rgn] * Effective Hospital Capacity[Rgn] ) ) / ( 2 * Effective Hospital Capacity Allocated Fration NonCOVID Hospitalized[Rgn] = SMOOTHI ( Allocated Fraction COVID Hospitalized = min ( maxAlp , ialp * alpR = min ( 1, talp * alpR[Rgn] ) Units: dmnl 17) alpR Area of Region[Rgn] = GET VDF CONSTANTS(Constant Data File Average Acuity of Positively Tested[Rgn] * XIDZ ( ( 1 -Probability of Missing Acuity Signal at Hospitals[Rgn,Tested] * Fraction Poisson not Hospitalized[Rgn,Tested] ^ 2) , 1 -Fraction Poisson not Hospitalized Average Acuity of Untested Poisson Subset[Rgn] * ( 1 -Probability of Missing Acuity Signal at Hospitals[Rgn,Notest] * Fraction Poisson not Hospitalized[Rgn,Notest] ^ 2) , 1 -Fraction Poisson not Hospitalized Average Acuity Not Hospitalized[Rgn,Notest] = ZIDZ ( Average Acuity Not Hospitalized Poisson[Rgn,Notest] * Infectious not Tested or in Hospitals Poisson Average Acuity Not Hospitalized[Rgn,Tested] = Average Acuity Not Hospitalized Poisson Average Acuity Not Hospitalized Poisson[Rgn,Tested] = Max ( 0, Probability of Missing Acuity Signal at Hospitals[Rgn,Tested] * Average Acuity of Positively Tested[Rgn] * Fraction Poisson not Hospitalized Probability of Missing Acuity Signal at Hospitals[Rgn,Notest] * Average Acuity of Untested Poisson Subset[Rgn] * Fraction Poisson not Hospitalized -Prob Missing Symptom[Rgn] * Fraction Interested not Tested[Rgn] ^ 2) , 1 -Fraction Interested not Tested[Rgn] , 2 * Prob Missing Symptom Average Acuity of Untested Poisson Subset[Rgn] = ZIDZ ( Poisson Subset Reaching Test Gate[Rgn] * Covid Acuity[Rgn] -Positive Tests of Infected[Rgn] * Average Acuity of Positively Tested[Rgn] , Poisson Subset Not Tested Passing Gate Positive Candidates Interested in Testing Poisson Subset Adj[Rgn] -Potential Test Demand from Susceptible Population[Rgn] , Positive Candidates Interested in Testing Poisson Subset Adj Base Fatality Rate for Unit Acuity Base Fatality Rate for Unit Acuity Net[Rgn] = INITIAL( Base Fatality Rate for Unit Acuity BaseError = 5 Units: Person Baseline Daily Fraction Susceptible Seeking Tests Demographic Impact on Fatality Relative to China[Rgn] * Base Fatality Rate for Unit Acuity Net Baseline Risk of Transmission by Asymptomatic[Rgn] = INITIAL( Baseline Transmission Multiplier for Untested Symptomatic * Multiplier Transmission Risk for Asymptomatic Net Baseline Transmission Multiplier for Untested Symptomatic = 1 Units: dmnl Bed per Square Kilometer[Rgn] = INITIAL( Nominal Hospital Capacity[Rgn] / Area of Region Beds per Thousand Population[Rgn] = GET VDF CONSTANTS(Constant Data File ZIDZ ( lnymix Deaths of Symptomatic Untested[Rgn] -Post Mortem Test Rate[Rgn] * Frac Post Mortem from Untreated Chronic Death Rate[Rgn] = GET VDF CONSTANTS(Constant Data File Chronic Impact on Fatality[Rgn] = INITIAL Cml Death Frac In Hosp[Rgn] = XIDZ ( Cumulative Deaths at Hospital Cml Death fraction in hospitals large enough = sum ( if then else ( Cml Death Frac In Hosp Cml Known Death Frac Hosp[Rgn] = XIDZ ( Cumulative Deaths at Hospital CmltErrPW = 2 Units: dmnl CmltPenaltyScl = 0 Units: dmnl Constant Data File :IS: 'CovidModelInputs -ConstantData Contacts Relative to Normal[Rgn] = min ( Voluntary Reduction in Contacts[Rgn] , Activities Allowed by Government Reaching Testing Gate[Rgn] -Symptomatic Infected to Testing[Rgn] -Untested symptomatic Infected to Hospital Excess Death Start Count[Rgn] = :NA:, 0, if then else ( Time >= Excess Death Start Count Flu Acuity * Covid Acuity Relative to Flu Net Covid Acuity Relative to Flu Net[Rgn] = INITIAL( Covid Acuity Relative to Flu Total Covid Hospitalized[Rgn] , Infectious not Tested or in Hospitals Poisson[Rgn] + Infectious Confirmed Not Hospitalized[Rgn] + Total Covid Hospitalized Cumulative Cases[Rgn] = INTEG( New Cases[Rgn] , 0) Units: Person Cumulative Confirmed Cases[Rgn] = Infectious Confirmed Not Hospitalized[Rgn] + Hospitalized Infectious[Rgn,Tested] + Cumulative Deaths of Confirmed Confirmed Recovered[Rgn] + Cumulative Recovered at Hospitals Cumulative Death Fraction Cumulative Deaths at Hospital[Rgn,TstSts Cumulative Deaths of Confirmed[Rgn] = Cumulative Deaths at Hospital[Rgn,Tested] + Cumulative Deaths of Confirmed Untreated Cumulative Deaths of Confirmed Untreated[Rgn] = INTEG( Deaths of Confirmed Cumulative Deaths Untested Untreated Cumulative Fraction Total Cases Hospitalized[Rgn] = ZIDZ ( sum ( Cumulative Deaths at Hospital[Rgn,TstSts!] + Cumulative Recovered at Hospitals[Rgn,TstSts!] + Hospitalized Infectious[Rgn,TstSts!] ) , Cumulative Cases Cumulative Missed Death[Rgn] = INTEG( Count Missed Death[Rgn] , 0) Units: Person Cumulative Negative Tests Cumulative Recovered at Hospitals[Rgn,TstSts] = INTEG( Hospital Discharges Cumulative Recoveries[Rgn] = INTEG( All Recovery Cumulative Tests Conducted Cumulative Tests Data Current Test Rate per Capita dalp = 0.1 Units: dmnl Data Excess Deaths[Rgn] = GET VDF CONSTANTS(Constant Data File , 'DataConstants[Rgn]', 15) Units: Person Data Pseudo Case Fatality RAW: := DataCmltInfection RAW: := DataCmltDeath RAW: Units: Person/Day RAW: := DataFlowInfection RAW: := DataFlowDeath RAW: := DataTestRate RAW: Units: Person/Day Cumulative Confirmed Cases New Testing Time ) ) ) Units: Person/Day DataLimitFromTime = if then else ( Time > Max Time Data Used , 0, 1) Units: dmnl = 365 Units: Day/Year Death Rate[Rgn] = Deaths of Confirmed[Rgn] + Deaths of Symptomatic Untested[Rgn] + sum ( Hospitalized Infectious Deaths Tested Untreated Resolution[Rgn] * Fatality Rate Untreated Post-Detection Phase Resolution Time" * Fatality Rate Untreated Delay Order = 1 Units: dmnl Demographic Impact on Fatality Relative to China[Rgn] = GET VDF CONSTANTS(Constant Data File Discount Rate Annual = 0.03 Units: 1/Year Discount Rate per Day = INITIAL( Discount Rate Annual / Days per Year ) Units: 1/Day Dread Factor in Risk Perception Dread Factor in Risk Perception Net[Rgn] = if then else ( Response Policy Time On < Time , Dread Factor Policy * Response Policy Weight[dfcP] + ( 1 -Response Policy Weight[dfcP] ) * Dread Factor in Risk Perception[Rgn] , Dread Factor in Risk Perception Dread Factor Policy = 3000 Units: dmnl Effective Hospital Capacity[Rgn] = Nominal Hospital Capacity[Rgn] * Normalized Hospital density[Rgn] ^ Impact of Population Density on Hospital Availability Excess Death End Count[Rgn] = GET VDF CONSTANTS(Constant Data File Excess Death Mean Frac = 0.9 Units: dmnl Excess Death Range Frac = 0.2 Units: dmnl ZIDZ ( Cumulative Missed Death[Rgn] -Excess Death Mean Frac * Data Excess Deaths[Rgn] , Excess Death Range Frac * Data Excess Deaths Excess Death Start Count[Rgn] = GET VDF CONSTANTS(Constant Data File Expected Positive Poisson Covid Patients[Rgn] = sum ( Potential Hospital Demand[Rgn,TstSts!] ) * "Post-Detection Phase Resolution Time" Units: Person 122) expnt External Test Rate[Rgn] = Population[Rgn] * Policy Test Rate Utility from Normal Activities[Rgn] / ( Utility from Limited Activities[Rgn] + Utility from Normal Activities Units: dmnl Baseline Fatality Multiplier[Rgn] * Impact of Treatment on Fatality Rate[Rgn] * Average Acuity Hospitalized Baseline Fatality Multiplier[Rgn] * Average Acuity Not Hospitalized Poisson[Rgn,TstSts] ^ Sensitivity of Fatality Rate to Acuity Net Final Test Rate Per Capita[Rgn] = INITIAL( Current Test Rate per Capita[Rgn] + Weight Max in Test Goal * ( Max Test Rate per Capita -Current Test Rate per Capita FINAL TIME = 250 Units: Day Flu Acuity Relative to Covid Deaths of Symptomatic Untested Fraction Covid Death In Hospitals Previously Tested[Rgn] = ZIDZ ( Hospitalized Infectious Deaths Fraction Covid Hospitalized Positively Tested[Rgn] = ZIDZ ( Hospitalized Infectious Fraction Interested not Tested[Rgn] = 1 -ZIDZ ( Total Test on Covid Patients[Rgn] , Positive Candidates Interested in Testing Poisson Subset Fraction Interseted not Correctly Tested[Rgn] = 1 -( 1 -Fraction Interested not Additional Asymptomatic Relative to Symptomatic[Rgn] / ( 1 + Additional Asymptomatic Relative to Symptomatic Fraction of Fatalities Screened Post Mortem[Rgn] = Indicated Fraction Post Mortem Testing[Rgn] * Switch for Government Response Fraction of Population Hospitalized for Covid[Rgn] = Total Covid Hospitalized Fraction Poisson not Hospitalized[Rgn,Tested] = exp ( -Average Acuity of Positively Tested[Rgn] * ( 1 -Probability of Missing Acuity Signal at Hospitals Fraction Poisson not Hospitalized[Rgn,Notest] = exp ( -Average Acuity of Untested Poisson Subset[Rgn] * ( 1 -Probability of Missing Acuity Signal at Hospitals Fraction Seeking Test Fraction Tests Positive[Rgn] = ZIDZ ( Positive Tests of Fraction Tests Positive Data[Rgn] = min ( 1, ZIDZ ( DataFlowInfection[Rgn] , Active Test Rate[Rgn] ) ) Units: dmnl 149) Fractional Value of Limited Activities = 0.5 Units: dmnl Global Cases = sum ( Cumulative Cases[Rgn!] ) Units: Person Global Deaths = sum ( Cumulative Deaths[Rgn!] ) Units: Person Global Deaths , Global Cases ) Units: dmnl Government Response Start Time Hazard of Symptomatic Infection[Rgn] = Infection Rate[Rgn] / Susceptible[Rgn] * ( 1 -Total Asymptomatic Fraction Net Herd Immunity Fraction = 0.8 Units: dmnl Hospital Admission Infectious[Rgn,TstSts] = Hospital Admits All Hospital Admit Ratio[Rgn,TstSts] = XIDZ ( Hospital Admits All Hospital Admits All[Rgn,Tested] = Hospital Demand from Tested[Rgn] * Allocated Fraction COVID Hospitalized Hospital Admits All[Rgn,Notest] = Hospital Demand from Not Tested[Rgn] * Allocated Fraction COVID Hospitalized Poisson Subset Not Tested Passing Gate[Rgn] * ( 1 -exp ( -Average Acuity of Untested Poisson Subset[Rgn] * ( 1 -PMAS Unconfirmed for Hospital Demand Positive Tests of Infected[Rgn] * ( 1 -exp ( -Average Acuity of Positively Tested[Rgn] * ( 1 -PMAS Confirmed for Hospital Demand = ( 1 -Fatality Rate Treated[Rgn,TstSts] ) * Hospital Outflow Covid Positive Hospital Outflow Covid Positive[Rgn,TstSts] = Hospitalized Infectious[Rgn,TstSts] / Hospitalized Resolution Time Units Cumulative Deaths at Hospital Hospitalized Infectious Deaths Hospitalized Infectious Deaths[Rgn,TstSts] = Fatality Rate Treated[Rgn,TstSts] * Hospital Outflow Covid Positive Hospitalized Resolution Time = 20 Units: Day ZIDZ ( sum ( Hospitalized Infectious Deaths[Rgn,TstSts!] ) , sum ( Hospital Outflow Covid Positive ZIDZ ( sum ( Cumulative Deaths at Hospital[Rgn,TstSts!] ) , sum ( Cumulative Deaths at Hospital[Rgn,TstSts!] + Cumulative Recovered at Hospitals Units: Day Flu Acuity * ( Prob Missing Symptom Indicated Fraction Post Mortem Testing[Rgn] = Fraction Covid Death In Hospitals Previously Tested[Rgn] ^ Sensitivity Post Mortem Testing to Capacity Indicated Risk of Life Loss[Rgn] = min ( 1, Switch for Government Response[Rgn] * Perceived Hazard of Infection[Rgn] * Dread Factor in Risk Perception Net[Rgn] * PseudoCFR / Discount Rate per Day ) Units: dmnl Symptomatic Infected to Testing[Rgn] -Untested symptomatic Infected to Hospital[Rgn] , 0) Units: Person 180) Transmission Multiplier Presymptomatic[Rgn] + Infected pre Detection[Rgn] * Transmission Multiplier Pre Detection[Rgn] + ( Additional Asymptomatic Post Detection[Rgn] + "Poisson Not-tested Asymptomatic Transmission Multiplier for Confirmed[Rgn] + sum ( Hospitalized Infectious[Rgn,TstSts!] * Transmission Multiplier for Hospitalized[Rgn,TstSts!] ) ) * Reference Force of Infection Infectious not Tested or in Hospitals Poisson Constant Data File , 'DataConstants[Rgn]', 1) Units: Person 187) INITIAL TIME = 30 Units Known death fraction in hospitals large enough = sum ( if then else ( Cml Known Death Frac Hosp[Rgn!] < MinHspDTreshAdv , 1, 0) * AdvCntrs Liver Disease Impact on Fatality[Rgn] = INITIAL( ( Liver Disease Rate Liver Disease Rate[Rgn] = GET VDF CONSTANTS(Constant Data File , 'DataConstants[Rgn]', 18) Units: dmnl Max Test Rate per Capita = 0.001 Units: 1/Day Max Time Data Used = 247 Units: Day 200) maxAlp = 1 Units: dmnl Units: Person MaxRTresh = 8 Units: dmnl Constant Data File , 'MeanChronic', 1) Units: dmnl Constant Data File , 'MeanLiver', 1) Units: dmnl Constant Data File , 'MeanObesity', 1) Units: dmnl Min Contact Fraction MinAdjT = 1 Units: Day Multiplier Recent Infections to Test Multiplier Transmission Risk for Asymptomatic Multiplier Transmission Risk for Asymptomatic Net[Rgn] = INITIAL( Multiplier Transmission Risk for Asymptomatic + alp Negative Test Results New Cases[Rgn] = Infection Rate New Testing Time = 250 Units: Day Nominal Hospital Capacity[Rgn] = INITIAL( Initial Population[Rgn] * Beds per Thousand Population[Rgn] / 1000 ) Units: Person Normalized Hospital density[Rgn] = INITIAL( Bed per Square Kilometer[Rgn] / Reference Hospital Density ) Units: dmnl Not too few susceptibles = sum ( if then else ( SuscFrac[Rgn!] < MinSuscTresh , 1, 0) ) Units: dmnl Sens Obesity Impact Net Infection Rate[Rgn] , Incubation Period ,0, Delay Order ) Units: Person/Day 229) Onset to Detection Delay = 5 Units Overall Death Fraction[Rgn] = ZIDZ ( Death Rate[Rgn] , All Recovery[Rgn] ) Units: dmnl Patient Zero Arrival[Rgn] = if then else ( Time < Patient Zero Arrival Time Patient Zero Arrival Time Units: Person 234) payoff = 0 Units: dmnl 235) pdim : tstP,dfcP,dgtP,scuP Perceived Hazard of Infection[Rgn] = ( Weight on Reported Probability of Infection[Rgn] * Reported Hazard of Infection[Rgn] + ( 1 -Weight on Reported Probability of Infection[Rgn] ) * Hazard of Symptomatic Infection Indicated Risk of Life Loss[Rgn] -Perceived Risk of Life Loss[Rgn] ) / if then else ( Indicated Risk of Life Loss[Rgn] > Perceived Risk of Life Loss PMAS Confirmed for Hospital Demand[Rgn] = ( 1 -Reference COVID Hospitalization Fraction Confirmed[Rgn] ) ^ ( 1 / Average Acuity of Positively Tested PMAS Confirmed for Hospital Demand[Rgn] + ( 1 -PMAS Confirmed for Hospital Demand[Rgn] ) * Untested PMAS Gap with Tested Infectious not Tested or in Hospitals Poisson[Rgn] * exp ( -Average Acuity Not Hospitalized Poisson Poisson Subset Not Tested Passing Gate[Rgn] = Poisson Subset Reaching Test Gate[Rgn] -Positive Tests of Infected Poisson Subset Reaching Test Gate[Rgn] = Reaching Testing Gate[Rgn] / ( 1 + Additional Asymptomatic Relative to Symptomatic Policy Test Rate[Rgn] = if then else ( Time < New Testing Time , Current Test Rate per Capita[Rgn] , Final Test Rate Per Capita Infected Unconfirmed Post-Detection"[Rgn] + Susceptible[Rgn] + Recovered Unconfirmed[Rgn] + Confirmed Recovered[Rgn] + Infectious Confirmed Not Hospitalized[Rgn] + "Pre-Symptomatic Infected Positive Candidates Interested in Testing Poisson Subset[Rgn] = Poisson Subset Reaching Test Gate[Rgn] * Fraction Seeking Test Positive Candidates Interested in Testing Poisson Subset Adj[Rgn] = Max ( 0.001 * Potential Test Demand from Susceptible Population[Rgn] , Positive Candidates Interested in Testing Poisson Subset Positive Testing of Infected Untreated[Rgn] = Positive Tests of Infected[Rgn] * Fraction Poisson not Hospitalized Positive Tests of Infected[Rgn] = Positive Candidates Interested in Testing Poisson Subset[Rgn] * ( 1 -Fraction Interseted not Correctly Tested Post Mortem Test Rate[Rgn] = Post Mortem Tests Total[Rgn] * Sensitivity of COVID Test Units Post Mortem Test Untreated[Rgn] = Post Mortem Test Rate[Rgn] * Frac Post Mortem from Untreated Deaths of Symptomatic Untested[Rgn] + Hospitalized Infectious Deaths[Rgn,Notest] ) * Fraction of Fatalities Screened Post Mortem Post Mortem Tests Total[Rgn] = min ( Post Mortem Testing Need[Rgn] , Active Test Rate[Rgn] ) Units: Person/Day 255 Hospitalized Infectious[Rgn,Notest] / MinAdjT , Post Mortem Test Rate[Rgn] * ( 1 -Frac Post Mortem from Potential Hospital Demand[Rgn,Notest] = Hospital Demand from Not Tested Potential Hospital Demand[Rgn,Tested] = Hospital Demand from Recovered Unconfirmed[Rgn] + Cumulative Recovered at Hospitals[Rgn,Notest] ) * ( Baseline Daily Fraction Susceptible Seeking Tests[Rgn] * Fraction Seeking Test[Rgn] + Multiplier Recent Infections to Test Infection Rate[Rgn] + Patient Zero Arrival[Rgn] -Onset of Symptoms[Rgn] , 0) Units: Person 261) PriorEndoAve : UpAdj MTrAsym 264) Priors : UpAdj Prob Missing Symptom Probability of Missing Acuity Signal at Hospitals[Rgn,Tested] = ZIDZ ( ln Average Acuity of Untested Poisson Subset[Rgn] ) + 1 Units: dmnl 268) PseudoCFR Units: dmnl Effective Reproduction Rate[Rgn] = ZIDZ ( Infection Rate[Rgn] , Total Weighted Infected Population Reaching Testing Gate Realistic R0 = sum ( if then else ( R Effective Reproduction Rate Recovery of Untested[Rgn] , 0) Units: Person 274) Recovery of Confirmed[Rgn] = Tested Untreated Resolution[Rgn] * ( 1 -Fatality Rate Untreated Post-Detection Phase Resolution Time" ) -Deaths of Symptomatic Untested Reference COVID Hospitalization Fraction Confirmed Units: Person/(Km*Km) RegionalInputs[RFI,Rgn] = Reference Force of Infection Sensitivity Post Mortem Testing to Capacity Baseline Daily Fraction Susceptible Seeking Tests Weight on Reported Probability of Infection Multiplier Recent Infections to Test Min Contact Fraction Confirmation Impact on Contact Impact of Population Density on Hospital Availability Impact of Treatment on Fatality Rate Log ( Dread Factor in Risk Perception Reference COVID Hospitalization Fraction Confirmed Base Fatality Rate for Unit Acuity Net Covid Acuity Relative to Flu Net Sensitivity of Fatality Rate to Acuity Net Total Asymptomatic Fraction Net Sens Obesity Impact Net Sens Chronic Impact Net Sens Liver Impact Net Multiplier Transmission Risk for Asymptomatic Net Relative Risk of Transmission by Hospitalized = 1 Units: dmnl Relative Risk of Transmission by Presymptomatic = 1 Units: dmnl Reported Hazard of Infection[Rgn] = Positive Tests of Infected Response Policy Time On = 1000 Units: Day Response Policy Weight[pdim] = 0 Units: dmnl 307) Rgn : Albania SAVEPER = 1 Units: Day Sens Chronic Impact = 1e-06 Units: dmnl Sens Chronic Impact Net[Rgn] = INITIAL( Sens Chronic Impact * ( 1 -SW Gen Sens Liver Impact = 1e-06 Units: dmnl Sens Liver Impact Net[Rgn] = INITIAL( Sens Liver Impact * ( 1 -SW Gen Sens Obesity Impact = 1e-06 Units: dmnl Sens Obesity Impact Net[Rgn] = INITIAL( Sens Obesity Impact * ( 1 -SW Gen SensCovidUntestedAdmission = 1 Units: dmnl Sensitivity of Contact Reduction to Utility Sensitivity of Contact Reduction to Utility Policy * Response Policy Weight[scuP] + ( 1 -Response Policy Weight[scuP] ) * Sensitivity of Contact Reduction to Utility Sensitivity of Contact Reduction to Utility Policy = 10 Units: dmnl Sensitivity of COVID Test = 0.7 Units: dmnl Sensitivity of Fatality Rate to Acuity Sensitivity of Fatality Rate to Acuity Net[Rgn] = INITIAL( Sensitivity of Fatality Rate to Acuity Sensitivity Post Mortem Testing to Capacity ^ CmltErrPW ) / ( BaseError + DataCmltOverTime Sim Pseudo Case Fatality[Rgn] = ZIDZ ( Cumulative Deaths of Confirmed[Rgn] , Cumulative Confirmed Cases SimCmltOverTime[Rgn,Infection] = Cumulative Confirmed Cases Post Mortem Test Rate[Rgn] + Positive Tests of Infected Post Mortem Test Rate Total Simulated Tests = if then else ( FlowResiduals[Rgn,Series] = :NA:, :NA:, FlowResiduals Units: dmnl 0 Units: dmnl 342) Switch for Government Response[Rgn] = if then else ( Time > Government Response Start Time Symptomatic Fraction in Poisson[Rgn] = INITIAL( 1 -exp ( -Covid Acuity[Rgn] ) ) Units: dmnl Symptomatic Fraction Negative Symptomatic Infected to Testing[Rgn] = Positive Testing of Infected Untreated[Rgn] + Hospital Admission Infectious Testing Capacity Net of Post Mortem Tests[Rgn] = Active Test Rate Positive Candidates Interested in Testing Poisson Subset[Rgn] * Symptomatic Fraction in Poisson[Rgn] + Potential Test Demand from Susceptible Population[Rgn] * Symptomatic Fraction Negative Testing on Living[Rgn] = min ( Testing Capacity Net of Post Mortem Tests Indicated fraction negative demand tested[Rgn] * Potential Test Demand from Susceptible Population[Rgn] , Indicated fraction negative demand tested[Rgn] * Potential Test Demand from Susceptible Population[Rgn] + Indicated fraction positive demand tested[Rgn] * Positive Candidates Interested in Testing Poisson Subset Tests Per Million[Rgn] = Cumulative Tests Data[Rgn] / Initial Population TIME STEP = 0.25 Units: Day Time to Adjust Testing = 30 Units: Day Time to Downgrade Risk Time to Downgrade Risk Policy * Response Policy Weight[dgtP] + ( 1 -Response Policy Weight[dgtP] ) * Time to Downgrade Risk Time to Downgrade Risk Policy = 300 Units: Day Time to Herd Immunity[Rgn] = XIDZ ( Herd Immunity Fraction * Susceptible Time to Upgrade Risk Total Asymptomatic Fraction Total Asymptomatic Fraction Net[Rgn] = INITIAL( Total Asymptomatic Fraction Total Covid Hospitalized[Rgn] = sum ( Hospitalized Infectious Total Disease Duration = Onset to Detection Delay + "Post-Detection Phase Resolution Time" + Incubation Period Units: Day Total Simulated Tests[Rgn] = Post Mortem Tests Total[Rgn] + Testing on Living Total Test on Covid Patients[Rgn] = Max ( 0, min ( Positive Candidates Interested in Testing Poisson Subset Total to Official Cases Simulated[Rgn] = ZIDZ ( Cumulative Cases[Rgn] , SimCmltOverTime[Rgn,Infection] ) Units: dmnl Pre-Symptomatic Infected Transmission Multiplier for Confirmed[Rgn] = INITIAL( Baseline Transmission Multiplier for Untested Symptomatic * Confirmation Impact on Contact Transmission Multiplier for Hospitalized[Rgn,TstSts] = INITIAL( Baseline Transmission Multiplier for Untested Symptomatic * Relative Risk of Transmission by Hospitalized * if then else ( TstSts = 1, Confirmation Impact on Contact Transmission Multiplier Pre Detection[Rgn] = INITIAL( Baseline Transmission Multiplier for Untested Symptomatic * ( 1 -Total Asymptomatic Fraction Net[Rgn] ) + Total Asymptomatic Fraction Net[Rgn] * Baseline Risk of Transmission by Asymptomatic Baseline Transmission Multiplier for Untested Symptomatic * Relative Risk of Transmission by Presymptomatic ) * ( 1 -Total Asymptomatic Fraction Net[Rgn] ) + Total Asymptomatic Fraction Net[Rgn] * Baseline Risk of Transmission by Asymptomatic[Rgn] * Relative Risk of Transmission by Presymptomatic ) Units: dmnl Active Test Rate[Rgn] Units: Person/Day 378) TstSts : Tested Untested PMAS Gap with Tested[Rgn] = ( 1 -Allocated Fration NonCOVID Hospitalized[Rgn] ) ^ SensCovidUntestedAdmission Units: dmnl Untested symptomatic Infected to Hospital[Rgn] = Hospital Admission Infectious Utility from Limited Activities[Rgn] = exp ( Sensitivity of Contact Reduction to Utility Net[Rgn] * Fractional Value of Limited Activities ) Units: dmnl Utility from Normal Activities[Rgn] = exp ( Sensitivity of Contact Reduction to Utility Net[Rgn] * ( 1 -Perceived Risk of Life Loss Voluntary Reduction in Contacts[Rgn] = F[Rgn] / F0[Rgn] * ( 1 -Min Contact Fraction[Rgn] ) + Min Contact Fraction ZIDZ ( sum ( Average Acuity Hospitalized[Rgn,TstSts!] * Hospitalized Infectious[Rgn,TstSts!] ) , sum ( Hospitalized Infectious Weather Effect on Transmission Weight Max in Test Goal = 0 Units: dmnl Infected Unconfirmed Post-Detection"[Rgn] + Infectious Confirmed Not Hospitalized[Rgn] + sum ( Hospitalized Infectious[Rgn,TstSts!] ) * "Post-Detection Phase Resolution Time" / Hospitalized Resolution Time Units Spread of SARS-CoV-2 in the Icelandic population Clinical characteristics and imaging manifestations of the 2019 novel coronavirus disease (COVID-19): A multi-center study in Wenzhou city Detection of SARS-CoV-2 in different types of clinical specimens Sensitivity of chest CT for COVID-19: comparison to RT-PCR Stability issues of RT-PCR testing of SARS-CoV-2 for hospitalized patients clinically diagnosed with COVID-19 Weather Conditions and COVID-19 Transmission: Estimates and Projections. medRxiv Clinical Characteristics of Coronavirus Disease 2019 in China Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in China: summary of a report of 72 314 cases from the Chinese Center for Disease Control and Prevention COVID-19 exacerbating inequalities in the US Estimating the infection and case fatality ratio for coronavirus disease (COVID-19) using age-adjusted data from the outbreak on the Diamond Princess cruise ship Estimating The Infection Fatality Rate Among Symptomatic COVID-19 Cases In The United States. Health Aff (Millwood) Suppression of COVID-19 outbreak in the municipality of Vo Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship Universal Screening for SARS-CoV-2 in Women Admitted for Delivery Incubation Period and Other Epidemiological Characteristics of 2019 Novel Coronavirus Infections with Right Truncation: A Statistical Analysis of Publicly Available Case Data Modeling the rework cycle: capturing multiple defects per task Data analysis using regression and multilevel/hierarchical models Accelerating Markov chain Monte Carlo simulation by differential evolution with self-adaptive randomized subspace sampling An interactive web-based dashboard to track COVID-19 in real time Simulation-based Estimation of the Early Spread of COVID-19 in Iran: Actual versus Confirmed Cases Estimates of the severity of coronavirus disease 2019: a model-based analysis. The Lancet infectious diseases Serial interval of COVID-19 among publicly reported confirmed cases Serial interval of novel coronavirus (COVID-19) infections An interactive web-based dashboard to track COVID-19 in real time