key: cord-351941-fgtatt40 authors: Ghaffarzadegan, Navid; Rahmandad, Hazhir title: Simulation‐based estimation of the early spread of COVID‐19 in Iran: actual versus confirmed cases date: 2020-07-06 journal: Syst Dyn Rev DOI: 10.1002/sdr.1655 sha: doc_id: 351941 cord_uid: fgtatt40 Understanding the state of the COVID‐19 pandemic relies on infection and mortality data. Yet official data may underestimate the actual cases due to limited symptoms and testing capacity. We offer a simulation‐based approach which combines various sources of data to estimate the magnitude of outbreak. Early in the epidemic we applied the method to Iran's case, an epicenter of the pandemic in winter 2020. Estimates using data up to March 20th, 2020, point to 916,000 (90% UI: 508 K, 1.5 M) cumulative cases and 15,485 (90% UI: 8.4 K, 25.8 K) total deaths, numbers an order of magnitude higher than official statistics. Our projections suggest that absent strong sustaining of contact reductions the epidemic may resurface. We also use data and studies from the succeeding months to reflect on the quality of original estimates. Our proposed approach can be used for similar cases elsewhere to provide a more accurate, early, estimate of outbreak state. © 2020 System Dynamics Society The 2019 novel Coronavirus (SARS-Cov-2), the pathogen that causes COVID-19 infection, is exposing the world to one of its largest global health challenges of recent decades. From public understanding to policy choices, much depends on the data about the epidemics spread and models that integrate such data into actionable policies (Kaplan et al., 2002; Thompson et al., 2008) . Yet the official data are highly uncertain, with large variations in quality depending on the country reporting it, and regularly offering lower bounds that have unknown error bounds compared to the reality on the ground. Other sources of data, often based on smaller samples, travel screening, and anecdotal evidence may offer relevant hints, but they are not easily generalized or combined with official data. The current paper focuses on using a standard dynamic epidemiological model as a tool for incorporating various sources of data into a unified estimation of the actual trajectory of disease, applying the method to COVID-19 outbreak in Iran. Our focus is on how well one can realistically estimate the magnitude of the epidemic in early stages when disease parameters are uncertain, data are limited, and projections are highly volatile. Current information (as of May 15, 2020) point to the spread of COVID-19 starting from a food market in Wuhan China in mid-November 2019. The epidemic initially spread mostly in China but has become a global pandemic by May with over 4 million official cases and 280 thousand deaths (Worldmeters, 2020) . The situation may also be worse than official statistics portray. For example, despite a massive screening and response campaign in China, Li and colleagues estimate that 86% of early cases in China were undocumented . The situation in Iran, another early epicenter, is instructive. The first official cases were reported on February 19th, 2020, in the city of Qom (Wikipedia, 2020 ). Yet, later reports suggested the disease was likely circulating in Iran as early as January 2020 (Iran International, 2020) . Thus from February 19th on the spread of disease was very rapid according to official statistics. By March 20th some 19,644 cases and 1,433 deaths were recorded. The question in the minds of policy makers and public was: how far will this epidemic go and what policies should be put in place to control and mitigate the risks? The answers heavily depend on understanding the true magnitude of the epidemic. Any under-estimation is worrisome, but early in the exponential growth phase of an epidemic such errors could be extremely costly as those estimates drive the mobilization of public health resources and government responses. The risk of under-estimation is partly driven by the characteristics of the disease such as a potentially large population of unrecognized patients with mild symptoms (at least 80% of the cases have symptoms not very different from common cold or flu) (Novel Coronavirus Pneumonia Emergency Response Epidemiology, 2020). As a result no consensus also exists on other parameters of the disease, such as fatality rate (Fauci et al., 2020; Wu and McGoogan, 2020) . Moreover, the country-specific variations in measurement and reporting may exceed those due to the nature of the disease. To illustrate, in the next section we provide a quick survey of various early clues related to the magnitude of epidemic in Iran, which we then build on in our analysis. There have been a few clues from Iran which may inform efforts to estimate the true cases. First is the number of cases identified among travelers arriving in other countries from Iran. Screening of passengers from high-risk countries in airports is more reliable than most country-level screening statistics. One study in pre-print estimated 18,300 total cases of infected individuals in Iran by February 25 (Tuite et al., 2020) . The method relied on estimation of cases in the whole country based on three diagnosed cases of infection upon individuals' arrival from Iran in various international airports and the likelihood of such an incident given approximately 7500 daily outbound passengers from Iran during those early days (Fraser et al., 2009 ). This number was two orders of magnitude larger than official statistics at the time. Later reports on the number of infected travelers from Iran rose rapidly, to 97 cases by February 28 (RadioFarda, 2020 ). An article in The Atlantic, offered a series of back-of-the-envelope calculations which, with strong simplifying assumptions, estimated 2 million accumulated cases of infected individuals by March 9th (Wood, 2020) . One may expect the death statistics to be more reliable. But test kits for identifying have been in short supply, and post-mortem testing may not have been a priority of officials in Iran. On February 24th a member of Iran's parliament reported 50 deaths only in Qom, a city of 1.2 million (Wikipedia, 2020) . Several reports of government officials contracting the disease have also been released, including reports on infection of 20 member of the Iranian parliament (a body of 270 members), as well as several deaths among officials (BBCPersian, 2020b) . A BBC Persian report on February 28th used interviews with an unspecified number of hospitals in Iran to put the death from the disease at 210, an order of magnitude larger than official numbers at the time (BBCPersian, 2020a). Another news agency quoted similar sources for a total of 416 deaths by March 1st (Iran International, 2020) and 5000 by March 18th. We develop a dynamic simulation model of the spread of the disease in Iran to estimate the likely trajectory of the disease that is consistent with the evidence summarized above. We start with the traditional SEIR (for Susceptible, Exposed, Infectious, and Recovered stocks representing population groups) model and incorporate feedbacks regulating endogenous changes in contact rate, screening, diagnosis, and reporting in response to risk perception and other relevant factors. Thus not only reported statistics, but also the effective reproduction number (R e ), are endogenously generated and can change as people respond to the epidemic. We use this model, along with various strands of data, to weave together an estimate of disease trajectory early on, Simulation-based estimation of the early spread of COVID-19 in Iran 103 and offer projections for medium term future. Keeping in mind the various uncertainties involved in this analysis, reflected in our wide uncertainty intervals (UI), we estimate over 916,000 (90% UI: 508 K, 1.5 M) cumulative cases of the disease in Iran as of March 20th, with over 15,485 (90% UI: 8.4 K, 25.8 K) deaths. We thus estimate that only 2.1% of cases and 9.3% of deaths are officially attributed to COVID-19, with the rest going undetected. These results point to extreme gaps between official data and actual trajectory of disease, which may lead to slow response and under-appreciation of risks of the diseases. The rest of the paper is largely focused on documenting the original estimates developed on March 20th. As a secondary goal we also consider their accuracy in light of the data and findings in the two months following original estimates. This latter exercise provides a window into the actual challenges and promises of conducting policy-relevant projections early in a crisis. Figure 1 offers a simple representation of the model's structure. Model equations, and parameter values are documented in Appendix A. The model builds on the well-known SEIR (Susceptible, Exposed, Infectious, and Recovered) framework. The model includes the 'Infection' reinforcing loop (R1) which creates the initial exponential growth in the number of cases. We divide the population of infected to early infected and late infected, and assume that early infected, while largely asymptomatic, might also be infective though at a lower rate (Bai et al., 2020) . The infected population will follow two different paths of recovery or death. A fraction of late infected are symptomatic (in the Figure, the variable "symptomatic late infected"). To avoid proliferation of free parameters we use general population averages rather than disaggregating into population groups. To this basic epidemiological model we add two endogenous mechanisms. First, we formulate contact rate to be endogenously changing in response to perceived risk of infection (a function of death statistics). This feedback captures changes in social interactions, gatherings, self-isolation of suspected cases, and hygiene as well as government-mandated closure of events, schools, and businesses in response to perceived risk. We assume public risk perception depends on the number of recent reported cases of death. This balancing feedback (loop B) can bring down contact rate, potentially enough to slow down the epidemic. Second, we explicitly model the endogenous changes in screening and reporting of cases over time. Here, the increased understanding of risks leads to mobilization of screening resources that expand the fraction of cases that are tested and thus will show up in the official statistics. We note that even with good screening many mild cases will go undiagnosed and will not be in official statistics. This analysis is focused on understanding the spread of the disease in its early phases, about one month after first diagnosed cases, and the core estimations in this paper are based on data only from that early period, until March 20th, 2020. We use time series data for official reports of death, recovered, and cumulative number of infection over time. We also use unofficial data points including four observations about the number of Iranian passengers diagnosed with COVID-19 upon arrival in international airports, and three estimates aggregated by healthcare providers in Iran and reported by BBC and Iran International news agencies about total cases of death from COVID-19. The model includes three biological parameters (early-stage exposed period, average duration of illness, and fractional infectivity of early-stage exposed group) which we specify based on prior literature Novel Coronavirus Pneumonia Emergency Response Epidemiology, 2020; Wu and McGoogan, 2020) . Population size and travel scope are also input using existing data. Nine uncertain parameters remain that are estimated using the above data. Three of the parameters are used to specify how official measurement and reporting relates to "true" values of infection and death, three parameters estimate public reaction to the reports, and two parameters are for mortality rate among two different groups of patients. Finally, the arrival time of first cases of the virus is estimated as a separate parameter. Simulation-based estimation of the early spread of COVID-19 in Iran 105 Our calibration method is based on forming a likelihood function for observing the actual time series data conditional on model parameters. We then conduct a Markov Chain Monte Carlo (MCMC) simulation to estimate the joint posterior distribution of the model parameters subject to observed data. We define a likelihood function for change over time (net-inflow) of official reports on cumulative death, recovered and infection assuming they are count events drawn from model-predicted rates (Poisson distribution). We use a similar Poisson distribution assumption for number of infected passengers and unofficial reports of death as well, since they both fit well into a count measure framework. The MCMC method searches over the feasible ranges for the uncertain parameters and accepts combinations that are consistent with the observed data. Similar methods are used frequently in estimating dynamic models in this domain . Our prior experience with the use of MCMC methods in nonlinear dynamic models highlights the risk that uncertainty intervals do not capture model mis-specifications and thus may be too tight. We therefore downscale the likelihood function to expand the uncertainty intervals (details and sensitivity in the online supplement) to err on the side of caution in assessing uncertainties and structural nuances not explicitly modeled. More details are reported in Appendix A. We follow the procedure described in the modeling section and calibrate our model using multiple data sources and find uncertainty ranges using MCMC. Figure 2a compares best-fitting simulation results for official confirmed death, recovered, and cumulative cases of infection against data. Root mean squared percentage error values of reported recovered, death, and cumulative infected are 16%, 26%, and 5%, and root mean squared error values are 877, 116, and 794. Simulation runs almost perfectly correlate with data. Table A1 in the Appendix A reports estimated parameters along with those adopted from the literature. Figure 2b shows how simulation runs replicate unofficial statistics including the number of infected travelers diagnosed before air travel patterns changed substantially, and medical community's counts of cumulative death reported by media sources. We use the calibrated simulation model to estimate cumulative number of infected (panel c in Figure 2 ), cumulative number of death (panel d), and current number of infected (panel e) until March 20, 2020, starting from December 31st, 2019 (day 0 in simulation). We find that total infected cases may have been closer to 916,000, than the reported 19,644 on March 20th. Cumulative deaths, over 15,000, may also be an order of magnitude higher than official statistics and might have almost tripled in the last 10 days of the analysis. There is also much uncertainty in these numbers: our baseline estimate for cumulative infected may be as low as 508,000 and as high as 1.5 million (90% uncertainty interval). Similarly cumulative deaths is between 8400 and 25,800 (90% confidence interval). The wide uncertainty intervals are partly due to the nature of the data: absent data on testing coverage in Iran it is very hard to fully know the underlying diffusion patterns based only on the formal statistics. The travel and informal estimates of death partially address these limitations, but much uncertainty remains. Our conservative assumption on downscaling the likelihood function may also have contributed to this wide interval. Nevertheless, even our lower bound are 26 times the official statistics for total infections and 6 times larger than the death statistics. There are some glimmers of hope in our estimates as well. The drop in contact rate in response to perceived risk, which we have captured in our model, might have reduced R e significantly, to just below 1 as of late March (panel f). That reduction (which was confirmed by later data), if sustained, would help bring down the number of new cases. Thus, Figure 2e shows the number of currently infected cases may well be reaching its peak at the end of the simulation. The epidemic seems to have raged with a reproduction number close to 2.72 (90% UI: 2.57-2.92 UI) for more than six weeks before the behavioral and policy interventions have slowed down the diffusion rate starting in late February. Some of this drop may also be attributed to weather, but current research is mixed on the impact of weather Wang et al., 2020; Xu et al. , 2020) so we did not include it in our estimation (but would explore it in the forward projection in the next sections). Those factors, we estimate, brought down the aggregate contact rate to 26% (90% UI: 22%-31%) of pre-epidemic levels by the date of analysis. This number may just succeed in bringing down the reproduction number below 1, which is the necessary condition for containment of the epidemic. Whereas the case fatality rate in Iran, based on official statistics until March 20th, is 18.4%, our estimated infection fatality rate is 3.7 (0.4)% (standard errors for estimates are in parentheses). This estimate is consistent with, and somewhat higher than, the 2.3% reported by Wu and McGoogan (2020) based on tracing confirmed cases in China. If correct, the toll of epidemic in Iran would have been larger than that in most other countries by the estimation date, and limits to healthcare services may explain the increased death rate. This finding is reinforced by the limited coverage of testing in Iran that is indicated in our estimation. Many cases go un-confirmed, and at its maximum, formal testing is covering only 2.5 (0.9)% of infections. These estimates are consistent with qualitative media reports from Iran on very limited availability of testing, the doctors' need for getting authorization before conducting tests, and the multi-day delays in receiving test results that render them ineffective in the clinical decision making processes. We estimate that hospital sources of news organizations who have offered alternative versions of actual death statistics in Iran have had a much wider coverage of true death statistics (42 (15)% of estimated true values). Their count offers a lower bound for true deaths, is also close to our lower confidence bound, and suggests that the true number of cumulative cases, even with an infection fatality rate as high as 3.7%, would not be lower than 380,000. Nevertheless, we estimate that even those medical community reports offer coverage significantly below 100%, consistent with the lessthan-perfect coverage of hospital reports by media sources, and anecdotal evidence that hospital system has been overwhelmed in many hot spots of the disease and many patients have died at home and been buried with no testing or proper association of cause of death to COVID-19. The model used prior estimates from the literature for three important parameters: the total duration of illness (d = 14 days), the exposure (early infection) period (τ = 4 days), and the fractional infectivity of the exposure period (θ = 0.25). Noting significant uncertainty in such estimates early in an epidemic, we assess our projections' sensitivity to these structural assumptions by re-estimating the model for different values of each parameter. In this analysis we find a new set of best-fitting parameters and estimates for true cases given each different assumption on disease parameters. In Table 1 results are summarized as percentage changes in key estimates (cumulative infections and cumulative death until March 20th, 2020) compared to baseline best-fitting projections. Results suggest these structural parameters are important in identifying the true magnitude of the epidemic and may propagate additional uncertainty in our projections due to limited early data on their exact values. Specifically, a smaller fractional infectivity during exposure period (θ), a longer exposure period (τ), or a shorter duration of illness all significantly increase the projections for total cases and deaths. Reductions in the estimates are less significant when parameters move in the opposite direction, though a very high level of θ would bring down the estimated cumulative cases notably. In lower values of θ, model calibration suggests earlier arrival of first cases of infection, which, in a few days, can make a difference in the number of late-infected, and increase exposure rate. Current evidence for transmission during early infection (exposure) period is rather limited, and thus our baseline projections for total cases may be somewhat conservative. Nevertheless, given our conservative treatment of the uncertainty intervals all the best-fitting projections with different disease parameters remain within the 90% bounds of our baseline projections. We run the model until July 1st of 2020 under six scenarios (three alternative assumptions about the impact of seasonality times two policy measures). Specifically, our scenarios about seasonality include no effect (status quo), moderate effect (infectivity of the virus decreases linearly from May 1st and halves by June 1st, then stays the same for the rest of the simulation), and very strong mitigating effect (infectivity of the virus decreases from May 1st to a quarter of its base value by June 1st, then stays the same for the rest of the simulation). For policy, our focus is on contact rate, and we include two conditions of status quo (i.e. only behavioral responses) and aggressive efforts to decrease contact rate by half what it would be otherwise. These six conditions provide intuitions for a potentially wide range of cases. Both intervention and weather impacts are assumed to include rather strong options to inform the range of possibilities, and should not be interpreted as predictions. Moreover, we also note that the wide cone of uncertainty in the baseline simulations will continue and expand in future projections. Keeping the uncertainty considerations in mind (but not graphed due to clutter), Figure 3 shows results based on best-fit parameters. If our best case scenario on the reduction of contacts and benefits of seasonality for containment materialize, the number of infected cases would be expected to peak soon after the analysis date at approximately 494,000 (90% UI: 274 K-813 K), and will go down later on. This optimistic scenario will still lead to over a million infections and some 58,000 deaths (90% UI: 32 K-97 K) by the end of June. The less aggressive scenarios point to continued spread of the epidemic at large rates. A reduction in contagion is realized only when reproduction number remains below one as a result of reduced physical contact, government interventions, and seasonality. Among these three, the first (behavioral response) is endogenous to recent death rate. When death rate comes under control (due to the combination of all three factors), our model assumes the contract rate rebounds, weakening the first channel and increasing the transmission. Economic pressures, normalization of death, and behavioral modeling after those not practicing distancing (or those already recovered and thus presumably immune) could all weaken the behavioral response when the perceived risks fade. The increased transmission would then, with a delay, bring up the death rate close to levels sustaining a reproduction number around one. This creates a strong attractor in the dynamics where in steady state contact rate is high enough to sustain the contagion but not so high to lead to the rapid infection of all the population. It may also lead to oscillations in peak transmission. Therefore it is only with strong weather and/or government interventions that reduced reproduction number brings the epidemic under control. This dynamic offers a cautionary tale against declaring victory early in the fight against the epidemic. However the actual magnitude of such rebound effect is not known in this setting and our data offers no guidance on the relevant parameters. Therefore these results are only qualitatively suggestive but not quantitatively reliable. This research was conducted under extreme urgency to inform decisions amidst crisis. The primary objective of the analysis was to understand the magnitude of the outbreak during early stages of infection when uncertainties are large and major decisions at stake. The main analysis of this paper was completed on March 20th 2020 and first reported publicly on March 27th on medRxiv pre-print server. Thus we intentionally leave those projections untouched to avoid post-hoc justification and corrections that would cover the true uncertainties involved in real world modeling projects targeting fast evolving situations. Furthermore, given that March 20th marks the end of the Winter and beginning of Persian new year, emerging annual reports can be used to assess some of our findings. In this section we summarize a few such tests and observations made based on findings post initial analysis. First, our estimates can be partially tested against official all-cause mortality data which are reported seasonally by the Iranian government. The data for winter 2020 were released on May 7th, during the final preparations of this manuscript. Based on a regression analysis on the all-cause mortality data we find that COVID-19 related deaths are likely 7.1 (95% CI: 3.8, 10.4) times more than official death counts for the disease (see Appendix C). Our simulation-based estimates, suggesting 10.8 (90% CI: 5.9,18.0) times more deaths than official data, are consistent with this new information. A second study informed the true magnitude of epidemic in one of the provinces of Iran using anti-body testing. Specifically, testing for COVID-19 antibodies informs whether a person has been previously infected by the disease, offering estimates for cumulative infection when randomly conducted on a population. We have found one such study conducted on a random population of Iran's Guilan province (population of 2.5 million) (Shakiba et al., 2020) . These researchers estimate that 21-33% of Guilan residents were infected by late April. We did not simulate our model at the province level, but our March 20th estimate of 47 times undercounting, combined with 1090 confirmed cases in Guilan (0.04% of the residents), gives an estimate of 2% prevalence at the end of winter. Our model estimates about 50%-100% growth in cases over the following month (see Figure 3b ), so our estimate of cumulative infections in Guilan, at 3%-4% for late April, are still five to ten times lower than this new study. These findings would suggest that a rather high fraction of cases were asymptomatic and/or not included in any of the data sources for infection that we used (official data and infected traveler counts). Incidentally, if we assume that our estimates for infection were 7 times below true numbers, the infection fatality rate for COVID-19 in Iran goes down to close to 0.5%, a number much closer to some recent estimates for this key parameter (Basu, 2020) . Third, we can test model's capability in projecting confirmed cases. Although prediction of official cases was not the purpose of the model, we present this out-of-sample test as an indirect way of assessing the model's predictive power. The model was calibrated with 30 days of data, and since the pre-print, 55 days of more data on confirmed cases and deaths have become available, which we use to compare the model's predictive performance for official, reported cases. These results are provided in Appendix B, and show that while qualitative trends are consistent, the model misses a major increase in the number of cases right after the first 30 days. Confirmed death projections are closer to data. Two mechanisms may explain this gap. First, our simple formulation for testing only tracks the fraction of cases tested, saturating that fraction at a maximum. That formulation is fine in capturing growth in test capacity during the rise of epidemic, but would underestimate cases when the growth slows down and milder cases can be found with the already enhanced test capacity. Indeed since estimation we have found that initial test capacity in Iran grew to about 6000 tests per day, stayed at that level for the later part of our estimates, and then jumped to 10,000 tests per day; similar increasing patterns of tests are seen across nations globally. Thus, the testing formulation likely introduces under-counting in simulating official cases later in the epidemic, but this does not mean the model is under-predicting the actual infections and deaths. A second factor may explain why our model predicted an earlier peak, by about two weeks. Iran's new year holiday season, Nowruz, which starts on March 20th, is a major time of travel, family visits, and social interaction. Those activities may well have increased contact rates in later March, but we decided against post hoc introduction of a Nowruz parameter in the model. Finally, we note that our model predicted a resurgence of the epidemic in early May due to reduced risk perception, which is consistent with a rise in cases in the real world in May. It is harder to fully assess the impact of these discrepancies on future estimation of true cases, had we decided to extend those estimates beyond the original timeline. Nevertheless, it seems our model is somewhat conservative in predicting the 'reported' infections and deaths after the estimation period, and a more explicit account of physical test capacity would add value in longer term projections. Finally, upon the release of the first draft of the paper we have received reports from a few hospitals and officials in Iran that confirm: a) Unofficial death counts by hospital sources that continue to be about five times the official death statistics, and in some states of Iran are close to our estimates; and b) The low sensitivity (false negative risk) of tests, at about 50%, when conducted by hospital staff. Moreover, availability of tests even in major hospitals is limited, covering about 50% of those clinically diagnosed using other methods (often CT Scans). It is important to note that some evidence from random testing in other countries (including Iceland and Germany) point to a very high asymptomatic fraction of patients, which would not be captured in any of our data sources and is consistent with our comparisons with the Guilan study. Furthermore, new evidence suggests we might have used under-estimates for the delay between exposure to onset of symptoms (about 5 days), from onset to testing (another 5 days), and from testing to resolution (about 10 days) (Lauer et al., 2020; Linton et al., 2020) . Adjusting for these parametric assumptions would increase our estimated case count as discussed in the sensitivity analysis and may bring our overall estimates closer to the new evidence on the cumulative infections. In this paper we provide a more sobering picture of the COVID-19 outbreak in Iran using a dynamic model that goes beyond official statistics. Integrating data from various sources we suggest that the official statistics are at least an order of magnitude below actual spread of the epidemic, and even in optimistic scenarios the burden of disease will be large and lasting for many months. Implementing and sustaining strong policies that target physical distancing offers the main hope for containing the epidemic until a vaccine becomes available. The gap between official statistics and our estimates may be due to various complications in measurement as well as policy choices. The availability of testing infrastructure has been a major bottleneck for detecting cases in Iran. Citizens with suspicious symptoms have had no easy way for getting tested; most tests are conducted on hospitalized patients, require specific authorization, and are processed in a few centralized labs adding significant delays to the process. Not only mild cases are missed, but also many critical patients have been unable to access the care they need due to hospital congestion. The fact that majority of cases have mild symptoms similar to flu (Wu and McGoogan, 2020) adds to the risk of under-counting even if the testing capacity was ample. In fact many mild or moderately sick patients may have preferred to stay home than risk being infected upon visiting a congested hospital, further reducing demand for testing. Finally, false negative test results seem to be common among COVID-19 patients in Iran, with healthcare providers indicating sensitivity levels as low as 50% due to challenges in sample extraction and varying concentration of viral load over the course of disease. The problem of undercounting cases of infection and death is not limited to Iran. Due to the relatively long period of incubation, a considerable proportion of asymptomatic cases, significant fraction of false negatives in existing tests, and especially the limits on test capacity, the undercounting problem is common, but its magnitude would vary across different countries. Even with ample test capacity, the undercounting of infection is likely large due to both the asymptomatic patients (estimated to be more than 40% (Mizumoto et al., 2020; Nishiura et al., 2020) ) and those cases with false negative results (about 30% of positives in case of PCR based COVID-19 tests (Fang et al., 2020) ). For example a pre-print paper estimates that in the US less than 10% of cases of infection were diagnosed until March 13 (Pei and Shaman, 2020). Others have made similar claims about official data of China . The undercounting of deaths may be less in magnitude, but still very common. Under significant strain, limited testing capacity may not be allocated to post-mortem testing of suspected cases, and thus country-specific routines for categorizing death in those cases may lead to large variations. We suspect this is a major reason why Iran's official death counts may be significantly below the actual numbers. But other countries may have similar issues. For example a recent article estimates much higher number of deaths in Italy than official numbers (Stancati and Sylvers, 2020) . Moreover, recent investigative reporting by Financial Times, New York Times, Wall Street Journal, and Economist has used all-cause death data of different countries and cities and estimated that globally, as much as 50% of COVID-19 death cases are missed (Burn-Murdoch et al., 2020) . For example Italy's excess death in March 2020 is 24,600, out of which only 13,710 is attributed to COVID. Jakarta (Indonesia) had 3300 excess deaths in March and April, with only 381 attributed to COVID. Absent extended testing capacity, expanded post-mortem testing, and random testing in the population, policymakers may face a significant undercount of existing cases and thus miscalculate their response. We hope that our proposed approach can be used for similar cases elsewhere to de-bias such estimates. Methodologically, we faced important challenges to estimation and projection of s-shaped behavior based on early data that are worth highlighting here. First, estimating s-shaped curves before observing their inflection point can be unreliable. Minor changes in the slope of the cumulative curve can significantly change the estimated future trajectory when simple curve-fitting is pursued. Overcoming this risk requires a more mechanistic model that is fitted to the flow (infections in this case) rather than cumulative stock, so that the slope is constrained by physical mechanisms determining the flow variable (e.g. the size of the susceptible population). The calibration should also properly weigh different data points so that the larger numbers during peak period do not overwhelm the signal from other periods. In our case we achieved this using Poisson log-likelihood error. Other promising alternatives may include Negative Binomial log-likelihood, among others. A second challenge is teasing out the various potential reasons for reduction in new infections. For example those may include a reduction in susceptible population, a change in contact rate due to social distancing, or exogenous changes such as weather. While mechanistically proper model specification can partially address this challenge, one needs to be cognizant of the uncertainty in estimated parameters, e.g. by using MCMC methods to identify the various combinations of those effects that are broadly consistent with the data. The problem is rapidly changing, and timely estimation is critical. Therefore we have made several simplifying assumptions that readers should be aware of in interpreting our results. First, consistent with prior findings (Rahmandad and Sterman, 2008) we have focused on capturing behavioral feedbacks in contacts and testing and have adopted a formal estimation process, but have done so at the cost of abstracting away from much detail complexity and heterogeneity in populations and risks. We converged on the current model structure in an iterative process, attempting to keep the model simple and feasible to estimate, but not missing components that would qualitatively change the results. For example we tested the inclusion of healthcare capacity, test accuracy, and physical test capacity (rather than as a percentage of cases) and decided against including them explicitly because they did not change the conclusions qualitatively and our data did not offer reliable ways to estimate those effects beyond what is informed by our current structures. Generally, a more detailed model may be appealing, but if the purpose is to offer quantitative estimates and projections, constraining additional model parameters during calibration would be a major constraint that should be weighed against the value of additional structure. Second, during the early periods of infection, there is considerable uncertainty in most data sources. For example, infected passenger data, while useful, is a not a perfect data source for modeling. It can be argued that exposed to more travel, this population may be exposed earlier than a random sample. On the other hand, this is typically a higher income and more educated population that may be more aware of, and responsive to, infection risks. Similar issues can be raised for informal death reports in the media, which are based on potentially limited samples of hospitals and may include subjective assessments of medical cases. Such uncertainties are part of any emerging, fastmoving situation, and one should be cognizant of these factors in interpreting modeling results. Third the model is scoped only around Iran --we ignore potential effects of global spread of the disease on Iran, including the risk of reintroduction of cases in future. Forth, we assumed that the recovered population are immune during our simulation horizon. Fifth, we ignore mutations in the virus which may change its contagiousness and case fatality rate. Despite these limits we hope the paper offers an accurate assessment of the risks and scope of the epidemic for Iran and beyond. We also hope other researchers build on this work using the publicly available data, models, and replication instructions provided in the online supplement. Finally, we hope that the study contributes to the system dynamics modeling literature. There are a wide range of epidemic models with different levels of dynamic and detailed complexity (Darabi and Hosseinichimeh, 2020) and a large number of models have been built in response to COVID-19. While epidemic models with interesting and relevant feedback mechanisms could be built rather quickly, their results depend heavily on the parameters regulating the strength of contagion, behavioral response, testing, and other mechanisms. For example reputable research groups have offered models with forecasted cumulative deaths that vary significantly (Ferguson et al., 2020; Murray and IHME, 2020) . Without quantitative estimation methods, different models may point to starkly different conclusions, with little guidance on how to select the appropriate analysis. Our contribution is to offer a model-based estimation approach during an on-going pandemic where data are uncertain and limited. This requires combining all sources of available data with different levels of uncertainty, each one, potentially an imperfect indicator of the reality, yet collectively providing a better picture. Our data-driven approach is consistent with recent calls for detailed data-informed modeling approaches (Sterman, 2018) , and combining different methods and techniques to enrich dynamic modeling (Duintjer Tebbens and Thompson, 2018; Ghaffarzadegan and Larson, 2018) . We hope the methods used in this paper can be extended to a variety of other system dynamics applications where actionable insights rely on rigorous combination of feedback rich models and quantitative data. data sources to study dynamic problems in organizational and health domains. As the number of infected cases increase, the probability of infection increases given the increased number of contacts between infected and susceptible populations. Some of these contacts lead to new cases of infection (loops R1 and R2). Infections may be due to a fraction of early infected cases who can infect others (Bai et al., 2020) , and the late infected population. Exposure rate (i E ) is formulated as following: We set e at 1 (no seasonal effect) for historical calibration, and later for scenario analysis, check different functions that give lower values during summer. The equation has two degrees of freedom, and since the actual value of c and i are not independently knowable, we assume i = 0.02 and vary c to fit the model output against data (details below). Death and recovery rates are estimated using fractional death rate. We have, where the disease duration is assumed to be total of τ E (early-stage asymp- Contact rate can change in response to the progression of the disease and consequently change infection rate (see Figure A1 ). The impact of perceived risk on contact rate captures not only the endogenous changes in social interactions, gatherings, self-isolation of suspected cases, and hygiene, but also government mandated closure of events, schools, and businesses to name a few. We assume public risk perception depends on reported cases of death. Finding infected cases is challenging given that many cases remain mild and unknown, yet potentially infectious. Furthermore, especially early on, many cases were misdiagnosed as healthcare providers were unaware of the existence of the virus in the country. As the first cases are revealed and reported, a higher fraction of cases are found (Loop R3). There are three death-rate-related variables: actual death rate ( c is inversely related to O D . We bound c between 0 and c max , and use the following sigmoid function to represent c: ρ represents the impact of other policy measures beyond the perception of death that may impact c (e.g. various government interventions) and is set to 1 for base run simulations but experimented with for policy analysis. O Ã D and s represent the relation between O D and c. Conceptually O Ã D is a threshold for public report of death, at which contact rate endogenously declines to half of c max and s is the sensitivity of public behavior to the reported death rate. The free parameters, O Ã D ,s, and c max , are estimated. We assume only symptomatic cases of patients can be diagnosed. Let f symp < 1 be the fraction of late-infected individuals that develop symptoms. Let α be the fraction of symptomatic who are tested, then αf symp is the fraction of I L who will be tested. More severe cases are more likely to be diagnosed, so mortality rate of diagnosed cases(f 0 ), is likely to be higher than average fatality rate (f). Death rate of diagnosed individuals is f 0 (αf symp I L )/τ L , with the constraint αf symp f 0 ≤ f. We assume no post-mortem diagnosis. Including first cases of death discovery diagnosed early on (X): where D is the cumulative reported cases of death. X is equal to 2 on t 1 =February 19th and for the rest of the simulation it is zero. We set α = Min aD,1 À Á Á α max , and let the model estimate α max , and a through calibration (explained below). For reported cases of recovered (R), we use the following equations: is reported cases of recovery per day. This formulation assumes the reporting of recovered cases happens with the same rate (α) as the reporting of infections and death. While Iran's data are supportive of this assumption, exploration of statistics from other countries suggests many do not track the recovered cases. For total infected (I), cumulative cases (I cum ) and reported cumulative cases of infection (I cum ) we have: We set f symp = 0.8, but the model is not sensitive to the value of f symp , as any change in f symp will be compensated by α to match the simulation with Simulation-based estimation of the early spread of COVID-19 in Iran 123 the data, and calculate total fraction of infected who are diagnosed. The reproduction number is estimated asR = ( τ E + τ L )(i E /I). Besides the official data, we use a few unofficial data points: four observations about number of Iranian passengers diagnosed with COVID-19 upon arrival in international airports (P ), and three unofficial estimations from BBC and Iran International news sources about cumulative death from COVID-19 (D u ) early in the epidemic. In our model, we estimate them as following: where D u,0 = 0 , and μ is the ratio of number of passengers going from Iran abroad in early days of the outbreak. The term (t + 1 − t 1 ) represents days since the first cases of infection were diagnosed. It is assumed that the daily number of passengers is fixed and death is minimum, thus this estimation only works during the first few days of the outbreak. For D u , we assume γ represents the fraction of actual deaths identified by these sources. Note that the unofficial reports for death cases in media are based on unofficial reports from the medical community with limited samples, thus γ is likely below one. γ will be estimated through model calibration. We initialize the model as I E, 0 = I L, 0 = R 0 = D 0 = 0, and S 0 = 81 million people. We set R 0 = D 0 = 0. New infected cases are injected to the model through i F , at time t 0 , to be estimated through model calibration. Although we are not aware of number of foreign patients entering Iran, since t 0 is also unknown, we have two degrees of freedom, and one can be assumed and the other estimated through model calibration. We set i F equal to 100 patients per day at t 0 , and 0 throughout the rest of the simulation. We run the model from t = 0, December 31, 2019, and our time unit is a day. t 1 , the time first cases are diagnosed, is February 19, 2020. We have time series data for D, R, I cum , and a few data points for P and D u . Our method is mainly based on forming a likelihood function for observing the actual time series data conditional on model parameters. We then conduct a Markov Chain Monte Carlo (MCMC) simulation to estimate the joint posterior distribution of the model parameters subject to observed data. We define a likelihood function for change over time (net-flow) of D , R , I cum assuming they are count events drawn from model-predicted rates (Poisson distribution). We use a similar Poisson distribution assumption for P and D u as well, since they both fit well into a count measure framework. The MCMC method searches over the feasible ranges for nine uncertain parameters in our model. These include: three data-related parameters of α max , a, and γ; three public-behavior-related parameters of c max , O Ã D , and s; and three disease-related parameters of fractional death rate (f and f 0 ) and time for the injection of first cases of infection into the population (t 0 ). Prior experience and this exercise point to rather tight confidence intervals coming from MCMC methods directly applied to large nonlinear models. The problem could be addressed using filtering (e.g. extended Kalman filtering or particle filter) methods or heuristics that scale the likelihood function in MCMC method to be consistent with model's predictive performance. In light of the computational costs of the former and time sensitivity of the research topic, we chose the latter option. To be conservative, we therefore downgrade our confidence in the data, scaling the likelihood function by a factor of 0.1, which allows the algorithm to explore the parameter space much more freely. This approach is a heuristic we use to avoid putting too much confidence in our estimation results; however, without strong theoretical reasons to support the specific scaling factor, we caution against a strong interpretation of these results. With this caveat in mind, the MCMC method offers 90% confidence intervals for joint-distribution of the parameters. We conducted additional sensitivity analysis to this downscaling parameter choice. The rest of the parameters are specified based on the literature. the counter factual of "what is expected absent COVID-19" is. A simple counter-factual would compare deaths in previous winters with that in Winter 2020. Yet, depending on which previous years one includes in such counter-factual, excess deaths between 5 to 15 thousands could be estimated. However, the released death data is disaggregate at the province level (31 provinces). This provides a unique opportunity to more accurately estimate the excess deaths in Iran by comparing the variations in death in different provinces and how those variations could be explained by the official death toll in each providence. Formally, we run the following regression to estimate the undercounting of COVID-19 related deaths: ExcessDeath i = β 0 + β 1 :PastDeath i + β 2 :ConfCovidDeath i: where ExcessDeath i is the difference between death in Winter 2020 and Winter 2019 in province i, and PastDeath i is a vector of death in past seasons (e.g., Winter 2019, 2018, etc.) in province i. ConfCovidDeath i is an estimation for confirmed death cases based on confirmed total cases for each province which were available for the analysis (Iran does not provide province level death statistics for COVID, but did provide province level cumulative infection statistics; scaling total death based on province level infections we calculate "ConfCovidDeath"). β 2 will represent death undercounting factor. Our estimate of 15,485 (90% UI: 8.4 K, 25.8 K) death implies that this ratio should be around 10.8 (90% CI: 5.9,18.0). Our regression models are reported in Table A2 . Based on the best regression model (in terms of adjusted R-squared), we find that β 2 = 7.15 (95% CI: 3.85, 10.44). This ***p < 0.001, **p < 0.01, *p < 0.05. Presumed asymptomatic carrier transmission of COVID-19 Estimating the infection fatality rate among symptomatic COVID-19 cases in the United States: study estimates the COVID-19 infection fatality rate at the US county level Corona virus has at least claimed 210 lives in Iran Who, among political and government figures, have contracted the corona virus? Global coronavirus death toll could be 60% higher than reported. Financial Times System dynamics modeling in health and medicine: a systematic literature review Using integrated modeling to support the global eradication of vaccine-preventable diseases Sensitivity of chest CT for COVID-19: comparison to RT-PCR Covid-19 -navigating the uncharted Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand Pandemic potential of a strain of influenza a (H1N1): early findings SD meets OR: a new synergy to address policy problems The number of deaths from Corona virus in Iran reach 416 DB%8C%D8%B1%D8%A7%D9%86-%D8%A7%DB%8C%D9%86%D8%AA% D8%B1%D9%86%D8%B4%D9%86%D8%A7%D9%84-%D8%B4%D9%85% D8%A7%D8%B1-%D8%AC%D8%A7%D9%86%E2%80%8C%D8%A8%D8% A7%D8%AE%D8%AA%DA%AF%D8%A7%D9%86-%DA%A9%D8%B1%D9% 88%D9%86%D8%A7-%D8%AF%D8%B1-%D8%A7%DB%8C%D8%B1%D8% A7%D9%86-%D8%A8%D9%87-%DB%B4%DB%B1%DB%B6-%D9%86%D9% 81%D8%B1-%D8%B1%D8%B3%DB%8C%D8%AF Emergency response to a smallpox attack: the case for mass vaccination The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: estimation and application Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2) Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: a statistical analysis of publicly available case data The role of absolute humidity on transmission rates of the COVID-19 outbreak Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the diamond princess cruise ship Forecasting COVID-19 impact on hospital bed-days, ICUdays, ventilator-days and deaths by US state in the next 4 months Novel Coronavirus Pneumonia Emergency Response Epidemiology Team. 2020. The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) in China Initial simulation of SARS-CoV2 spread and intervention effects in the continental US WHO chief says 97 cases of coronavirus in 11 countries originated from Iran Heterogeneity and network structure in the dynamics of diffusion: comparing agent-based and differential equation models Seroprevalence of COVID-19 virus infection in Guilan Province Italy's coronavirus death toll is far higher than reported System dynamics at sixty: the path forward The risks, costs, and benefits of possible future global policies for managing polioviruses Estimation of COVID-2019 burden and potential for international dissemination of infection from Iran Temperature significant change COVID-19 transmission in 429 cities Wikipedia. 2020. 2020 coronavirus outbreak in Iran Iran has far more coronavirus cases than it is letting on. The Atlantics Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in China: summary of a report of 72314 cases from the Chinese Center for Disease Control and Prevention Weather conditions and COVID-19 transmission: estimates and projections. medRxiv We are thankful to Mohammad Akbarpour, Narges Doratoltaj, Babak Heydari, Hamed Ghoddusi, and Tse Yang Lim for their thoughtful feedback on various drafts of this paper. Navid Ghaffarzadegan is an Associate Professor in the department of Industrial and Systems Engineering at Virginia Tech. He develops system dynamics simulation models to study complex social systems and policy problems. The main application areas of his research include science policy and health policy.Hazhir Rahmandad is an Associate Professor of System Dynamics at MIT Sloan School of Management. He combines simulation models and various Simulation-based estimation of the early spread of COVID-19 in Iran 117 Basic model architectureThe model uses a country-level system of differential equations to capture the dynamics of contagion ( Figure A1 ). The infected population is first asymptomatic and is represented as "early infected" and later becomes "late infected" with the majority being symptomatic.The following equations represent the aging chain:Appendix B: Out of sample prediction testOur main analysis in this paper is done based on data up to March 20th, 2020 (total of 30 days). Since then 55 days of new data have become available which we use to conduct an out-of-sample test of our model's projection for confirmed cases. Figure A2 shows the results. The model is better at replicating reported cases of death (panels c and d) than reported infection (panels a and b) and recovery (panels e and f). It appears that by increased testing, milder cases of infection are diagnosed in Iran than we predicted, while confirmed death rate is still in the same range as our model's prediction. In recovery data we see a peak on a specific day (panel f) that might relate to some follow-up or closing active cases. The data for all-cause death, which in Iran is reported seasonally, were released on May 7th, 2020, during the final preparation of this manuscript. Winter ends on March 20th, thus seasonal reports on all-cause number is lower than our estimate for death, but of comparable magnitude, and falls within our confidence intervals. It is worth noting that the COVID-related excess death may not be all attributed to COVID-19 alone. For example social distancing policies in response to COVID may reduce traffic-accidents and death due to air pollution (both major causes of mortality in Iran), leading to an under-estimate of COVID-19 deaths in this approach. On the other hand hospital congestion may increase death due to other conditions, leading to over-estimation of COVID-19 deaths using this method. We do not have data to correct for these potential biases, but overall we suspect that they are a second order effect and the true magnitude of COVID-19 death is captured by the excess mortality data. Additional supporting information may be found in the online version of this article at the publisher's website.Appendix S1: Supporting Information Appendix S2: Supporting Information Simulation-based estimation of the early spread of COVID-19 in Iran 129