key: cord-0985103-bf76vi8z
authors: Sewell, Daniel K.; Miller, Aaron
title: Simulation-free estimation of an individual-based SEIR model for evaluating nonpharmaceutical interventions with an application to COVID-19 in the District of Columbia
date: 2020-11-10
journal: PLoS One
DOI: 10.1371/journal.pone.0241949
sha: 4277150b6cae1b8a1ae8171e4654a4c781a8516f
doc_id: 985103
cord_uid: bf76vi8z

The ongoing COVID-19 pandemic has overwhelmingly demonstrated the need to accurately evaluate the effects of implementing new or altering existing nonpharmaceutical interventions. Since these interventions applied at the societal level cannot be evaluated through traditional experimental means, public health officials and other decision makers must rely on statistical and mathematical epidemiological models. Nonpharmaceutical interventions are typically focused on contacts between members of a population, and yet most epidemiological models rely on homogeneous mixing which has repeatedly been shown to be an unrealistic representation of contact patterns. An alternative approach is individual based models (IBMs), but these are often time intensive and computationally expensive to implement, requiring a high degree of expertise and computational resources. More often, decision makers need to know the effects of potential public policy decisions in a very short time window using limited resources. This paper presents a computation algorithm for an IBM designed to evaluate nonpharmaceutical interventions. By utilizing recursive relationships, our method can quickly compute the expected epidemiological outcomes even for large populations based on any arbitrary contact network. We utilize our methods to evaluate the effects of various mitigation measures in the District of Columbia, USA, at various times and to various degrees. Rcode for our method is provided in the supplementry material, thereby allowing others to utilize our approach for other regions.

In December 2019, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was discovered in Wuhan China [1] . The virus causes the Coronavirus Disease 2019 (COVID- 19) , characterized by fever, cough, shortness of breath and other respiratory or flu-like symptoms. Severe cases can lead to pneumonia, respiratory failure, multi-organ dysfunction and death PLOS ONE PLOS ONE | https://doi.org/10.1371/journal.pone.0241949 November 10, 2020 1 / 18 a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 [2] [3] [4] . Since its discovery, the virus has rapidly moved across the globe and on March 11th, 2020 the World Health Organization declared COVID-19 to be a global pandemic. To date, over 27.6 million cases and over 900,000 deaths have been reported worldwide [5] . Nearly all countries have identified COVID-19 cases, with 158 countries reporting greater than a thousand cases and 50 countries having recorded over a thousand deaths [5] . As the size of the pandemic continues to grow experts expect COVID-19 will pose a significant threat for many months and potentially years. Thus, the burden not just to population health, but also the overall healthcare system, skilled and long-term care, and the global economy are likely to be substantial. Given the rapid growth with which COVID-19 has moved across the globe, policy makers have sought guidance to slow the spread, reduce the severity of the epidemic, or guide strategies for reopening. Consequently, many infectious disease models have been developed to forecast the trajectory of the current epidemic and to understand the likely impact of a range of interventions [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] . Models designed to capture certain aspects of the epidemic (e.g., forecasting mortality) may not be well suited for others (e.g., evaluating policy decision making), and so decision makers have increasingly had to navigate a range of diverse modeling approaches while attempting to find approaches that can meet the specific nature of a given setting [16] . In short, decision makers need information on the effects of public policy measures on an epidemic that is both timely and accurate. From a modeler's perspective, this translates into a model which is both computationally efficient and which captures salient features of transmission through a population.

Many models have been developed to represent the disease transmission process and evaluate control measures related to COVID-19. Such models typically fall into one of two major categories: homogeneous mixing compartmental models and individual/agent-based models (IBMs). Equation-based models are typically less computationally expensive and can be implemented quicker than IBMs. For example [7] , implemented an early forecast for COVID-19 in Hubei, China, using a mass action compartmental model [17] ; developed a compartmental model with homogeneous mixing which was used to capture the undocumented cases of COVID-19 [18] ; used a compartmental model with homogeneous mixing which allowed for time-varying reproduction number; and two web-based forecasting platforms applied an interactive deterministic compartmental model assuming homogeneous mixing and allow a user to alter a time-dependent intervention [8, 9] . Some approaches have tried to estimate the effect of social distancing measures on COVID-19 such as [19] which uses a stochastic mass action compartmental model and [20] which used an age stratified mass action model. A limitation of equation-based compartmental models is that they rely on the assumption of homogeneous mixing, or mass action.

An alternative to equation-based compartmental models are IBMs, which provide a method for capturing heterogeneous mixing by simulating contacts among individual agents. A wide range of IBMs have been developed for COVID-19 [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] [32] [33] [34] [35] [36] [37] [38] [39] [40] . IBMs have been developed to evaluate numerous interventions including community lockdowns or closures [22, 26, 27, 31, 32, 35, 36, 38, 39] , contact tracing, isolation and quarantine [22-24, 26, 27] , mask wearing [29, 32, 33, 39, 40] , social distancing [22, 24-26, 29, 35] and travel restrictions [26] . IBMs have also been developed to simulate multiple contact settings (e.g., homes, schools, workplaces, public transportation, etc.) [22] [23] [24] 27] and have been applied across a range of populations, cities and countries [22, 24, 26, 27, 32] . However, not all IBMs accurately capture realistic contact networks, such as [34] which assumes that individuals' degrees follow a Poisson distribution. Such a short-tailed degree distribution misses the profound impact that so-called superspreaders have on the outbreak [41] [42] [43] states (p.120),

[superspreaders] transmit infection to many other members of the population, while most infectives do not transmit infections at all or transmit infections to very few others. This suggests that homogeneous mixing at the beginning of an epidemic may not be a good approximation.

These IBMs, however, come at a steep computational cost and can often necessitate a large time commitment as well as a high level of expertise to design and code efficiently.

IBM's are particularly well-suited to evaluate nonpharmaceutical interventions that focus on contacts between individuals. To accurately understand the effects of these types of interventions, we must more accurately model how individuals contact one another. This has been commented on by others; e.g., [44] wrote "realistic mixing can be an important factor to consider in order for the models to provide a reliable assessment of intervention strategies" (p.31). There is a large body of research showing how in many settings homogeneous mixing is inadequate for accurately modeling disease dynamics, such as [43] [44] [45] [46] [47] [48] [49] (see [46] and [48] for further reference listings on this). As a recent concrete example [50] , evaluated model assumptions in the West African Ebola outbreak and stated, "we see that alternative hypotheses for how [Ebola Virus Disease] spreads, such as homogeneous mixing and nearest neighbor interactions, provide quantitatively poorer agreement with data" (p.3). While not a focus of this paper, one important aspect is preferential mixing based on age [51] . While this has been done successfully for age-stratified homogeneous mixing models (see, e.g., [52] ), if a model is to incorporate age when evaluating interventions on contacts, we believe that it would be better to incorporate this into a realistic network model through social selection mechanisms.

One of the advantages of equation-based models over IBMs is the simplicity and speed with which the former can typically be implemented [53] . The computational complexity of IBMs, especially when simulating large populations and complex contact networks, often require extensive programming, algorithmic efficiency, computing power and parallel processing [54] [55] [56] . Conversely, many equation-based models can be quickly implemented using more simplistic numerical computing environments or even basic spreadsheet software [57] . Thus, equation-based models may be utilized more quickly during an outbreak setting of an emerging infectious disease. Indeed, during the COVID-19 pandemic many of the earliest modelling estimates were produced by equation-based models [58] [59] [60] [61] or branching process models (that focused on early stages of an outbreak) [36, [62] [63] [64] . Some of the earliest examples of IBMs applied to COVID-19 involved the re-purposing of models designed for previous outbreaks (e.g., influenza or smallpox) [22, 26, 27, 63] . In the early stages of an outbreak, policy makers may need to quickly evaluate multiple types of complex interventions under a range of unknown parameter values, and some policy actions may more well-suited to IBMs [65] . Thus, there is a need to develop methods that can produce estimates consistent with IBM-based approaches while offering the computational efficiency typically represented by most equation-based models.

The contribution of this paper is primarily to present a computationally efficient estimation method for a network-based IBM which can be used for evaluating nonpharmaceutical interventions such as implementing or lifting social distancing measures or implementing universal personal protective equipment (PPE). Our second contribution is to use this method to provide critical information on the effects of various nonpharmaceutical mitigation measures in the District of Columbia (D.C.) at various times and to various degrees.

Our proposed method for modeling disease transmission dynamics through a susceptibleexposed-infectious-recovered (SEIR) model relies on a contact network rather than mass action assumptions. It focuses on the individuals in the population, but rather than simulating disease over an individual-based model and averaging the results to obtain an estimated epidemic curve, our method directly estimates the probability that a particular individual is infectious at a particular time. Through the use of recursive relationships, the expected number of infected individuals can be efficiently computed at each time point.

The primary benefit of our proposed approach is the ability to obtain the expected epicurve from an SEIR IBM analytically, thereby saving practitioners from having to run thousands of simulations for each possible parameter configuration and counterfactual scenario. Other strengths of the method include the following. First, it relies on a realistic or observed contact network. This is in contrast to assumptions of homogeneous mixing, or that the contact degree distribution is not heavy tailed (e.g., Poisson). This crucially allows us to capture the effects of superspreaders [41, 42] , burstiness of the epidemic (i.e., the tendency for an epidemic to show alternating patterns of slow and rapid growth, as opposed to a steady growth curve) [66, 67] , and other salient features of realistic contact graphs. Second, our approach explicitly captures the way nonpharmaceutical interventions can affect the disease transmission through quarantining/social distancing or reducing the risk that a susceptible-infective contact will lead to a new transmission through, e.g., personal protective equipment (PPE).

Our method is limited in that some disease characteristics are simplified in exchange for more accurate contact patterns and computational efficiency. Specifically, it assumes that everyone who becomes infected experiences a constant latent period and is able to transmit the disease to their set of regular contacts (either directly or through the environment) for the same amount of time. While these quantities can be estimated from the data, it does not reflect the varying lengths of time individuals are susceptible nor the varying lengths of time individuals are infectious before recovering, dying, or being effectively isolated. In addition, it may be that the population is partitioned such that subpopulations have varying levels of infectivity and susceptibility; for example, susceptible individuals of different ages may be more or less easily infected by infectives. While our approach allows (1) contacts patterns to differ between subpopulations and (2) transmission rates to vary over time, we have not accounted for individual level or dyadic level transmission rates.

While our proposed IBM may not be the most realistic model, as IBMs can always be more nuanced, it offers a significant computational benefit. Our proposed approach can be used to compute the expected results from a network-based IBM with large populations using limited computing resources, and our method is easily and quickly deployed (a small R package for implementing this methodology is provided in the S1 File). This computational efficiency allows a user to quickly explore the effects of public policy changes on social distancing or universal PPE interventions, thereby providing decision makers a timely method of evaluating, for example, when and to what degree social distancing measures should be implemented or relaxed.

We begin by presenting the setup and notation we will use. For a population of size N, let the initial probability that an individual is infected be denoted as p 0 . For j = 1, 2, . . ., N, let x tj = 1 if individual j is infective at time t, t = 1, 2, . . ., T, and 0 otherwise. The probability that x tj = 1, Eðx tj Þ, is denoted byx tj . The N individuals in the population are connected through a contact graph which is represented by a N × N adjacency matrix A such that A ij = 1 if i and j can contact each other and 0 otherwise. The probability that a susceptible individual j is infected by an infective neighbor on the contact graph at time t is denoted by p t . If this event occurs, the susceptible enters a latent period of D E days where they have been exposed but are not yet infectious. This latent period is immediately followed by a period of D I days where the individual is infective, and hence ∑ t x tj = D I 8 j. Let I t denote the set of infectives at time t, and let N j denote the neighbors of j on the contact graph. At time t, let Q tj denote the event that susceptible individual j is self-quarantining, and let this event occur with probability q tj . Let ι denote the daily probability of a susceptible individual importing the disease from outside the population of interest. Finally, let H tj denote the event that j is successfully infected at time t (and hence will enter the latent period for D E days), where H 0j corresponds to the event that j is the outbreak's initializer/patient zero. We denote the probability that an event, such as H tj , has occurred by Prob(H tj ), and the probability that it has not occurred by ProbðH 0 tj Þ, and for conditional probabilities we use | to separate the unknown event (left) from the events which we know to have occurred (right).

The goal of the analysis is to estimate the expected number of new infections each day-or equivalently each day's expected cumulative number of infections-according to the individual-based model described above. To achieve this, we first focus on estimating the probability of being infective for each individual in the population on each day, i.e.,x tj .

For the first few days of the outbreak, the only infective(s) will be the outbreak initializer(s). If the latent period is longer than the infectious period then there will be one or more days with zero infectives. If the opposite is true (D E < D I ), then following these first few days there will be a period where the probability an individual is infective equals the probability that they are either the outbreak initializer or were infected within the first t − D E days. After this, the probability that an individual is infective equals the probability that they were not an initializer and were infected within a moving window such that they have passed the latent period but have not yet recovered. To put this concretely in mathematical terms, we have the following.

If D E > D I , then for D I < t � D E ,x

Finally, we have for t > max(D I , D E ),

If individual j is still susceptible at time t, they can become infected by either importing the disease from outside of the study population or by being infected by an infective neighbor on the contact graph. This latter method requires both that j is not quarantined at time t and that at least one infective neighbor infects j. Hence by the law of total expectation we have that

where x t = (x t1 , x t2 , . . ., x tN ) 0 and A � j is the j th column of the adjacency matrix A. Using a first order Taylor's expansion aroundx t , we have

or equivalently

For ease of notation, we let this quantity in (6) be notated as f tj . Combining Eqs (1)-(6) yields the following.

Here α tj is defined to be the probability that individual j is still susceptible by time t. These quantities can be computed recursively in the following manner. First, a 0j ≔ ProbðH 0 0j Þ ¼ 1 À p 0 . Subsequently for t > 0 we have

These recursions then allow us to compute the quantity of interest, namely the number of infections we expect to have by time t. This is derived from the expected number of susceptible individuals:

Eð# infectedÞ ¼ Pop: size À Eð# susceptibleÞ

The parameters to be estimated include the length of the disease's latent period (D E ), the number of days an individual is infective (D I ), the probabilities that a contact leads to a transmission event (p t ), and the probabilities that an individual is quarantined at time t (q t ). The initial probability p 0 may also need to be estimated, but in many cases this will be a known (small) number, such as 1/N for a single initializer. Given a contact graph, prior to social distancing and any intervention which may affect transmission probabilities the quantities p t may be related to the more intuitive reproduction number, or R 0 , defined to be the expected number of new infections generated by a given infective. This can be computed as R 0 ¼ ð#days infectiousÞ � Eð#contacts per dayÞ � ProbðS À I contact leads to new transmissionÞ; and hence

where � d is the average degree in the contact graph A, i.e., the average number of neighbors in A. Estimation can be performed via ordinary least squares (OLS), i.e., minimizing the sum of squared differences between the observed and the expected number of daily cases.

We used publicly available data collected by The COVID Tracking Project [68] for Washington D.C. This data source provides the daily number of lab confirmed cases which we used to train our IBM. We used data from the first case at the beginning of March until the end of May when protests over the killing of George Floyd led to a large number of unexpected potential transmission events.

To obtain estimates of the number of infections, we used results from a study involving a random sample of individuals to test in Indiana [69] which estimated the infection fatality rate (IFR) to be 0.58%. We then used a 30-day rolling window to compute the reporting rate for each day; deaths were shifted by 16 days, as this was the average time between symptom onset and death [70] . That is, for day t, we looked at the reporting rate RR t RR t ¼ P t s¼tÀ 29 #reported cases on day s P tþ16 s¼tÀ 13 #reported deaths on day s À � =IFR These daily reporting rates were then used to rescale the number of reported cases to obtain an estimate of the daily number of infections.

We based our contact graph on a study of inter-personal daily contacts in Hong Kong [71] . In this study, 1,450 individuals were recruited from 857 households, where a contact was defined to be a social encounter which included a "face-to-face conversation or touch (such as handshake, a kiss, games and sports or similar events involving body touch)" [71] . As was consistent with other studies, they found the distribution of contacts to be heavy tailed, which has profound effects in the context of infectious disease as this indicates potential super-spreaders. Overall, individuals had on average 12.5 contacts per day. This matches closely with an ecological momentary assessment study of individuals in upstate New York which found the average number of daily contacts to be 12 [72] . We then constructed a network of 705,749 nodes by randomly assigning each node a number of stubs drawn from a negative binomial distribution with mean 12.5 and dispersion parameter 1.3. Each stub was randomly connected to another node's stub, and in this way we preserved the desired degree distribution.

We estimated individual response to quarantine policy measures and the corresponding mobility patterns using Google Community Mobility Reports [73] . The authors complied with the terms and conditions of https://www.google.com/covid19/mobility/. Google Community Mobility Reports use geolocated mobile phone data to assess various populations' movement patterns. This was measured by first capturing a baseline rate of movement and then evaluating deviations from this baseline rate. These baseline measurements were taken from the period 3 January, 2020-6, February, 2020, and were broken down by day of the week and by category (workplace, retail and recreation, etc.). We chose to focus on the category of retail and recreation, as this seemed to best reflect voluntary movement patterns. We used ordinary least squares (OLS) to find a parameter by which we could multiply the deviations given in the Google Community Mobility Reports in order to obtain a daily estimate of the probability an individual would self-quarantine.

On March 11, 2020 the mayor of D.C. issued a State of Emergency declaration [74] . From the mobility data it is clear that this prompted a sudden shift towards self-isolation (see Fig 1) . In addition, a face mask mandate was issued on April 15 [74] . We trained our model using OLS on the daily number of infections, estimating the effect of the lockdown as well as the effect of the face mask mandate; the latter of these was estimated as a rescaling of the transmission parameter.

We then computed the expected number of daily infections were the declaration of emergency to have occurred one week earlier and one week later than March 11; this was done by shifting the mobility data either backward or forward a week. We also considered the scenario where there was no declaration of emergency. We then evaluated the effect of replacing the lockdown via the declaration of emergency with a face mask mandate for the dates March 4, 11, and 18. We additionally evaluated the effect of supplementing the lockdown with a face mask mandate, again for the same three dates. Since the dates under consideration were so early in the course of the epidemic, we assumed that it would take some time before the same level of mask-wearing compliance as seen on April 15 would be observed. We therefore allowed the transmission rate to change smoothly from pre-mandate to post-mandate levels through a Gaussian kernel such that after ten days the transmission rate was 90% through the transition towards its final value.

We further estimated the effects of reimplementing social distancing for only a strategic portion of the population. This was determined by implementing social distancing for those individuals with the highest number of contacts (e.g., healthcare professionals rotating between long term care facilities) while maintaining no social distancing measures for the remainder of the population. We evaluated the effects of a declaration of emergency on March 11 affecting only the top 10% as measured by number of contacts and for the top 50% both with and without an accompanying face mask mandate. In both cases, those designated to selfquarantine did so at the rate observed from the mobility data (see Fig 1) , while those not designated did not quarantine.

Lastly, we estimated the effect of delaying the lockdown induced by the declaration of emergency by 1, 2, and 3 weeks while implementing a mask mandate. For all lockdown delays, this mask mandate was implemented on March 11 (the day of the observed declaration of emergency), and as before we allowed a gradual rollout of the mask mandate.

By May 28, 2020, there were 453 deaths recorded in D.C. due to COVID-19 infections and 8,492 lab confirmed cases. Fig 2 shows the observed daily and cumulative case counts data alongside the estimated expected number of confirmed cases.

We estimated an initial R 0 of 1.54, an exposed period of 3 days, an infectious period of 6 days, and a daily probability of an individual importing COVID-19 from outside of the population of 8.6 × 10 −5 . The estimated effect of the face mask mandate was a reduction in transmissibility (p t ) of 23%. Fig 3 shows the main results for projecting under various mitigation measures through the date of the face mask mandate. Each of the three subfigures corresponds to a given date of implementing the nonpharmaceutical interventions, March 11, March 4, and March 18, representing the actual date of intervention, the preceding week, and one extra week delay respectively. In each figure for comparison is the expected number of daily infections with no intervention (solid). Also given are curves representing the effects of a declaration of emergency (i.e., lockdown), a mask mandate, and both. Fig 4 shows the results for contexts when only a subpopulation is expected to quarantine. Quarantine rates, i.e., the daily probability that an individual will quarantine, are constant regardless of who is practicing quarantining. Results are shown for the contexts where all are quarantining, and where the top 50% and 10% of the population in terms of number of contacts are practicing quarantining. These strategies are also shown when augmented by a universal mask mandate applied to the entire population. Fig 5 shows the results when the lockdown is delayed by 1, 2, or 3 weeks. As a frame of reference, the curve which corresponds to the actual policies enacted (lockdown beginning on March 11 and no mask mandate). Excepting this reference curve, all projections correspond to a mask mandate enacted on March 11 with a gradual rollout. Quarantining begins on later dates which are marked by the vertical line segments of matching line type.

In an ongoing outbreak, data streams are being continually updated, and estimates of the disease trajectory are highly dynamic. Data and estimates quickly become outdated, and hence it is necessary to be able to provide accurate forecasts and projections of intervention effects rapidly. In other contexts, researchers do not have access to a powerful computing cluster that allows one to perform the massive simulations required for an IBM. Our approach provides an 

Simulation-free estimation of an individual-based SEIR model for evaluating nonpharmaceutical interventions analytically tractable solution to compute the expected epidemic curve and is thus inherently fast and computationally inexpensive compared to simulation-based inference.

Our approach is limited in that in order to obtain credible or prediction intervals, simulation would still be required. However, our work was motivated by the need to evaluate and compare a high volume of scenarios of implementing or ceasing nonpharmaceutical interventions, and our approach provides a fast and computationally inexpensive way to do this compared to simulation-based inference.

While we have focused on static networks, it would be a trivial extension of our work to account for time-varying networks. Eq (6) can be changed to reflect this by replacing A with some time-specific adjacency matrix A (t) , and all other formulas still hold. One important application of this would be to address the fact that some distancing measures affect age strata differently such as school or workplace closures. By replacing the static network by a timevarying one, school closures, for example, could be reflected by having at a certain time t the networks {A (s) } s � t break contacts between most school-age children. In addition, a vaccination pharmaceutical intervention applied at time t could be addressed by breaking all contacts involving vaccinated individuals in the networks {A (s) } s � t . As mentioned previously, our proposed approach cannot handle individual-or dyad-level varying transmission rates, and hence we cannot account for nonpharmaceutical or pharmaceutical interventions that act differentially in this way.

There are several key takeaways from our analysis of COVID-19 in D.C. First, an early lockdown augmented with a face mask mandate has a powerful effect on the expected epidemic trajectory. Adjusting for reporting rate, we estimate that there was a total of 44,590 infections by April 15. Implementing a mask mandate alongside the lockdown on March 11, we estimate we would have expected to have seen a total of 20,616 infections. Had these two interventions been implemented a week earlier we estimate we would have seen a total of 11,703 infections. However, a second takeaway is that this effect is greatly diminished if delayed. Delaying these two interventions by one week would have led to a total of 47,564. Hence an early response was critical to mitigating the extent of COVID-19 spread.

Third, a mask mandate alone would probably not be sufficient. Our results show that there would have been a dramatic increase in infections had a face mask mandate been put in place in lieu of a lockdown. This conclusion comes with a strong caveat, however. Because individuals were wearing masks prior to the mask mandate, it is almost certain that we are underestimating the effect of a mask mandate. That is, the effects of universal mask wearing is underestimated due to the fact that this effect was already partially in place due to voluntary mask wearing, and so the effect of a mask mandate we have estimated is in fact the effect of mandating masks on those who would not voluntarily wear one.

Fourth, if careful consideration were to allow the strategic selection of individuals to quarantine, we could have seen almost the same level of reduction in infections by only implementing the lockdown for the top 50% of individuals as ordered by number of contacts. This result could have important economic impacts, in that only half the population would need to quarantine. We reiterate that this does not mean that half the population is always quarantined, but rather this half of the population is quarantining at the same rate as what we observed from the data for the overall population (see Fig 1) . This strategic selection has a limit. Our results demonstrate that only quarantining the top 10% of individuals would lead to a much worse epidemic trajectory. However, if this strategic 50% quarantining strategy were paired with a mask mandate, we would have expected to see only 25,020 infections, just 56% of the expected number of infections when quarantining all individuals without a mask mandate.

Fifth, by implementing a face mask mandate we can achieve similar or better results when delaying the lockdown by one to two weeks compared to an earlier lockdown without the mask mandate. Given the heavy economic cost of the lockdown [75] , this result has potentially strong consequences. Moreover, there is growing evidence of the clinical and public health related harms that may be attributable to prolonged lockdowns and societal quarantine; examples include delayed medical care [76, 77] , vaccination [78] , cancer screening [79] , increased mental health problems [80] and food insecurity [81, 82] . Thus, there are potentially many other indirect benefits to mask mandates that we do not account for in our model.

One important point is the sensitivity of our results to two factors. First, we made all inference based on a single network generated as described above. We checked and confirmed that there was negligible variation in our results based on using other networks generated according to the same mechanism. Second, we checked how the accuracy of our results fared due to the approximation used in (5) . To do this, we compared the mean epicurve using our proposed method shown in Fig 2a to a Monte Carlo estimate obtained by simulation. We also compared the estimated curves in Fig 3a to Monte Carlo estimates. There was an apparent problem of scale during the peak of the epidemic, but the shape and conclusions drawn from the curves were the same for those from our proposed approach and the Monte Carlo estimates. See the S1 File for these and other sensitivity/accuracy results.

Another important limitation that is worth noting with any study involving the effects of public health policy is the indirect link between behavior changes and policy. It is often impossible to fully disentangle the endogeneity of behavior changes, the size of the epidemic, and the timing public policy, when the counterfactual cannot be observed. It is likely that mask wearing, social distancing and self quarantine are also driven by personal precautions and would increase in absence of public policy mandates. Indeed, there is evidence that many behavior changes that resulted in social distancing preceded policy changes [83] . Thus, our results should be interpreted based on the assumed connection to behavior (e.g., the baseline, no intervention, case may not reflect actual behavior in absence of any policy). 

A novel coronavirus from patients with pneumonia in China

Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study

Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus infected pneumonia in Wuhan, China

Presenting characteristics, comorbidities, and outcomes among 5700 patients hospitalized with COVID-19 in the New York City area

An interactive web-based dashboard to track COVID-19 in real time. The Lancet Infectious Diseases

Forecasting the impact of the first wave of the COVID-19 pandemic on hospital demand and deaths for the USA and European Economic Area countries

Data-based analysis, modelling and forecasting of the COVID-19 outbreak

COVID-19 Scenarios

Locally Informed Simulation to Predict Hospital Capacity Needs During the COVID-19 Pandemic. Annals of Internal Medicine

Initial simulation of SARS-CoV2 spread and intervention effects in the continental US. medRxiv

Flattening the curve before it flattens us: hospital critical care capacity limits and mortality from novel coronavirus (SARS-CoV2) cases in US counties

Imperial College London: Short-term forecasts of COVID-19 deaths in multiple countries

Los Alamos National Laboratory: COVID-19 Confirmed and Forecasted Case Data

Projections for first-wave COVID-19 deaths across the US using social-distancing measures derived from mobile phones. medRxiv

DELPHI Epidemiological Case Predictions

Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2)

Early dynamics of transmission and control of COVID-19: a mathematical modelling study. The Lancet Infectious Diseases

An epidemiological forecast model and software assessing interventions on COVID-19 epidemic in China

Projecting hospital utilization during the COVID-19 outbreaks in the United States

Projecting demand for critical care beds during COVID-19 outbreaks in Canada

Impact of nonpharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand

Modeling the combined effect of digital exposure notification and non-pharmaceutical interventions on the COVID-19 epidemic in Washington state

Modeling the impact of social distancing, testing, contact tracing and household quarantine on second-wave scenarios of the COVID-19 epidemic. medRxiv

Simulating phase transitions and control measures for network epidemics caused by infections with presymptomatic, asymptomatic, and symptomatic stages

Modelling transmission and control of the COVID-19 pandemic in Australia

Modeling layered non-pharmaceutical interventions against SARS-CoV-2 in the United States with Corvid. medRxiv

An agent-based model to evaluate the COVID-19 transmission risks in facilities

How to restart? An agent-based simulation model towards the definition of strategies for COVID-19 "second phase

How many infections of COVID-19 there will be in the "Diamond Princess

How reliable are test numbers for revealing the COVID-19 ground truth and applying interventions? arXiv preprint arXiv

A stochastic agent-based model of the SARS-CoV-2 epidemic in France

Universal masking is urgent in the covid-19 pandemic: SEIR and agent based models, empirical validation

Covasim: An agent-based model of COVID-19 dynamics and interventions. medRxiv

Interventions to mitigate early spread of SARS-CoV-2 in Singapore: A modelling study. The Lancet Infectious Diseases

Effectiveness of isolation, testing, contact tracing and physical distancing on reducing transmission of SARS-CoV-2 in different settings. medRxiv

Revealing COVID-19 transmission in Australia by SARS-CoV-2 genome sequencing and agent-based modeling

An Agent Based Modeling of COVID-19: Validation, Analysis, and Recommendations. medRxiv

COVID-ABS: An agent-based model of COVID-19 epidemic to simulate health and economic effects of social distancing interventions

A modelling framework to assess the likely effectiveness of facemasks in combination with 'lock-down'in managing the COVID-19 pandemic

Super-spreaders in infectious diseases

Using sensor networks to study the effect of peripatetic healthcare workers on the spread of hospital-associated infections

Mathematical epidemiology: Past, present, and future. Infectious Disease Modelling

Influence of non-homogeneous mixing on final epidemic size in a meta-population model

When individual behaviour matters: homogeneous and network models in epidemiology

Mathematical models of contact patterns between age groups for predicting the spread of infectious diseases

An elaboration of theory about preventing outbreaks in homogeneous populations to include heterogeneity or preferential mixing

Recalibrating disease parameters for increasing realism in modeling epidemics in closed settings

Mathematical models to characterize early epidemic growth: A review

Testing modeling assumptions in the West Africa Ebola outbreak

Mixing patterns between age groups in social networks

Influence of demographically-realistic mortality schedules on vaccination strategies in age-structured models

Heterogeneity and network structure in the dynamics of diffusion: Comparing agent-based and differential equation models

A distributed platform for global-scale agent-based models of disease transmission

Indemics: An interactive high-performance computing framework for data-intensive epidemic modeling

Using High Performance Algorithms for the Hybrid Simulation of Disease Dynamics on

Agent-based and population-based simulation: A comparative case study for epidemics

A conceptual model for the outbreak of Coronavirus disease 2019 (COVID-19) in Wuhan, China with individual reaction and governmental action. International journal of infectious diseases

Estimation of the transmission risk of the 2019-nCoV and its implication for public health interventions

Transmission dynamics of the COVID-19 outbreak and effectiveness of government interventions: A data-driven analysis

The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study. The Lancet Public Health

Feasibility of controlling COVID-19 outbreaks by isolation of cases and contacts. The Lancet Global Health

Isolation and contact tracing can tip the scale to containment of COVID-19 in populations with social distancing. medRxiv

Comparing nonpharmaceutical interventions for containing emerging epidemics

How simulation modelling can help reduce the impact of COVID-19

Small but slow world: How network topology and burstiness slow down spreading

Memory and burstiness in dynamic networks

Population point prevalence of SARS-CoV-2 infection based on a statewide random sample-Indiana

High contagiousness and rapid spread of Severe Acute Respiratory Syndrome Coronavirus 2. Emerging Infectious Diseases

Temporal variation of human encounters and the number of locations in which they occur: a longitudinal study of Hong Kong residents

Age differences in adults' daily social interactions: An ecological momentary assessment study

Google COVID-19 community mobility reports: Anonymization process description (version 1.0)

COVID-19 US state policy database

The cost of the covid-19 crisis: Lockdowns, macroeconomic expectations, and consumer spending

Delayed access or provision of care in Italy resulting from fear of COVID-19. The Lancet Child & Adolescent Health

Delay or avoidance of medical care because of COVID-19-related concerns-United States

Effects of the COVID-19 pandemic on routine pediatric vaccine ordering and administration -United States

Cancer diagnostic rates during the 2020 'lockdown', due to COVID-19 pandemic, compared with the 2018-2019: An audit study from cellular pathology

The psychological impact of quarantine and how to reduce it: Rapid review of the evidence. The Lancet

Vulnerability to food insecurity since the COVID-19 lockdown

Immediate impact of stay-at-home orders to control COVID-19 transmission on socioeconomic conditions, food insecurity, mental health, and intimate partner violence in Bangladeshi women and their families: An interrupted time series. The Lancet Global Health

Private precaution and public restrictions: What drives social distancing and industry foot traffic in the COVID-19 era?