key: cord-200185-oz2x9a9s authors: Agrawal, Shubhada; Bhandari, Siddharth; Bhattacharjee, Anirban; Deo, Anand; Dixit, Narendra M.; Harsha, Prahladh; Juneja, Sandeep; Kesarwani, Poonam; Swamy, Aditya Krishna; Patil, Preetam; Rathod, Nihesh; Saptharishi, Ramprasad; Shriram, Sharad; Srivastava, Piyush; Sundaresan, Rajesh; Vaidhiyan, Nidhin Koshy; Yasodharan, Sarath title: City-Scale Agent-Based Simulators for the Study of Non-Pharmaceutical Interventions in the Context of the COVID-19 Epidemic date: 2020-08-11 journal: nan DOI: nan sha: doc_id: 200185 cord_uid: oz2x9a9s

We highlight the usefulness of city-scale agent-based simulators in studying various non-pharmaceutical interventions to manage an evolving pandemic. We ground our studies in the context of the COVID-19 pandemic and demonstrate the power of the simulator via several exploratory case studies in two metropolises, Bengaluru and Mumbai. Such tools become common-place in any city administration's tool kit in our march towards digital health.

Figure 1: Timeline of COVID-19 cases, recoveries and fatalities in India taken from [2]. See [2] and [3] for detailed information on how COVID-19 progressed in India.

COVID-19 is an ongoing pandemic that began in December 2019. The first case in India was reported on 30 January 2020. The numbers of cases and fatalities have been on the rise since then. As of 11 August 2020, there are 22,68,675 cases (of which 15,83,489 have recovered) and 45,257 fatalities [1]; see Figure 1 for a timeline of COVID-19 cases, recoveries and fatalities in India. While medicines/vaccines for treating the disease remained under development at the time of writing this paper, many countries implemented non-pharmaceutical interventions such as testing, tracing, tracking and isolation, and broader approaches such as quarantining of suspected cases, containment zones, social distancing, lockdown, etc. to control the spread of the disease.
For instance, the Government of India imposed a nationwide lockdown from 25 March 2020 to 14 April 2020, and subsequently extended it until 31 May 2020 to break the chain of transmission and also to mobilise resources (increase healthcare facilities and streamline procedures). To evaluate various such interventions and decide which route to take to manage the pandemic, epidemiologists resort to models that predict the total number of cases and fatalities in both the immediate and the distant futures. The models used should have enough features to enable the evaluation of the impact of various kinds of non-pharmaceutical interventions.

Broadly, three kinds of models have been used to study this epidemic. The first set of models takes a curve-fitting approach. They rely on simple parametric function classes whose parameters are fit via regression to match observed trends. The second set of models addresses the physical dynamics of the spread at a macroscopic level. These are mean-field compartmental models based on ordinary differential equations (ODEs), e.g. the Susceptible-Exposed-Infected-Recovered (SEIR) model and its extensions, building on the classical work of Kermack and McKendrick [4]. Here the population is divided into various compartments such as susceptible, exposed, infected, recovered, etc., based on the characteristics of the epidemic. One then solves a system of ODEs that captures the evolution of the epidemic at a macroscopic scale [footnote 1]. Localised versions of these are spatio-temporal mean-field models that lead to partial differential equations [footnote 2]. The third set of models, and the focus of this work, are agent-based models [footnote 3]. A very detailed model of the society under consideration, with as many agents as the population, is constructed using census and other data. The agents interact in various interaction spaces such as households, schools, workplaces, marketplaces, transport spaces, etc.
See Figure 2 for a schematic representation of an agent-based model with the aforementioned interaction spaces. These interaction spaces are the primary contexts for the spread of infection. A susceptible individual can potentially get infected from an interaction in one of these spaces upon contact with an infected individual. Once an individual is exposed to the virus, this person goes through various stages of the disease, may infect others, and eventually either recovers or dies. Other models work at an intermediate level by modelling the social network of interactions, e.g., [12], but we shall focus on agent-based models.

Footnote 1: See [5] for a state-level epidemiological model for India and [6] for a combination of the two approaches.
Footnote 2: For a paper in the Indian context see [7].
Footnote 3: There are other agent-based simulators that have informed policy decisions. See [8] for UK and USA related studies specific to COVID-19, see [9] for a COVID-19 study on Sweden, see [10] and references therein for many agent-based models and their comparisons, and see [11] for a taxonomy of agent-based models.

There are several advantages of using agent-based models. First, since modelling is performed at a microscopic level, unlike the macroscopic level in compartmental models, agent-based models are well suited to capture heterogeneity at various levels. For instance, the age-dependent progression of COVID-19 in individuals (severity, the need for hospital care, intensive care, etc.) can be incorporated in agent-based models. Second, individual behavioural changes, known to be important in certain diseases such as AIDS, can be easily modelled. Third, agent-based models are well suited to study the impact of various non-pharmaceutical interventions, such as "lockdown for a certain number of days", "offices operating using the (so-called) odd-even strategy", "social distancing of the elderly", "voluntary home quarantine", "closure of schools and colleges", etc.
Explicit modelling of these contexts of infection spread also enables studies of control measures targeting the interaction spaces. Fourth, there is an important difference between the actual number of infected individuals in the population, which is what the differential-equation-based models predict, and the reported cases. The latter is invariably based on those who come to hospitals/clinics seeking health care, or those who are identified through random testing, followed by contact tracing of such index cases. As a consequence, reported cases provide a biased estimate of the actual infected number in the population. Agent-based simulators have the capability to track such biased estimates of prevalence.

In this work, we describe our city-scale agent-based simulator to study the epidemic spread in two Indian cities and demonstrate how digital computational capabilities can help us assess the impact of various interventions and manage a pandemic.

We now provide sample outcomes for Bengaluru and Mumbai for COVID-19 under various interventions. These outcomes have been generated using our city-scale agent-based simulator. Bengaluru and Mumbai have estimated populations of 1.23 and 1.24 crore people respectively [footnote 4].

Footnote 4: The 2011 census figure for Bengaluru is 0.85 crore and for Mumbai is 1.24 crore (Mumbai city only, not the Mumbai Urban Area whose 2011 census estimate is 1.84 crore). Bengaluru's 2020 population is estimated to be 1.23 crore. Reliable data is not available for Mumbai city's 2020 population. We have used the 2020 estimated population for Bengaluru and the 2011 census estimate for Mumbai.

social distancing of the elderly, closure of schools and colleges, 50% occupancy at workplaces, and case isolation. This is the fourth shaded region in the plots.
• From 18 May 2020 onwards, continued contact tracing (following the Indian Council of Medical Research (ICMR) guidelines as much as possible) and associated quarantining and case isolations, but otherwise an unlocked Bengaluru.
Soft ward containment continues to be in force. By soft ward containment (see Figure 4) we mean linearly-varying mobility control that turns an open ward into a locked ward when the number of hospitalised cases becomes 0.1% of the ward's population; in the latter locked scenario, only 25% mobility is allowed for essential services.
• Past studies [13]-[16] have indicated that masks have been effective in reducing the spread of influenza. Anecdotal evidence seems to suggest that masks are effective for COVID-19. The Ministry of Home Affairs (MHA) order of 15 April 2020 [17, Annexure 1] made the wearing of masks in public places compulsory. This was re-emphasised in the MHA order of 30 May 2020 [18]. We assume that masks are mandatory from 09 April 2020 onwards.
• It is often the case that when there are several restrictions in place, only a fraction of the population complies with these restrictions. Getting the entire population to comply is often a big challenge and requires significant and persistent messaging (including communication, rewards, punitive measures). We assume a compliance factor of 0.7 up to 04 May 2020, which means that 70 percent of the population adheres to government guidelines such as social distancing, wearing masks in public places, etc., and 0.6 thereafter. The reduction could be attributed to behavioural changes due to lockdown fatigue.
• A brief lockdown during 14-21 July 2020 was implemented in Bengaluru. We compare two scenarios, one with this lockdown and one without this lockdown.

As one can anticipate, simulation of the above scenarios requires a significant level of sophistication in the modelling and implementation. We describe how we do these in the coming sections, but now focus only on the outcomes. The trend for the reported cases is roughly captured, but fatalities are over-predicted. This is surprising since the reported cases continued to be high in the third week of July.
For a more detailed study of these plots, we refer the reader to Section III-A. At this stage, we only observe that the public health benefit of the lockdown is clear from the pictures: a reduced peak at the expense of a brief second wave. Armed with these predicted outcomes under the two scenarios, public health officials can now weigh the benefits of the lockdown against its economic consequences.

2) Mumbai: For Mumbai, we simulate the following scenario.
• Workplaces open with a small strength of 5% during 18-31 May 2020, as per Government of Maharashtra directions. This is the fifth shaded region. During this period, social distancing of the elderly and school and college closures remain in force.
• Workplace strengths increase to 20% in June, to 33% in July, and to 50% in August, with commensurate capacity increases in the local trains. Social distancing of the elderly and school and college closures remain in force. In addition, voluntary home quarantine and case isolation come into play.
• Throughout the simulations, soft ward containment is in force.
• It is often difficult to comply with social distancing directives in high population density areas, such as slums, with many common essential facilities. In Mumbai, we model compliance to be 0.4 in high density areas and 0.6 in other areas.
• Throughout the simulations, contact tracing, associated quarantining, testing, and further tracing are enabled.
• We compare the above scenario, with local trains enabled, against another hypothetical scenario having no local trains.

It is our hope that such tools become commonplace in a city administration's tool kit, and are used to the fullest extent before drastic interventions with wide-scale impact, e.g., lockdown, are imposed.
With additional modelling of activity, mobility, and behaviour, and use of high-quality data on the migrant labour force in urban areas, we speculate that we could have anticipated certain behavioural outcomes seen in India after the lockdown announcement (e.g., migrant population movement).

Broadly, the steps involved in agent-based modelling are the following: build the simulator, calibrate it, validate it, and use it for estimating how the pandemic will evolve.

1. Simulator. The simulator itself consists of four parts.

Synthetic city. A synthetic city generator builds a synthetic city with individuals and various interaction spaces. Individuals are assigned to various interaction spaces such as households, schools/workplaces, communities and transport spaces. In doing this we capture the demographics of the city, the school size distributions, the workplace size distributions, the commute distances, the neighbourhood and friends' interaction networks, the transport interaction spaces, etc. These fix the "social networks" on which individuals interact and transmit the virus.

See Table I for some examples. Many of these involve changes in contact rates, typically reductions, as a consequence of the interventions. The values to set could be based on observed mobility patterns. For example, according to the COVID-19 Community Mobility Report for India in April [19] (see Table II), prepared by Google based on data from Google Account users who have "opted-in" to location history, there was a significant reduction in mobility during the lockdown period compared to the baseline period of 03 January 2020 to 06 February 2020. This informs the nominal contact rate choices in the interventions' definitions in Table I and later in other tables.

2. Calibration. Once the simulator is ready there are still unknown parameters that need to be identified.
These include the contact rates at various interaction spaces, the number of infections to seed, the time at which these infections should be seeded, the compliance parameters, etc. The purpose of the calibration step is to identify these parameters to capture the city-specific trends and contact rates. We do this by choosing the initial number to seed, the time at which these are seeded, and the contact rates so that the initial trend of the disease is matched. Once calibrated, we can run our simulator for a certain number of days and understand how the epidemic spreads.

3. Validation. We next have to validate our simulator, so that we can understand its predictive power. For this, we look for phenomena in the real data that have not been explicitly modelled and we check if the simulator is able to capture these phenomena. For specific details, see Section IV.

4. Use of the simulator in an evolving pandemic. It is often the case that in evolving pandemics, predictions do not match reality as time unfolds. Models are often gross oversimplifications of the underlying complex reality, and assumptions are often wrong or may need updating as the pandemic evolves. The purpose of models in an evolving pandemic is not merely to predict numbers, in which task they will likely fail, but more to enable principled decision making on intervention strategies. They enable a study of the public health outcomes of one strategy versus another. Armed with these comparisons, public health officials can make more informed decisions. Needless to say, these decisions are often more complex and involve several aspects beyond just public health, e.g. economy, psychology, education, political climate, to name a few [footnote 5].

Footnote 5: For a proposal on how to simulate economic and public-health aspects together, see [20].

One of the powerful features of the agent-based simulator is its ability to explicitly control various interaction spaces and study the outcomes.
We demonstrate this feature via the case studies for Bengaluru and Mumbai listed in Table III.

We compare the following three scenarios in Bengaluru:
• No intervention other than contact tracing, testing and associated case isolation.
• Indefinite lockdown starting from 14 March 2020 onwards. This naturally will have enormous economic and societal cost, but we focus only on the direct COVID-19 public health outcomes.
• Scenario-2 in Table IV: soft ward containment, case isolation with testing and contact tracing, and a one-week lockdown during 14-21 July 2020.

We assume a compliance of 70% until 03 May 2020 (i.e. during the initial Karnataka-wide lockdown followed by the nation-wide lockdown) and a compliance of 60% starting 04 May 2020, for all these scenarios. That is, 70% (resp. 60%) of the population comply with the restrictions in place until 03 May 2020 (resp. starting 04 May 2020). Under these scenarios, we plot the following: daily cases (Figure 9), daily fatalities (Figure 10), cumulative cases (Figure 11), cumulative fatalities (Figure 12) and estimated hospital beds and critical care beds (Figure 13). We make the following observations.
• As one would expect, the least number of cases, fatalities and hospital bed requirements correspond to the "indefinite lockdown" scenario. However, this scenario has a serious impact on the economy, livelihoods, etc.
• In terms of the daily number of cases, the no intervention scenario had a peak around 01 June 2020 (with roughly 15,000 cases), whereas the present scenario in Bengaluru (i.e. Scenario-2 in Table IV) had a much lower peak around 15 July 2020 (with around 2000 cases), followed by another peak around the end of August. Similar trends can be seen in the fatalities estimates as well as the hospital bed estimates. Our health care system would have struggled with the no intervention scenario, and the present scenario in Bengaluru helped mitigate and delay the peak of the epidemic.
• The second predicted peak in Scenario-2 in Table IV is due to the one-week lockdown during 14-21 July 2020.
• Towards the end of July, we overpredict the number of daily fatalities and underpredict the number of daily cases. This could be because of two reasons:
1) The number of tests increased significantly during mid-July, due to which there is a likely surge in the number of asymptomatic cases. As a consequence, a reduction in the number of daily cases due to the one-week lockdown during 14-21 July is not observed in the reported number of daily cases; such a reduction is visible in our estimates because the testing regime is assumed constant through the period in our simulator.
2) There is a delay in reporting the fatalities. As the reported number of daily cases follows an exponential trend during early-mid July, one would expect a similar trend in the reported daily fatalities during end-July, as shown in our prediction of the daily fatalities under Scenario-2. However, we see a reduction in the reported number of daily fatalities after 15 July 2020. This could be due to a possible delay in reporting the daily fatalities, or an effective use of the rapid point-of-care antigen test kits, or a combination of both. Testing these hypotheses requires further investigation.

B. Case Study B: Impact of opening offices at 50% capacity with higher compliance versus lockdown at lower compliance

The degree of compliance among the population to public health directions/guidelines is an important factor that affects the epidemic. To understand the importance of compliance, we compare the following scenarios for Bengaluru: the present Bengaluru (i.e. Scenario-2 in Table IV), an unlocked Bengaluru (i.e. Scenario-1 in Table IV), and an unlocked Bengaluru with a higher compliance of 90% starting 04 May 2020 (i.e. Scenario-1 in Table IV with 70% compliance during 14 March 2020 - 03 May 2020 and 90% compliance starting 04 May 2020).
As before, we plot the following: daily cases (Figure 14), daily fatalities (Figure 15), cumulative cases (Figure 16), cumulative fatalities (Figure 17) and estimated hospital beds and critical care beds (Figure 18). We make two important observations:
• In terms of the number of cases and fatalities, the present Bengaluru (i.e. Scenario-2 in the general populace on the public health impact of their actions, to induce more prosocial behaviour, and to ensure greater compliance. This was the approach taken by Sweden, a country with a population of about 1 crore.
• Comparing Scenario-1 and Scenario-2, we see that the effect of the one-week lockdown during 14-21 July is minimal in the long term as far as the cumulative numbers of cases and fatalities are concerned. However, there is a significant difference in the cumulative numbers of cases and fatalities between Scenario-2 and Scenario-1 with a higher compliance of 90% starting 04 May 2020. This suggests that, given that vaccines for COVID-19 are not yet available, the benefit of short-term lockdowns is restricted to mobilising resources and preparing the healthcare system in the short term. On the other hand, higher compliance has a greater impact in reducing cases and fatalities.
• Trains-OFF: Suburban trains are not operational throughout.

As indicated in Table V, we assume a compliance factor of 60% in non-slums and 40% in slums. We plot our results in Figures 19-23.
• From the plots, we see that the phased opening of suburban trains starting 01 June 2020 gives a marginal increase in the number of cases, fatalities and hospital beds compared to the Trains-OFF scenario. This suggests that trains can be operated with enforcement
• Although we match the daily fatalities [footnote 7] curve very well, we over-predict the daily number of cases. We believe that this is due to the limitation on the testing capacity on the ground.
Because of this, the test results of many people arrive late and cases get reported with a certain delay. It is also worth mentioning that, although we overpredict the daily number of cases, we correctly capture the growth rate of the daily number of cases as well as the cumulative number of cases.

Footnote 7: We use a corrected version of the reported number of daily fatalities from Brihanmumbai Municipal Corporation (BMC). The initial reported daily fatalities curve from BMC had a very large peak at 16 June 2020. The corrected data adjusts the daily fatalities curve until 15 June 2020 so that the peak on 16 June 2020 gets re-distributed to the previous days in a suitable way.

We study the impact of two containment strategies for Bengaluru: soft ward containment (i.e., linearly-varying mobility control that turns an open ward into a locked ward when the number of hospitalised cases becomes 0.1% of the ward's population; in the latter locked scenario, only 25% mobility is allowed for essential services, see Figure 4) and neighbourhood containment (i.e., when an individual is hospitalised, everyone living in a 100m surrounding area undergoes home quarantine). Soft ward containment is a more feasible strategy than strict ward containment since the average ward population in Bengaluru is about 62,000. As the number of hospitalised cases in the ward increases, more public health wardens could be deployed to help reduce mobility and interaction in the ward. In Figures 24-28, we plot these two scenarios. We observe that neighbourhood containment is more effective than soft ward containment, in terms of cases and fatalities.

To compare various levels of strictness with which policies are enforced, we now consider the opening scenario indicated in Table V

We now study the impact of opening schools.
In Figures 34-38, we compare the following two scenarios:
• Schools-closed: The present scenario in Bengaluru, i.e., Scenario-2 in Table IV.
• Schools-open: Scenario-2 in Table IV with schools open from 01 September 2020.
As expected, both these scenarios follow the same trend until about mid-September, after which the disease spread increases in the latter. Around early November, we observe an increase of between 10% and 15% in the cumulative number of cases and the cumulative number

The first step in our agent-based model is to model a synthetic city that respects the demographics of the city that we want to study. Our city generator uses the following data as input:
• Geo-spatial data that provides information on the wards of a city (components) along with boundaries. (If this is not available, one could feed in ward centre locations and ward areas.)
• Population in each ward, with a break-up of those living in high-density and low-density areas.
• Age distribution in the population.
• Household size distribution (in high and low density areas) and some information on the age composition of the houses (e.g., generation gaps, etc.).
• The number of employed individuals in the city.
• Distribution of the number of students in schools and colleges.
• Distribution of the workplace sizes.
• Distribution of commute distances.
• Origin-destination densities that quantify movement patterns within the city.

Taking the above data into account, individuals, households, workplaces, schools, transport spaces, and community spaces are instantiated. Individuals are then assigned to households, workplaces or schools, transport and community spaces; see Figure 2 for a schematic representation. The algorithms for the assignments do a coarse matching. The matching may be refined as better data becomes available. The interaction spaces (households, workplaces or schools, transport and community spaces) reflect different social networks, and transmission happens along their edges.
There is interaction among these graphs because the nodes are common across the graphs; see Figure 39 for various interaction spaces and Figure 40 for a bipartite-graph abstraction of these interaction spaces. An individual of school-going age who is exposed to the infection at school may expose others at home. This reflects an interaction between the school graph and the household graph. The other graphs interact similarly.

We now describe how individuals are assigned to interaction spaces. The households are then assigned to wards so that the total number of individuals in the ward is in proportion to population density in the ward, taken from census data. A population density map is given in Figure 41.

Footnote 8: The unemployed fraction in Bengaluru, from the census data, is just over 50%, even after taking into account employment in the unorganised sector. The case is similar in Mumbai. This may have some bearing on the epidemic spread.

In past works, given the structure of educational institutions elsewhere, educational institutions have been divided into primary schools, secondary schools, higher secondary schools, and universities. The norm in Indian urban areas is that schools handle primary to higher secondary students and then colleges handle undergraduates. We view all such entities as schools. We assign students to schools on a ward-by-ward basis. In each ward, we have a certain number of students. We pick a school size from a given school size distribution, instantiate a school of this size, and place it randomly in that ward. Students who live in that ward are picked randomly and assigned to this school until that school is filled to its capacity. We repeat this procedure until all students in that ward get assigned to a school, and then we repeat this procedure for all wards. This procedure could lead to at most one school per ward whose capacity is more than its sampled capacity.
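The ward-by-ward school assignment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulator's implementation: the function name `assign_schools`, the dictionary-based data layout, and the discrete size distribution passed as `size_values` and `size_weights` are all hypothetical, and the random placement of each school inside its ward is omitted. The sketch simply leaves the last school of a ward partly filled when the students run out, which may differ from how the simulator treats the remainder.

```python
import random


def assign_schools(students_by_ward, size_values, size_weights, rng):
    """Assign every student to a school located in the student's own ward.

    students_by_ward: {ward_id: [student_id, ...]}  (hypothetical layout)
    size_values, size_weights: discrete school-size distribution (assumed)
    rng: random.Random instance, passed in for reproducibility
    Returns (assignment, schools) where assignment maps a student to a
    (ward_id, school_index) pair and schools lists the schools per ward.
    """
    assignment = {}
    schools = {}
    for ward, students in students_by_ward.items():
        pool = list(students)
        rng.shuffle(pool)  # a shuffled order is equivalent to random picking
        schools[ward] = []
        start = 0
        while start < len(pool):
            # Sample a capacity, instantiate the school, then fill it.
            capacity = rng.choices(size_values, weights=size_weights, k=1)[0]
            members = pool[start:start + capacity]
            school_id = (ward, len(schools[ward]))
            schools[ward].append({"capacity": capacity, "members": members})
            for student in members:
                assignment[student] = school_id
            start += capacity
    return assignment, schools
```

Passing the random generator explicitly keeps a synthetic city reproducible across runs, which is convenient when the same city is reused for several intervention scenarios.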
Assignment of workplaces: Workplace interactions can enable the spread of an epidemic. In principle, Bengaluru's and Mumbai's land-use data could be used to locate office spaces. The assignment of individuals to workplaces is done in two steps. In the first step, for each individual who goes to work, we decide the ward where his/her office is located. This assignment of a "working ward" is based on an Origin-Destination (OD) matrix. An OD matrix is a square matrix whose number of rows equals the number of wards, and its (i, j)th entry tells us the fraction of people who travel from ward i to ward j for work. In the second step, for each ward, we sample a workplace size from a workplace size distribution, create a workplace of this capacity and place it uniformly-at-random in that ward. We then randomly assign individuals who work in that ward to this workplace. Similar to assignment of schools, we continue to create workplaces in this ward until every individual working in that ward gets assigned to a workplace, and we repeat this procedure for all wards. For Bengaluru, the OD matrix is obtained from the regional travel model used for Bengaluru. For Mumbai, based on the "zone to zone" travel data from [21] , an origin-destination matrix was extrapolated based on the population of each ward. The above assignments could be improved further in later versions of this simulator. Community spaces: Community spaces include day care centres, clinics, hospitals, shops, markets, banks, movie halls, marriage halls, malls, eateries, food messes, dining areas and restaurants, public transit entities like bus stops, metro stops, bus terminals, train stations, airports, etc. While we hope to return to model a few of the important ones explicitly at a later time, we proceed along the route taken by [22] with two modifications. 
In our current implementation, each individual sees one community that is personalised to the individual's location and age and one transport space personalised to the individual's commute distance. For ease of implementation, the personalisation of the community space is based on ward-level common local communities and a distance-kernel based weighting. The personalisation of the transport space is based on commute distance. Details are given in Section IV-C.

Age-stratified interaction: The interactions across these communities could be age-stratified. This may be informed by social network studies, e.g., as in [23], which has been used in a recent compartmentalised SEIR model [24].

Smaller subnetworks: We create smaller subnetworks in workplaces, schools and communities, and associate a certain number of people with each of these smaller networks, with the interpretation that people in a smaller subnetwork have a higher contact rate among themselves than with others. In some more detail, we create "project" networks at each workplace consisting of people in that workplace having closer interaction, a "class" network in each school consisting of students of the same age, a random community network among people in a given ward to model daily random interactions, and a neighbourhood subnetwork among people living in a 178 m × 178 m square [footnote 9]. These subnetworks are later used for identifying and testing/quarantining individuals based on a contact tracing protocol.

The output of all the above is our synthetic city on which infection spreads. Figure 42 provides an indication of how close our synthetic city is to the true city in terms of the indicated statistics.

We have used a simplified model of COVID-19 progression, based on descriptions in [25] and [8]. This will need updating as we get India-specific data.
An individual may have one of the following states, see Figure 43: susceptible, exposed, infective (pre-symptomatic or asymptomatic), recovered, symptomatic, hospitalised, critical, or deceased. We assume that initially the entire population is susceptible to the infection. Let $\tau$ denote the time at which an individual is exposed to the virus, see Figure 43. The incubation period is random with the Gamma distribution of shape 2 and scale 2.29; the mean incubation period is then 4.58 days (4.6 days in [8] and 4.58 in [26]). Individuals are infectious for an exponentially distributed period of mean duration 0.5 days. This covers both pre-symptomatic transmission and possible asymptomatic transmission. We assume that a third of the patients recover without developing symptoms (these are the asymptomatic patients); the remaining two-thirds develop symptoms. Estimates of the fraction of asymptomatic patients vary from 0.2 to 0.6. Though we have explored other asymptomatic fractions, we restrict attention here to 1/3. Symptomatic patients are assumed to be 1.5 times more infectious during the symptomatic period than during the pre-symptomatic but infective stage. Individuals either recover or move to the hospital after a random duration that is exponentially distributed with a mean of 5 days [footnote 10]. The probability that an individual recovers depends on the individual's age [footnote 11]. It is also assumed that recovered individuals are neither infective nor susceptible to a second infection. While hospitalised individuals may continue to be infectious, they are assumed to be sufficiently isolated, and hence do not further contribute to the spread of the infection. Further progression of hospitalised individuals to critical care is mainly for assessing the need for hospital beds, intensive care unit (ICU) beds, critical care equipment, etc. This will need to be adapted to our local hospital protocol. Let us reiterate.
Once a susceptible individual has been exposed, the trajectory in Figure 43 takes over for that individual. Further progressions are (in our current implementation) based only on the agent's age. At each time $t$, an infection rate $\lambda_n(t)$ is computed for each individual $n$ based on the prevailing conditions. In the time duration $\Delta t$ following time $t$, each susceptible individual moves to the exposed state with probability $1 - \exp\{-\lambda_n(t) \cdot \Delta t\}$, independently of all other events. Other transitions are as per the disease progression described earlier. Time is then updated to $t + \Delta t$, and the conditions are updated to reflect the new exposures, changes to infectiousness, hospitalisations, recoveries, contact tracing, quarantines, tests, test outcomes, etc., during the period $t$ to $t + \Delta t$. The process outlined at the beginning of this paragraph is repeated until the end of the simulation. $\Delta t$ was taken to be 6 hours in our simulator and is configurable.
Additionally, each individual has two other parameters: a severity variable $C_n$ and a relative infectiousness variable $\rho_n$, see [22]. Both bring heterogeneity into the model. Severity $C_n = 1$ if the individual suffers from a severe infection and $C_n = 0$ otherwise; $C_n = 1$ is sampled with probability 0.5, independently of all other events. Infectiousness $\rho_n$ is a random variable that is Gamma distributed with shape 0.25 and scale 4 (so the mean is 1). The severity variable captures severity-related absenteeism at school/workplace, the associated decrease of infection spread at school/workplace, and the increase of infection spread at home. If the individual gets exposed at time $\tau_n$, a relative infection-stage-related infectiousness is taken to be $\kappa(t - \tau_n)$ at time $t$. For the disease progression described in the previous section, this is 1 in the pre-symptomatic and asymptomatic stages, 1.5 in the symptomatic, hospitalised, and critical stages, and 0 in the other stages.
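The per-step exposure rule and the sampling of the individual parameters described above can be sketched as follows. This is a minimal illustration, not the simulator's actual code: the function names, the agent identifiers, and the example rates are hypothetical, and we assume that $\lambda_n(t)$ is expressed per day so that the 6-hour step corresponds to $\Delta t = 0.25$.

```python
import math
import random

DT_DAYS = 0.25  # 6-hour time step, expressed in days (assumes rates are per day)


def exposure_probability(rate_per_day, dt_days=DT_DAYS):
    """Probability 1 - exp(-lambda * dt) of exposure within one time step."""
    return 1.0 - math.exp(-rate_per_day * dt_days)


def sample_traits(rng):
    """Draw (C_n, rho_n): severity indicator and relative infectiousness."""
    severity = 1 if rng.random() < 0.5 else 0
    infectiousness = rng.gammavariate(0.25, 4.0)  # shape 0.25, scale 4, mean 1
    return severity, infectiousness


def sample_incubation_days(rng):
    """Incubation period: Gamma with shape 2 and scale 2.29 (mean 4.58 days)."""
    return rng.gammavariate(2.0, 2.29)


def newly_exposed(rates, rng, dt_days=DT_DAYS):
    """Return the ids of susceptible agents that become exposed in this step.

    `rates` maps a (hypothetical) agent id to its current rate lambda_n(t).
    Each agent is tested independently of all others.
    """
    return {
        agent
        for agent, rate in rates.items()
        if rng.random() < exposure_probability(rate, dt_days)
    }


rng = random.Random(0)
step_rates = {"a": 0.0, "b": 0.4, "c": 2.0}  # illustrative per-day rates
exposed_now = newly_exposed(step_rates, rng)
```

An agent with zero rate is never exposed, and the exposure probability saturates at 1 as the rate grows, which is the reason for using the exponential form rather than the linear approximation $\lambda_n(t) \cdot \Delta t$.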
To describe the infection spread at transport spaces, let $T(n) = 1$ if agent $n$ uses public transport and let $T(n) = 0$ otherwise. Let $A_{n,t} = 0$ if at time $t$ agent $n$ is either (i) compliant and under quarantine, (ii) hospitalised, (iii) critical, or (iv) dead, and let $A_{n,t} = 1$ if none of the above is true and agent $n$ attends office at time $t$. We model the effectiveness of masks by reducing the ability of an infectious individual to transmit the infection by 20% if a mask is worn (see [13]-[16]); let $M_n = 0.8$ if agent $n$ wears a face mask in public transport and $M_n = 1$ otherwise.
Let $\beta_h$, $\beta_s$, $\beta_w$, $\beta_T$, $\beta_c$, $\beta_h^*$, $\beta_s^*$, $\beta_w^*$ and $\beta_c^*$ denote the transmission coefficients at home, school, workplace, transport, community spaces, neighbourhood network, class network, project network and random community network, respectively. These can be viewed as scaled contact rates with members in the household, school, workplace, community, neighbourhood, class, project and random community, respectively. More precisely, these are the expected numbers of eventful (infection spreading) contact opportunities in each of these interaction spaces. They account for the combined effect of the frequency of meetings and the probability of infection spread during each meeting. For a susceptible individual, the rate of transmission is governed by the sum, over all the interaction spaces, of the products of contact rate $\beta$ and infectiousness. To model infectiousness, we consider three scenarios.
Interactions without age-stratification: This is the simplest model, where interactions within each network are assumed to be homogeneous.
A susceptible individual $n$ (who belongs to home $h(n)$, school $s(n)$, workplace $w(n)$, transport space $T(n)$, and community space $c(n)$) sees an infection rate $\lambda_n(t)$ at time $t$ given by expression (1), where
$$h_{c,c'}(t) = \sum_{n' : c(n') = c'} f(d_{n',c(n')}) \cdot \zeta(a_{n'}) \cdot I_{n'}(t)\, \beta_c\, r_c\, \kappa(t - \tau_{n'})\, \rho_{n'}\, (1 + C_{n'}(\omega - 1)). \qquad (2)$$
The expression (1) can be viewed as the rate at which the susceptible individual $n$ contracts the infection at time $t$. The components on its right-hand side indicate the rates from home, school, workplace, transport space, and community. The additional quantities, over and above what we have already described, are as follows. The parameter $\alpha$, a crowding-at-household factor, determines how the household transmission rate scales with household size. It increases the propensity to spread the infection by a factor $n^{1-\alpha}$. We have taken $\alpha = 0.8$, see [22]. A common parameter $\omega$ indicates how a severely infected person affects a susceptible one, as will be clear from below. (This is to be tuned at a later stage and is set to 2 now.) The functions $\psi_s(\cdot)$ and $\psi_w(\cdot)$ account for absenteeism in case of a severe infection. They can be time-varying and can depend on the school or workplace. We take $\psi_s(t) = 0.1$ and $\psi_w(t) = 0.5$ while infective and after one day since the onset of infectiousness. School-goers with severe infection contribute less to the infection spread, due to higher absenteeism, than those who go to workplaces; moreover, the absenteeism results in an increased spreading rate at home. The function $\zeta(a)$ is the relative travel-related contact rate of an individual aged $a$. The function $f(\cdot)$ is a distance kernel that can be matched to the travel patterns in the city. Finally, our choice of the infection rate from the community space is a little different from the rate specified in [22], in order to enable an efficient implementation. When the distance kernel is $f(d) = 1/(1 + (d/a)^b)$ and $d \ll a$, i.e., the wards are small, our specification is close to that indicated in [22].
We take $a = 10.751$ km and $b = 5.384$, based on a fit on data for Bengaluru. As one can see from (1), we have one community space but with contributions from various wards. This enables inclusion of 'containment zones' and the associated restriction of interaction across such zones, as we shall soon describe.
Age-stratified interactions: If this is enabled, the home, school, workplace and community interaction rates have an extra factor $M^h_{n,n'}$, $M^s_{n,n'}$, and $M^w_{n,n'}$ in the summand, which accounts for age-stratified interactions. Each of these depends on $n$ and $n'$ only through the ages of agents $n$ and $n'$. The resulting contact rate for individual $n$ at time $t$ is then given by expression (3), where $h_{c,c'}(t)$ is given in (2). Computational complexity can be reduced by focusing only on the principal components of $M^h$, $M^s$, and $M^w$.
Interactions with smaller subnetworks: In this situation, we have additional contact rate parameters, one for each smaller subnetwork: let $\beta_h^*$, $\beta_s^*$, $\beta_w^*$ and $\beta_c^*$ denote the transmission coefficients at the neighbourhood network, class network, project network and random community network, respectively. Then, an agent $n$ (who belongs to neighbourhood network $\mathcal{H}(n)$, class $\mathcal{S}(n)$, project $\mathcal{W}(n)$ and random community $\mathcal{C}(n)$, in addition to home $h(n)$, school $s(n)$, workplace $w(n)$, transport space $T(n)$, and community space $c(n)$) sees the following infection rate at time $t$:
$$\begin{aligned}
\lambda_n(t) ={}& \sum_{n' : h(n') = h(n)} \frac{1}{n_{h(n)}^{\alpha}} \cdot I_{n'}(t)\, \beta_h\, \kappa(t - \tau_{n'})\, \rho_{n'}\, (1 + C_{n'}(\omega - 1)) \\
&+ \zeta(a_n) \sum_{n' : \mathcal{H}(n') = \mathcal{H}(n)} \frac{1}{n_{\mathcal{H}(n)}} \cdot \zeta(a_{n'})\, I_{n'}(t)\, \beta_h^*\, \kappa(t - \tau_{n'})\, \rho_{n'}\, (1 + C_{n'}(\omega - 1)) \quad \text{(larger neighbourhood interaction)} \\
&+ \zeta(a_n)\, \frac{f(d_{n,c(n)})}{\sum_{n' : \mathcal{C}(n') = \mathcal{C}(n)} f(d_{n',c(n')})} \times \sum_{n' : \mathcal{C}(n') = \mathcal{C}(n)} f(d_{n',c(n')})\, \zeta(a_{n'})\, I_{n'}(t)\, \beta_c^*\, \kappa(t - \tau_{n'})\, \rho_{n'}\, (1 + C_{n'}(\omega - 1)) \quad \text{(close friends' circle interaction)} \\
&+ \cdots \quad \text{(project network interaction)} \qquad (4)
\end{aligned}$$
where $h_{c,c'}(t)$ is given in (2). The subnetwork interactions are stronger contexts for disease spread.
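Two ingredients of the rates above are fully specified in the text: the distance kernel $f(d) = 1/(1 + (d/a)^b)$ with the Bengaluru fit, and the household term with crowding exponent $\alpha = 0.8$ and severity multiplier $\omega = 2$. The sketch below evaluates both. The `Housemate` record, the function names, and the value of $\beta_h$ used in the example are hypothetical, and we assume that $n_{h(n)}$ counts every member of the household.

```python
from dataclasses import dataclass

A_KM = 10.751   # kernel scale a, fitted on Bengaluru data
B_EXP = 5.384   # kernel exponent b, fitted on Bengaluru data


def distance_kernel(d_km, a=A_KM, b=B_EXP):
    """f(d) = 1 / (1 + (d/a)^b) for a non-negative distance d in km."""
    return 1.0 / (1.0 + (d_km / a) ** b)


@dataclass
class Housemate:
    infective: bool   # I_{n'}(t): True if currently infective
    kappa: float      # stage-related infectiousness kappa(t - tau_{n'})
    rho: float        # relative infectiousness rho_{n'}
    severe: int       # severity indicator C_{n'} in {0, 1}


def household_rate(members, beta_h, alpha=0.8, omega=2.0):
    """Household contribution to lambda_n(t) for a susceptible member.

    `members` lists everyone in the household; the sum runs over the
    infective ones and is scaled by 1 / size^alpha.
    """
    size = len(members)
    if size == 0:
        return 0.0
    total = sum(
        m.kappa * m.rho * (1.0 + m.severe * (omega - 1.0))
        for m in members
        if m.infective
    )
    return beta_h * total / size ** alpha


home = [
    Housemate(True, 1.5, 1.0, 1),   # symptomatic, severe
    Housemate(False, 0.0, 1.0, 0),  # susceptible
    Housemate(False, 0.0, 1.0, 0),
    Housemate(False, 0.0, 1.0, 0),
]
rate_at_home = household_rate(home, beta_h=1.0)  # beta_h = 1.0 is illustrative
```

With these parameters the kernel equals 1 at zero distance and 1/2 at $d = a$, and the large exponent $b$ makes it fall off sharply beyond roughly 10 km.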
Contact tracing targets exactly these subnetworks for additional testing, case isolation or quarantine.
Two methods of seeding the infection have been implemented.
• A small number of individuals can be set to either the exposed, pre-symptomatic/asymptomatic, or symptomatic state at time $t = 0$ to seed the infection. This can be done randomly based on ward-level probabilities, which could be input to the simulator, or uniformly at random across all wards in the city.
• A seeding file indicates the average number of individuals who should be seeded on each day in the first stage of infectiousness (pre-symptomatic or asymptomatic). This could be done based on data for patients with a foreign travel history who eventually visited a hospital. A certain multiplication factor then accounts for the asymptomatic and the symptomatic individuals who recover without the need to visit the hospital. The seeding is done at a random time earlier in the timeline, based on the disease progression.
We calibrate our model by tuning the transmission coefficients at various interaction spaces under the no-intervention scenario in order to match the cumulative fatalities to a target curve. We assume a common upscaling factor $\tilde{\beta}$ for the transmission coefficients of smaller subnetworks, i.e., we set $\beta_w^* = \tilde{\beta}\beta_w$, $\beta_s^* = \tilde{\beta}\beta_s$ and $\beta_h^* = \beta_c^* = \tilde{\beta}\beta_c$. We assume that $\tilde{\beta} = 9$, indicating that the subnetworks account for 90% of the overall contacts (a share of 9/(1 + 9)). The following heuristic iterative algorithm, inspired by stochastic approximation, is then used to identify the best choice of the free parameters. In this algorithm, $[\exp(m^* - m(n))]_a^{1/a} = \min\{\max\{\exp(m^* - m(n)), a\}, 1/a\}$, $\Lambda_h(n)$ (resp. $\Lambda_w(n)$, $\Lambda_c(n)$) is the fraction of infections from home (resp.
workplace, community) in the $n$th iteration, $\Lambda_h^* = \Lambda_w^* = \Lambda_c^* = 1/3$, and $m^*$ is the target slope (the target slope is similarly computed from the cumulative fatalities data in log scale; for example, the India fatalities curve in the range 130-199 gives a slope of $m^* = 0.1803$). Once the slopes are matched, assuming that the simulator starts on 01 March 2020, we find the delay between the fatalities curve from the simulator and the target data. We then use the resulting contact rates and the above calibration delay to launch our simulations. To avoid any oscillatory behaviour of the calibration algorithm, we also set the scale factor in each of the above update steps to be $[\exp((m^* - m(n))/n)]$.
We do not calibrate $\beta_T$, the transmission coefficient at the transport space. For the calibration step we take this parameter to be zero while tuning the other parameters. A heuristic justification is as follows. For Bengaluru, travel interactions are likely to be captured through the local community interactions, so we keep $\beta_T$ at zero throughout, even in the case studies. For Mumbai, however, local trains are a key mode of daily transportation, with a population of the order of 75 lakh travelling daily using this mode in normal times. However, trains were stopped in Mumbai prior to the national lockdown and were running below capacity for at least a week before that. Moreover, the initial infections were seeded by travellers who came from abroad. The primary mode of travel for this group is unlikely to be rail transport. So we disabled the transport space while calibrating by setting $\beta_T = 0$. Subsequently, for the trains on/off case study, we used a heuristic calculation of $\beta_T$; see [28, Section IV]. The above procedures identify the contact-related parameters.
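Two pieces of the calibration procedure are specified precisely enough in the text to be sketched: the slope of the cumulative fatalities curve in log scale, and the scale factor clipped to the interval $[a, 1/a]$. The sketch below is illustrative only. The function names are hypothetical, the least-squares fit is our assumption about how the slope is obtained, the default $a = 0.5$ is a placeholder since the text does not state the value used, and whether the damped factor is also clipped is likewise an assumption. How the factor enters the update of each transmission coefficient is not shown here.

```python
import math


def log_slope(cumulative_fatalities):
    """Least-squares slope of log(cumulative fatalities) against day index."""
    if len(cumulative_fatalities) < 2:
        raise ValueError("need at least two data points")
    if min(cumulative_fatalities) <= 0:
        raise ValueError("fatality counts must be positive to take logs")
    ys = [math.log(v) for v in cumulative_fatalities]
    n = len(ys)
    x_mean = (n - 1) / 2.0
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


def clipped_scale(target_slope, current_slope, iteration, a=0.5, damped=True):
    """Scale factor exp(m* - m(n)), clipped to [a, 1/a] with 0 < a < 1.

    With damped=True the exponent is divided by the iteration number n,
    which shrinks the correction as the calibration proceeds.
    """
    exponent = target_slope - current_slope
    if damped:
        exponent /= iteration
    return min(max(math.exp(exponent), a), 1.0 / a)


# Synthetic curve growing at 0.18 per day, starting from 130 fatalities.
curve = [130.0 * math.exp(0.18 * day) for day in range(10)]
m_sim = log_slope(curve)
factor = clipped_scale(target_slope=0.1803, current_slope=m_sim, iteration=3)
```

A factor above 1 signals that the simulated curve grows too slowly relative to the target, and a factor below 1 signals the opposite; clipping prevents any single iteration from changing the coefficients by more than a factor of $1/a$.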
Other parameters are the distance kernel parameters, the parameter $\alpha$ that accounts for crowding in households, the age-stratified interactions, the distribution parameters for individual infectiousness, the probability of severity, etc. These are set as follows:
The simulator has the capability to accommodate interventions and compliance. Table VIII describes some of the interventions in [8], some adapted to suit our demographics, and some new interventions involving the nation-wide 40-day 'lockdown' in India and various scenarios of 'unlock'. These are fairly straightforward to implement: we modulate an individual's contact rate with an interaction space (both into the interaction space and out of the interaction space) by a suitable factor associated with the intervention. For example, one could easily implement and study cyclic exit strategies as done in [29]. The triggers for cyclic controls could be based on signals such as the number of individuals who are hospitalised, as done in our soft ward containment. Yet another intervention is to quarantine or case isolate based on contact tracing, as we will describe next.
Our simulator also includes a framework to study the impact of early contact tracing and testing. We assume that contacts of an individual in the smaller networks, such as the neighbourhood network, project network, class network and random community network, can be identified and tested/quarantined. The current contact tracing protocol quarantines certain primary contacts and tests a subset of these (e.g., symptomatic primary contacts). In our implementation, based on our study of ICMR's testing protocol, given an index case, all household members, a fraction of the friends circle, a fraction of the inner school/workplace circle, and a fraction of the neighbourhood community are termed primary contacts of this index case. All of these are quarantined, and a fraction of the symptomatic and another fraction of the asymptomatic among these are tested.
Those who test positive become new index cases and spawn further contact tracing. The testing fractions are calibrated to match the actual reported cases and the test-positivity rate.
We list some limitations of our simulator.
• We do not have activity modelling in our simulator. As a consequence, weekly and daily patterns of interactions are not taken into account; for instance, the absence of interaction in workplaces and schools during weekends/public holidays, and an increased interaction in public transport during morning and evening peak hours, are not taken into account in our model. Instead, all these factors are abstracted into a single infection rate for each individual prescribed by (1), (3) and (4).
• Some of the data that we need in our simulator, such as the household size distribution, workplace size distribution, school size distribution, commuter distance distribution, etc., may be difficult to obtain for some cities.
• We have too many free parameters in our model. This can lead to overfitting, resulting in high generalisation error.
• The framework is computationally intensive.
• Since the disease spread model has quite a bit of stochasticity (e.g., the incubation time), we need to perform multiple runs of our simulator and take an average of the outputs. We do not have an estimate of the variability of our outputs across multiple runs; such an analysis will be essential to determine the number of runs we need to perform in order for our outputs to be close to the average.
Generation of a synthetic city is performed via the following steps.
1) Data gathering and data preparation involves the following. For instantiating households in high-density areas, we sample locations either from a GeoJSON file with boundaries of the high-density areas or from a collection of pre-sampled locations of households in high-density areas.
Common areas where community interactions take place are instantiated at the ward centres, assumed to be the centroids of the polygons. These tasks are accomplished using the following Python packages: numpy, random, pandas and shapely. The outputs of this stage are collections of the instantiated individuals and their assigned households, schools, workplaces, transport and community areas.
(c) Additional processing for generating city files: Before generating the city files, additional processing is done on the dataframes, which includes computing the distance of the individuals to their respective ward centres. This stage uses the pandas package for processing and generating the city files in the JSON file format for each instantiated collection, namely the individuals, households, workplaces, schools, community centres, and distances between wards.
The disease progression part of the simulator is broadly implemented as follows. There are four time steps on a given day. At each time step, we go through each susceptible agent and find out the infection rate given by either (1), (3) or (4).
• Contact tracing requires us to maintain a list of contacts made by each agent. In our implementation, we assume that each individual has a certain number of contacts that we can trace (which is random, but independent of $n$). As a result, the space complexity becomes $O(n)$ instead of $O(n^2)$.
• In the age-stratified interaction as well as the OD-matrix based distance kernel, we consider dominant terms of the age-based contact rate matrix and of the OD-matrix by doing a principal component analysis and focusing on a few important components. This helps simplify the summations in (3).
These optimisation features appear to be novel features of our simulator.
In this work, we built an agent-based simulator to study the impact of various non-pharmaceutical interventions in the context of the ongoing COVID-19 pandemic.
We demonstrated the capabilities of our simulator via various case studies for Bengaluru and Mumbai. Some of the key features of our simulator include age-stratified interaction that captures heterogeneity in interaction among people in a given interaction space, and the ability to implement various interventions such as soft ward containment, phased opening of workplaces and community spaces, a broad class of contact tracing based testing and case isolation protocols, etc. These features help our simulator to capture the ground reality well and provide us with realistic predictions. Some future directions include bringing in movement of people into and out of the city and studying the impact of various mobility patterns, modelling and studying the impact of public-health oriented decisions on the economy, incorporating activity modelling into our simulator, and using the simulator to obtain district-scale or country-scale predictions. We hope that such agent-based simulators find a regular place in every public health official's tool kit.
[1] Ministry of Health and Family Welfare, Government of India
[2] COVID-19 India-Timeline: An Understanding across States and Union Territories
[3] Crowdsourced COVID19-India Database
[4] Containing papers of a mathematical and physical character
[5] INDSCI-SIM A state-level epidemiological model for India
[6] A minimal and adaptive prediction strategy for critical resource planning in a pandemic
[7] Spatio-temporal predictive modeling framework for infectious disease spread
[8] Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand
[9] Intervention strategies against COVID-19 and their estimated impact on Swedish healthcare capacity
[10] Modeling targeted layered containment of an influenza pandemic in the United States
[11] A taxonomy for agent-based models in human infectious disease epidemiology
[12] Modelling disease outbreaks in realistic urban social networks
[13] Face mask use and control of respiratory virus transmission in households
[14] Mask use, hand hygiene, and seasonal influenza-like illness among young adults: A randomized intervention trial
[15] Respiratory virus shedding in exhaled breath and efficacy of face masks
[16] Physical distancing, face masks, and eye protection to prevent person-person COVID-19 transmission: A systematic review and meta-analysis
[17] Consolidated Revised Guidelines on the measures to be taken by Ministries/Departments of Government of India, State/UT Governments and State/UT authorities for containment of COVID-19 in the country
[18] MHA order dt. 30.5.2020 on phased re-opening (unlock 1)
[19] COVID-19 Community Mobility Report
[20] Epidemiologically and socio-economically optimal policies via Bayesian optimization
[21] Urban Poverty and Transport: The Case of Mumbai. The World Bank
[22] Strategies for containing an emerging influenza pandemic in Southeast Asia
[23] Projecting social contact matrices in 152 countries using contact surveys and demographic data
[24] Age-structured impact of social distancing on the COVID-19 epidemic in India
[25] Estimates of the severity of coronavirus disease 2019: a model-based analysis
[26] The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study
[27] European Centre for Disease Prevention and Control's COVID-19 website
[28] COVID-19 Epidemic Study II: Phased Emergence From the Lockdown in Mumbai
[29] Adaptive cyclic exit strategies from lockdown to suppress COVID-19 and allow economic activity