key: cord-121935-uilzmmxu
authors: Mo, Baichuan; Feng, Kairui; Shen, Yu; Tam, Clarence; Li, Daqing; Yin, Yafeng; Zhao, Jinhua
title: Modeling Epidemic Spreading through Public Transit using Time-Varying Encounter Network
date: 2020-04-09
journal: nan
DOI: nan
sha: 
doc_id: 121935
cord_uid: uilzmmxu

Passenger contact in public transit (PT) networks can be a key mediate in the spreading of infectious diseases. This paper proposes a time-varying weighted PT encounter network to model the spreading of infectious diseases through the PT systems. Social activity contacts at both local and global levels are also considered. We select the epidemiological characteristics of coronavirus disease 2019 (COVID-19) as a case study along with smart card data from Singapore to illustrate the model at the metropolitan level. A scalable and lightweight theoretical framework is derived to capture the time-varying and heterogeneous network structures, which enables to solve the problem at the whole population level with low computational costs. Different control policies from both the public health side and the transportation side are evaluated. We find that people's preventative behavior is one of the most effective measures to control the spreading of epidemics. From the transportation side, partial closure of bus routes helps to slow down but cannot fully contain the spreading of epidemics. Identifying"influential passengers"using the smart card data and isolating them at an early stage can also effectively reduce the epidemic spreading.

Infectious diseases spread through social contacts, such as at schools (Salathé et al., 2010; Litvinova et al., 2019) or conferences (Stehlé et al., 2011) . Past studies proved that human mobility networkslike air transportation (Colizza et al., 2006; Balcan et al., 2009) or waterways Gatto et al., 2012 )-could transport pathogens or contagious individuals to widespread locations, leading to the outbreak of epidemics. Recently, the outbreak of coronavirus disease 2019 confirmed the strong connection between human mobility network and disease dynamics. The first case of COVID-19 was reported in Wuhan, China, at the beginning of Dec. 2019, and has then quickly spread to the rest of China through airlines and high-speed rail networks during the Spring Festival travel season (Wu et al., 2020) . Besides the transmissions of pathogens to destination local communities via human mobility network, the densely populated urban public transit (PT) network, however, may also become a key mediate in the spreading of influenza-like epidemics with public transport carriers being the location of transmission (Sun et al., 2013 ).

The PT system in large metropolitan areas plays a key role in serving the majority of urban commuting demand between highly frequented locations, as human trajectories present a high degree of spatiotemporal regularity following simple, reproducible patterns (Gonzalez et al., 2008) . By the end of 2017, the annual patronage of urban metro systems worldwide increased from 44 billion in 2013 to 53 billion, and in Asia, the systems carry more than 26 billion passengers a year (International Association of Public Transport (UITP), 2018). The urban PT system is often framed as a key solution for building sustainable cities with concerns of environment, economy, and society's effectiveness (Miller et al., 2016) . But the indoor environment created by crowded metro carriages or buses can also make an infected individual easily transmit the pathogen to others via droplets or airborne routes (Xie et al., 2007; Yang et al., 2009) . In recent years, scholars began to turn their attention to the spreading of the epidemic through the urban PT network.

Rooted in peoples daily behavior regularity, individuals with repeated encounters in the PT network are found to be strongly connected over time, resulting in a dense contact network across the city (Sun et al., 2013) . Such mobility features lead to great risks for the outbreak of infectious diseases through bus and metro networks to the whole metropolitan area (Sun et al., 2014; Liu et al., 2019) . Based on the contact network developed by Sun et al. (2013) , a variation of human contact network structures has been proposed to characterize the movement and encounters of passengers in the PT system, which are then used to model the epidemic spreading among passengers (Bóta et al., 2017a,b; Hajdu et al., 2019; El Shoghri et al., 2019) .

However, previous studies focusing on the human contacts in PT systems often use a static passenger contact network, discarding the time-varying nature of encounters. The aggregation of the time-varying edges into a static version of the network offers useful insights but can also introduce bias (Perra et al., 2012; Coviello et al., 2016) . Besides, most of the previous studies focused on understanding epidemic spreading and identifying the risks in the PT system. Few studies have discussed the PT operation-related epidemic control strategies. PT operation plays an important role in controlling epidemics. Recently, a variety of epidemic control strategies in PT systems have been implemented to respond to the outbreak of COVID-19 since late January 2020. For example, in Wuhan, almost all PT services have been shut down since Jan.

24th. In Wuxi, another Chinese big city, except the 22 arterial bus routes kept running with shortened operation hours, all other PT services (roughly 92% of bus routes) were suspended since Feb. 1st. In Milan, Italy, the PT services were still in operation, but the suspension of PT has been officially proposed with the rapid surge of COVID-19 cases in the Lombardy area. The impacts of these strategies and other possible PT operation strategies (e.g., distributing passengers' departure time, limiting maximum bus load), however, have seldom been carefully explored.

To fill this gap, this study proposes a time-varying weighted PT encounter network (PEN) to model the spreading of the epidemic through urban PT systems. The social activity contacts at both local and global levels are also considered. We select the epidemiological characteristics of COVID-19 as the case study along with high-resolution smart card data from Singapore to illustrate the model at the metropolitan level.

Different control policies from both the public health side and the transportation side are evaluated.

In this work, we do not attempt to reproduce or predict the patterns of COVID-19 spreading in Singapore,

where a variety of outbreak prevention and control measures have been implemented (Ministry of Health (MOH), 2020) and make most of epidemic prediction models invalid. Instead, since the PT systems in many cities share the similar contact network structure despite the differences in urban structures, PT network layouts and individual mobility patterns (Qian et al., 2020) , this study aims to employ the smart card data and the PT network of Singapore as proximity to the universal PEN to better understand the general spatiotemporal dynamics of epidemic spreading over the PT system, and to evaluate the potential effects of various measures for epidemic prevention in the PT systems, especially from the PT operation angle.

The main contribution of this paper is threefold:

• Propose a PT system-based epidemic spreading model using the smart card data, where the timevarying contacts among passengers at an individual level are captured.

• Propose a novel theoretical solving framework for the epidemic dynamics with time-varying and heterogeneous network structures, which enables to solve the problem at the whole population level with low computational costs.

• Evaluate various potential epidemic control policies from both public health side (e.g., reducing infectious rate) and transportation side (e.g., distributing departure time, closing bus routes)

The rest of the paper is organized as follows. In Section 2, we elaborate on the methodology of establishing contact networks and solving the epidemic transmission model. Section 3 presents a case study using the smart card data in Singapore to illustrate the general spatiotemporal dynamics of epidemic spreading through the PT system. In Section 4, conclusions are made and policy implications are offered.

The majority of previous studies investigated the epidemic process in a static network, where the spreading of the disease is virtually frozen on the time scale of the contagion process. However, static networks are only approximations of the real interplay between time scales. Considering daily mobility patterns, no individual is in contact with all the friends simultaneously all the time. On the contrary, contacts are changing in time, often on a time scale that is shorter than the whole spreading process. Real contact networks are thus inherently dynamic, with connections appearing, disappearing, and being rewired with different characteristic time scales, and are better represented in terms of a temporal or time-varying network. Therefore, modeling the epidemic process on PT should be based on a time-varying contact network.

Although we focus on the contagion process through PT, passengers' social-activity (SA) contacts besides riding the same vehicles are not neglectable. In this study, two components of the contact network are considered: 1) a PEN that is designated to capture the interaction of passengers on PT, 2) and an SA contact network that captures all other interactions among people.

PT Passengers' encounter patterns have been studied by Sun et al. (2013) through an encounter network, which is an undirected graph with each node representing a passenger and each edge indicating the paired passengers that have stayed in the same vehicle. The network is constructed by analyzing the smart card data, which includes passengers' tap-in/tap-out time, location, and corresponding bus ID. Since PEN provides direct contact information of passengers, it is an ideal tool to investigate the epidemic spreading through PT.

Extending the work by Sun et al. (2013) , we propose a time-varying weighted PEN to model the epidemic process. We first evenly divide the whole study period into different time intervals t = 1, ..., T . The length of each interval is τ . For a specific time interval t, consider a weighted graph G t (N , E t , W t ), where N = {i : i = 1, .., N } is the node set with each node representing an individual, N is the total number of passengers in the system; E t is the edge set and W t is the weight set. The edge between i and j ( i, j ∈ N ), denoted as e t ij , exists if i and j have stayed in the same vehicle during the time interval t. The weight of e t ij , denoted as w t ij , is defined as w t ij = d t ij τ , where d t ij is the duration of i, j staying in the same vehicle during time interval t. By definition, we have 0 ≤ w t ij ≤ 1. The weight is used to capture the reality that epidemic transmission is related to the duration of contact.

In addition to contacts during rides on the PT, passengers may also contact each other during their daily social activities. Given the heterogeneity of passengers' spatial distributions, people may have various possibilities to contact with different people. However, capturing the real connectivity of passengers in social activities requires a richer dataset (e.g., mobile phone, GPS data), which is beyond the scope of this research.

In this study, we made the following assumptions to build the SA contact network.

• Global interaction: Passengers may interact with any other individuals in the system during a time interval of t with a uniform probability of θ g .

• Local interaction: Passengers with same origins or destinations of PT trips may interact with each other during time interval t with a uniform probability θ l . Since local interaction is more intense than global interaction, we have θ l > θ g .

For the global interaction, we assume that the contact time for all connected individuals is τ for a specific time interval if there are no PT and local contacts between them. Otherwise, the contact time should be subtracted by the PT and local contact duration (CD) at that time interval. For the local interaction, the contact time calculation is illustrated by the following example. Consider passenger i with PT trip sequence

where t k is the time when the passenger board or alight the vehicles. O i t k and D i t k are the trip origin and destination, respectively. The trip sequence is defined as a sequence of consecutive PT trips where every adjacent trip pair has an interval of fewer than 24 h (e.g., t 3 − t 2 < 24 h). We call the interval between two consecutive PT trips (e.g., [t 2 , t 3 ]) as activity time hereafter. Since passengers may not stay in the same place between two consecutive trips, we may have D i t2 = O i t3 . We further assume that from time t 2 to t 3 , the passenger spends half of the activity time at D i t2 and half of the activity time at O i t3 . Suppose passenger j has a trip sequence

)}, and D j t 2 = O i t3 ; and the overlapping time between intervals [t 2 , t 3 ] and [t 2 , t 3 ] are not zero. This means passengers i and j may have local contact because they have stayed in the same place D j t 2 = O i t3 (by definition, the probability of having local contact is θ l ). Recall that we assume that passengers spend half of the activity time at a specific origin or destination. If they have a local contact, then the CD between passengers i and j is calculated as half of the overlapping time between interval [t 2 , t 3 ] and interval [t 2 , t 3 ]. This calculation gives us the total CD of i and j at the local interaction level. For example, if t 2 < t 2 < t 3 < t 3 , the total local CD between i and j is 1 2 (t 3 − t 2 ). Analogizing to the PEN, the total local CD can be mapped to each time interval. For example, if t * is the time boundary for time interval t and time interval t + 1, and t * − τ < t 2 < t * < t 3 < t * + τ .

Denote the local CD between i and j for time interval t asd l,t ij (0 ≤d l,t ij ≤ τ ). Then we haved l,t ij = t * − t 2 andd l,t+1

We denote the SA contact network asG t (N ,Ẽ g,t ,Ẽ l,t ,W g t ,W l t ), whereẼ g,t is the edge set of global interaction;Ẽ l,t is the edge set of local interaction. The edge of global interaction between any i and j, denoted asẽ g,t ij , exists with probability θ g for all i, j ∈ N . When i and j share the same PT trip origins or destinations during time interval t, the edge of local interaction between i and j (ẽ l,t ij ) exists with probability θ l .W g t andW l t are the weight set for global and local interaction edges, respectively. By the discussion above, we havew l,t ij =d l,t ij τ for allw l,t ij ∈W l t andw g,t ij = 1 −w l,t ij − w t ij for allw g,t ij ∈W g t . By definition, the contacts from three sub-networks (local, global, and PT) are mutually exclusive.

To illustrate the proposed epidemic contact network, we present a five-passenger system with a single bus route (N = 5) in Figure 1 . We consider the time period from 7:00 to 10:00 with τ = 1 h. For illustrative purpose, we neglect the global interaction and set the local interaction insensitivity θ l = 1. The bottom of the graph shows the passengers trajectories along the bus route. In the time interval t = 1, at 7:30, passengers 1,2, and 5 board the bus; since they share the same origin and also are in the same bus during t = 1, we have edges of PEN (colored green) and edges of location interaction of SA contact network (colored orange). Accordingly, we have d 1 12 = d 1 25 = d 1 15 = 0.5 h. The weights are calculated as w 1 12 = w 1 25 = w 1 15 = 0.5/1 = 0.5. In the meanwhile, from the trajectories at t = 2, we noticed passengers 3 and 4 also share the same origin.

Thus, we also have an SA contact edge between 3 and 4 at t = 1.

The local CD for passengers 1,2, and 5 at time interval t = 1 isd l,1 12 =d l,1 25 =d l,1 15 = 1 2 × 0.5 = 0.25 h. The 1 2 comes from the assumption that these passengers only spend half of their time around this bus station (see Section 2.1.2). Hence, the corresponding weights for the SA contact network arew l,1 12 =w l,1 25 =w l,1 15 = 0.25/1 = 0.25. Similarly, we haved l,1 34 = 1 2 × 1 = 0.5 andw l,1 34 = 0.5. The weights for t = 2 and t = 3 are calculated in the same way. 

The epidemic transition model is independent of the network representation. We can model various infectious diseases based on the proposed PEN using different epidemic transition frameworks. For the case study we considered (COVID-19), we employed the Susceptible-Exposed-Infectious-Removed (SEIR) diagram in this study. The SEIR model is generally used to model influenza-like illness and other respiratory infections. For example, Small and Tse (2005) used this model to numerically study the evolution of the severe acute respiratory syndrome (SARS), which shares significant similarities with the COVID-19.

We first divided the population into four different classes/compartments depending on the stage of the disease (Anderson et al., 1992; Diekmann and Heesterbeek, 2000; Keeling and Rohani, 2007) : susceptibles (denoted by S, those who can contract the infection), exposed (E, those who have been infected by the disease but cannot yet transmit it or can only transmit with a low probability), infectious (I, those who contracted the infection and are contagious), and removed (R, those who are removed from the propagation process, either because they have recovered from the disease with immunization or because they have died).

By definition, we have N = S ∪ E ∪ I ∪ R, where N is the set of the whole population.

The diagram of the SEIR model is shown in Figure 2 . The diagram shows how individuals move through each compartment in the model. The infectious rate, β, controls the rate of spread and is associated with the probability of transmitting disease between a susceptible (S) and an exposed individual (E). The incubation rate, γ, is the rate of exposed individuals (E) becoming infectious (I). Removed rate, µ, is the combination of recovery and death rates. The SEIR model typically assumes the recovered individuals will not be infected again, given the immunization obtained. It is worth noting that this study focuses on the early stage of an epidemic process, where the impact of outside factors on N (e.g., birth and natural death) are not considered. For the epidemic process models, people are concerned about the steady-state, epidemic threshold and reproduction number. According to Pastor-Satorras et al. (2015) , the number of infected individuals in the SEIR model always tends to zero after a long term (see Figure 3a ). This is obvious from the diagram of SEIR (Figure 2) , where there is only one recurrent state R. The basic reproduction number, denoted by R 0 , is defined as the average number of secondary infections caused by a primary case introduced in a fully susceptible population (Anderson et al., 1992) . In the standard SEIR model, we have R 0 = β µ . Epidemic threshold, in many cases, is defined based on the value of R 0 . When R 0 < 1, the number of infectious individuals tends to decay exponentially; thus, there is no epidemic. However, if R 0 > 1, the number of infectious individuals could grow exponentially with an outbreak of epidemic (see Figure 3b ). 

The typical modeling of epidemic falls into two different categories: the individual-based approach and the degree-based approach. Generally, the individual-based approach models the epidemic transmission at the individual level while the degree-based approach captures the infection process at the group level, where each group includes a set of nodes (individuals) with the same degree. Since in the PEN, we can characterize the behaviors of human interaction at the individual level, the individual-based framework is used in this study.

We denote S i,t , I i,t , E i,t , and R i,t as the Bernoulli random variable that describes whether individual i is in class S, I, E, and R at time interval t, respectively (Yes = 1). By definition we have S i,t +I i,t +E i,t +R i,t = 1 for all i and t. Let P(X i,t = 1) = p X i,t , where X ∈ {S, I, E, R} and X p X i,t = 1. Since the contact network is defined in discrete time, we can describe the epidemic process of the SEIR model as a discrete Markov process with specific transition probabilities. To match with the epidemiological characteristics of COVID-19, we assume that the exposed individual can also infect others based on the recent finding (Rothe et al., 2020) , which may not be the common case in the SEIR model.

Let β I be the probability of a susceptible individual i ∈ S getting infected by an infected individual j ∈ I at a time interval t if i and j contact each other (either by PT or SA) for the entire time interval. Since the actual transmission probability is related to the interaction duration, we can write the actual probability of i getting infected by j (β I i,j,t ) as

where h(·, ·) is a function to describe the actual transmission probability with respect to CD. It can be a form of survival function (e.g., exponential, Weibull) or a linear function (i.e., h(w, β) = wβ, which is used in the case study). a t ij (ã l,t ij ,ã g,t ij ) is an indicator variable showing whether e t ij (ẽ l,t ij ,ẽ g,t ij ) exists. It is worth noting that a t ij is a known constant butẽ l,t ij andẽ g,t ij are random variables with Bernoulli distribution: a l,t ij ∼ B(L t ij θ l ) andã g,t ij ∼ B(θ g ), where L t ij = 1 if i and j share the same origin or destination at time interval t and L t ij = 0 otherwise (see Section 2.1.2 for details). Therefore, we have

Similarly, we define β E as the probability of a susceptible individual i ∈ S getting infected by an exposed individual j ∈ E at time interval t if i and j contact each other for the entire time interval (β E β I ). The actual transmission probability considering interaction duration is

Note that if i and j have been in contact, we assume the transmission probability only depends on the CD.

The variation of transmission probability due to spatial distribution is neglected. Capturing spatial factors requires dedicated transmission models (e.g., WellsRiley model (Wells et al., 1955) ) and an assumption of passengers' spatial distribution in a vehicle, which can be done in the future work.

Let γ be the probability of E → I, which is unrelated to the network. µ is the probability of I → R and The notations and epidemic transmission mechanism allow us to write the following system equations:

Calculating P(S i,t = 1, X j,t = 1) requires the joint distribution of S i,t and X j,t , which is usually unavailable.

According to the individual-based mean-field approximation, we can assume that the state of neighbors is independent (Hethcote and Yorke, 2014; Chakrabarti et al., 2008; Sharkey, 2008 Sharkey, , 2011 . Hence, this leads to

By plugging Eqs. 8 and 9 into Eqs. 4 and 5, we can get a new group of solvable system equations.

Different from the typical SEIR model, the proposed epidemic model with the individual-based PEN has two challenges. First, the infection rate in a typical SEIR model is defined at the population level (i.e., homogeneous network assumption). However, in the proposed framework, we consider one-to-one contagious behaviors at the individual level with heterogeneous contact networks. The heterogeneity is difficult to characterize by probabilistic models (e.g., degree distributions) because contact structures are known from the smart card data. Second, the proposed framework lies on a time-varying network, for which the contagious behaviors and interacted individuals vary over time.

One of the solution methods for Eqs. 4 -7 is simulation. Similar to many other complex stochastic process, simulation can output approximate values for p X i,t for all X ∈ {S, I, E, R} and t. The simulation process is described in Algorithm 1, wherep X t (X ∈ {S, I, E, R}) is the proportion of people in class X for time interval t. Initialization can assign some seed infectious people in the system. At each time step t, we calculate the one-to-one transmission probability β I i,j,t and β E i,j,t for each person in class S, where the time complexity is O(N 2 ). Therefore, the total time complexity of simulation is O(N 2 T ), where T is the total number of time intervals considered. The model also requires to store the network structures and individual states at each time step, where the space complexity is also O(N 2 T ).

Algorithm 1 Simulation-based solving algorithm

Assign I i,t = 1 with probability γ.

10:

Assign R i,t = 1 with probability µ = µ r + µ d

Let

14:

else 15:

Assign R i,t = 1, and

, 2, ..., T }, and X ∈ {S, I, E, R}.

Given the O(N 2 T ) of time and space complexity, the simulation-based solving framework is hard to scale up to the whole population level (4.7 million in our case study). According to the numerical results, N ≥ 300k would cause memory errors to a 32 GB RAM personal computer.

Considering computational costs, a scalable and lightweight theoretical model is proposed to handle the epidemic simulations. The theoretical model, though simplified from agent-level simulation, can retain the flexibility to capture the behavioral, mechanical, networked, and dynamical features of the simulation-based models.

The framework is separated into three steps. 1) We first build up a multi-particle dynamics model for the epidemic process to represent the individual-based model. 2) Considering properties of contact network and multi-particle dynamics (Gao et al., 2016) , an effective model is employed to represent the multi-dimensional dynamics (individual-based) into one-dimension (mean-based). 3) Previous effective models are developed for static network structure. To fit with the time-varying contact network, we innovatively combine the effective model with a temporal network model by adding energy flow into the equations, from which we can capture the impact of the time-varying contact networks on the entire dynamical system. For the multi-particle dynamics model, we focus on the early stage of the epidemic process, where the percent of susceptible people is almost 100%, and recovered people are 0%. Hence, we could use Taylor expansion to simplify the four-dimension (S,E,I, and R) individual dynamical epidemic process described in Eq. (4 -7) to two dimensions (E and I):

In this formula, we embedded the dynamical network structure into two tensors

These two tensors are non-negative, and each temporal slice of these tensors (i.e.,

are symmetric due to the property of infection. According to the previous studies (Gao et al., 2016; Tu et al., 2017) , the infectious burst in this canonical system could be captured by a one-dimensional simplification of the individual-based model. This simplification is based on the fact that in a network environment, the state of each node is affected by the state of its immediate neighbors. More details on the simplification can be found in Gao et al. (2016) and Tu et al. (2017) . Therefore, we can characterize the effective state of the system using the average nearest-neighbor activity:

where

is the effective proportion of exposed (infectious) people in the system at time interval t. If we assume that all individuals hold a uniform probability to come into contact with each other, p E eff,t and p I eff,t are good proxies forp E t andp I t , wherep E t andp I t are the actual proportion of exposed and infectious population (i.e.,

. However, this assumption may not hold in reality. The relaxation of the assumption will be described later. p E eff,t and p I eff,t allow us to reduce the individual-based equations (Eq. 10 and 11) to an effective mean-based equations:

where:

Considering that people's interaction probabilities are actually heterogeneous, in practice, to relax the uniform contact assumption, we further consider the dynamics of the mobility network on the multi-particle systems based on Li et al. (2017) , which recommends adding the energy flow f X t = (1/N 2 ) i,j∈N (β X i,j,t ) 2 (X ∈ {E, I}) into the general dynamical process:

are parameters to be estimated. The energy flow and corresponding parameters are expected to capture the heterogeneous contacts in the network.

The theoretical model is calibrated from a two-layer regression method. In the first layer, given a trajectory of epidemic process:

Eq. 17 and 18. This leads to a linear regression problem with a total of T samples:

where the only unknown parameters are K; β X eff,t and f X t are calculated from the constructed contact network, andp X t is given (X ∈ {E, I}). Therefore, K can be obtained for every given epidemic trajectory. The epidemic trajectory is generated using the simulation-based solving framework for a small sample size (e.g., 100k).

In the second layer, we aim to obtain the relationship between K and epidemic/mobility parameters (i.e.,

For every combination of Θ, we can use the simulation model to generate a trajectory and thus to estimate K (as we described above, the first layer regression). Therefore, based on different values of Θ, we can estimate a series of K. Then, we assume a linear relationship between Θ and K, and use a linear regression model to fit the relationship based on the generated Θ, K pairs. After the two-layer regression, the theoretical model can be used to flexibly predict the epidemic process under different policy conditions (i.e., different Θ, β E , or β I ).

The theoretical model can smoothly consider different sizes of the studied population by scaling effective exposed proportion p E eff,t , effective infectious proportion p I eff,t , and network energy flows f E t and f I t . Those variables can be directly extracted once the contact network is constructed. This model can also efficiently test different policy combinations with a low computational cost. According to the numerical test, it can evaluate one-million policy combinations for the full population (4.7 million) within seconds. The memory and computational complexity are both O(T ). This allows us to find the optimal policy to control the contagion.

In epidemiology, the basic reproduction number (expressed as R 0 ) of the infection can be viewed as the expected number of cases directly generated by one case in a population where all individuals are susceptible to infection (Fraser et al., 2009 ). The most important use of R 0 is to determine whether an emerging infectious disease would spread throughout the population. In a common infection model, if R 0 > 1, the infection starts to spread throughout the population, but not if R 0 < 1 (see Figure 3b ). In general, the larger the value of R 0 , the more difficult it is to control the epidemic (Fine et al., 2011) .

In the ideal SEIR model where diseases spread uniformly over time and people have a uniform contact probability, R 0 is easy to define. To match with the discrete-time expression in this study, letβ be the average number of people infected by one infectious person within one time interval in the ideal SIR system, andμ be the probability that the infectious people (I) are removed (R) within one time interval (Note that in continuous time context,β andμ represents infectious rate and removal rate, respectively). The basic reproduction number for ideal SEIR model is calculated as

where I t is the number of infectious people at time interval t. Note that It+1−It It = β, ∀t only holds under the ideal SEIR system.

In heterogeneous populations, the definition of R 0 is more subtle. The definition must take into account the fact that the contact between people is not uniform. One person may only contact a small group of friends and be isolated from the rest of the population. On the temporal evolving side, people's mobility patterns may vary every day, resulting in time-varying contact networks. This defeats ideal SEIR system assumptions. To consider the network heterogeneity Damgaard et al. (1995) , we define R 0 as "the expected number of secondary cases of a typical infected person in the early stages of an epidemic, which focuses on the expected direct infected population for each time step at the early stage

Let E t and I t be the number of exposed and infectious people at time interval t. Given a trajectory of epidemic process [(E t , I t )] t=1,...,T (either from the simulation model or the theoretical model), we define the equivalent reproduction number for time period T (R 0 (T )) as:

We assume that the incubation period 1 γ is much longer than a time interval, which holds true for most of the diseases (e.g., in our case study, 4 days 1 h). Therefore, E t+1 − E t in this study is a good proxy for The equivalent R 0 enables flexible and fair comparison of different epidemic processes with heterogeneous contact and time-varying networks. In the following sections, we will use the defined equivalent R 0 as a major epidemic measure for different policy discussions.

We use the Singapore bus system as a proximity to demonstrate the dynamics of epidemic spreading through a PT network based on the developed time-vary PEN approach. The time-varying PEN is constructed based on daily mobility patterns in the bus system and is then calibrated according to the epidemiological characteristics of COVID-19. A series of disease control policies are evaluated to exhibit the sensitivity of the developed PEN.

Singapore is a city-state country where inter-city land transportation is relatively small. This provides an ideal testbed to focus on epidemic spreading through intra-city transportation, especially for bus systems, which count for a high proportion of modes shared in Singapore (Mo et al., 2018; Shen et al., 2019) . According to Singapore Land Transport Authority (LTA) (2018), the average daily ridership of buses is around 3.93 million, accounting for almost half of all travel modes. There are more than 368 scheduled bus routes operated by four different operators. A total of approximately 5,800 buses are currently in operation. In the case study, the mass rapid transit (MRT) system is neglected because a) passengers' contacts in a bus are more conducive for epidemic transmission compared to the MRT system, given the limited space in a bus; b) smart card data can provide exact bus ID to identify the direct contact of passengers. The direct contact in trains is, however, difficult to obtain from smart card data because the transactions are recorded at the station level.

The smart card data used in this study are from August 4th (Monday) to August 31st (Sunday), 2014, with a length of four weeks. The dataset contains 109.2 million bus trip transaction records from 4.7 million individual smart cardholders. Given that the population of Singapore in 2014 is around 5.5 million, the smart card data is representative of the population (accounting for 84% of the population) and can model the epidemic spreading for the whole city. Figure 5 shows the hourly ridership distribution for one week (average of four weeks). The ridership of weekdays shows highly regular and recurrent patterns with morning (8:00-9:00 AM) and evening peaks (18:00-19:00). While the ridership distributions on weekends are different from those on weekdays, there are no prominent peaks observed. The usage of bus systems is related to daily activities, which represent mobility patterns of metro travelers in Singapore and can influence the epidemic spreading. Figure 6a shows the distribution of trip duration (P (T D)) in four weeks, where T D means trip duration. Most trips have a duration of fewer than 40 min.

From the inset of Figure 6a , we found the tail of P (T D) can be well characterized by an exponential function:

when T D ≥ 10 min, we have P (T D) ∼ e − T D λ td , where λ td = 11.94 min calculated by regression. As people tend to use MRT for long-distance travel, the duration of bus trips is relatively short. On average, the duration of bus trips is 14.55±12.51 min (mean±standard deviation). In summary, Singapore has an intense usage of bus systems with high ridership and user frequency, though the trip duration is relatively small. This implies that for highly infectious diseases that can be infected by short-term exposure, the bus system may play a crucial platform for the epidemic spreading.

As discussed in Section 2.1, the PEN and local interaction network (LIN) highly depend on passengers' mobility patterns and present time-varying properties. Figure 7 shows example networks of 100 passengers extracted from real-world data (7:00-10:00 AM). The length of time interval τ = 1 h is used in this study.

For better visualization, these passengers are chosen from the same bus, and θ l = 1 is used for LIN. Both The property of the contact network is essential for analyzing the epidemic spreading. Figure 8 summarizes the degree and CD distribution of PEN and LIN. Note that the global interaction network is, by definition, a simple random graph with homogeneous structures. Hence, we did not plot it. Given the time-varying properties of networks, we consider three different time intervals: morning peak (8:00-9:00 AM), noon off-peak (12:00-13:00), and evening peak (18:00-19:00). Figures 8a and 8b show that the degree distribution of PEN displays a power-law tail (P (k) ∼ k −λ k , where k is the degree), implying a significant degree heterogeneity. Most of the nodes are of low or medium degree. The number of super-nodes with a high degree is limited, and the maximum degree is bounded, which is reasonable given the limited capacity of buses. These properties are consistent with the findings in Qian et al. (2020) and indicate that PENs are a type of small-world network (Telesford et al., 2011) . Although the shapes of P (k) for different times are similar, the exact values are still time-dependent. On weekdays, P (k) for morning and evening peaks are similar but different from the off-peak curve. PENs in peak hours also have a larger degree of nodes. On weekends, however, the degree distributions in the three time intervals are similar. Figures 8c and 8d show the degree distribution of the LIN (θ l = 1× 10 −3 is used to correspond to the case study in the following sections). Similar to PEN distribution, a power-law tail is also observed. However, the degree distribution in LINs is less concentrated and shows noisy patterns for high degree distributions. We also find that other local CD values are nearly uniformly distributed. Since local interaction duration that does not equal 30 or 60 min indicates that the trip occurs or ends in this time interval, the uniform distribution implies a Poisson start and end time of bus trips within the time interval.

COVID-19, also known as 2019-nCoV, is an infectious disease caused by "SARS-CoV-2", a virus closely Figure 9 shows the number of confirmed (infectious), cured, and dead people from Jan 24 to Feb 20, 2020, in Wuhan. Up to Feb 20, there are more than 40 thousand confirmed COVID-19 cases. The total number of healed and dead patients is around 6,000 and 2,000, respectively. The sudden increase in confirmed cases on Feb 12 is due to the revision of diagnosis criteria (adding the cases of clinical diagnosis). (2020)). The inset plot is the zoom-in of the number of cured and dead people.

We select COVID-19 as the case study for the following reasons: a) COVID-19 is extremely contagious.

It is primarily spread between people via respiratory droplets from infected individuals when they cough or sneeze. According to CDC (2020), anyone who has been within approximately 2 m of a person with COVID-19 infection for a prolonged period (more than 1 or 2 min) is considered risky to get infected. Therefore, PT can play a significant intermediary for such a highly contagious disease. b) Even when authors are writing this article, COVID-19 is a big threat to global public health. Singapore is also experiencing the impact of COVID-19 (Ministry of Health (MOH), 2020). The case study of COVID-19 can provide disease control suggestions from the transportation side, which adds real-time value to this research.

The SEIR model parameters are chosen based on the epidemiological characteristics of COVID-19. Time from exposure to onset of symptoms (latent or incubation period) is generally between 2 to 14 days for COVID-19. Read et al. (2020) suggested setting the latent period as 4 days. We, therefore, have γ = 1 24×4 = 0.0104 (probability from E to I per h). According to Read et al. (2020) , the transmission rate of COVID-19 in the static SEIR model is 1.96 day −1 , which can be seen as the number of people that one infectious person can infect per day in a well-mixed network. Therefore, assuming one person, on average, has close contact with 100 others per day, we can calculate the hourly one-to-one infectious probability as β I = 1.96 24×100 = 8.17 × 10 −4 . Although recent studies show β E > 0 for COVID-19 (Rothe et al., 2020) , calibrating the exact value of β E is difficult due to lack of data. Since people in the latent period (group E) usually have extremely lower probability of transmission, we arbitrarily set β E = 0.01β I . We calculate µ r and µ d using data from Wuhan. Figure 10 shows the daily cure and death rate (number of cured/dead people per day divided by the total number of confirmed people on that day) in Wuhan. The reason for the high value on the first day may be the inaccurate data. From Figure 10 , we observe the average daily cure and death rate are approximately 1% at the early stage. Therefore, we can calculate the hourly cured and death probability as µ r = µ d = 0.01 24 = 4.17 × 10 −4 . Figure 10 : Daily cure and death rate in Wuhan (data sources: Ding Xiang Yuan (2020)) θ l = 1 × 10 −3 is used for the status quo analysis. This value is calculated as follows. Consider a community with 10,000 people, assume each person may have close contact with another 10 people on average per hour locally. We therefore have θ l = 10 10,000 = 1 × 10 −3 . The global interaction captures individual's probability of close contact with people outside his/her community. Given that the population of Singapore is around 5.6 million, we assume one person, on average, can closely contact 10 people per day globally. Then θ g = 10 5.6×10 6 ×24 = 7.44 × 10 −8 . Note that the number 24 in the denominator is used to get the hourly probability. Table 1 summarizes all parameters of the status quo analysis, which can be seen as the reference scenario.

The sensitivity analysis column indicates whether this value will be changed in the following policy analysis sections. 

We first calibrate the theoretical model using the generated epidemic dynamics from the simulation model. 320 cases with different combination of parameters (Table 2) for 100k sample passengers are simulated. These cases are fed into a regression model to obtain the parameters for the theoretical model. Figure 11 shows the comparison of the number of infectious and exposed people between the simulation and the calibrated theoretical model. We observe a high goodness-of-fit for the theoretical models, which implies the proposed theoretical framework can capture the epidemic spreading through PT and SA contacts. Figure 12 shows the comparison of the number of infectious and exposed people by time for the five selected cases. Generally, the simulation and theoretical models show a similar number of infectious people over time. For the number of exposed people, the two models show similar dynamic fluctuations with only a slight difference for some periods. Human mobility θ l [1 × 10 −3 , 1 × 10 −2 , 1 × 10 −4 , 0] θ g [7.44 × 10 −8 , 1 × 10 −7 , 1 × 10 −9 , 0] (a) Comparison on infectious people (b) Comparison on exposed people Figure 11 : Comparison between simulation model and calibrated theoretical model (100k sample passengers). One dot represents # of infectious/exposed people in a specific time interval.

(a) Comparison on infectious people (b) Comparison on exposed people Figure 12 : Comparison of infectious/exposed people by time for 5 cases (100k sample passengers)

Based on the parameters in Table 1 , we evaluate the epidemic process in Singapore for all smart cardholders (4.7 million) using the calibrated theoretical model. Figure 13 shows the dynamics of the number of infectious and exposed people. We randomly assign 30 initial infectious passengers in the system. Results

show that if there are no control policies for the disease, the number of infectious people will increase to more than 3,000 (100 times the initial value) after four weeks. This is consistent with Liu et al. (2020) 's results on early human-to-human transmission of COVID-19 in Wuhan. The equivalent R 0 is 2.67, which is consistent with many previous estimates: 2.03.3 (Majumder and Mandl, 2020); 2.6 (Imai et al., 2020); 2.92 (Liu et al., 2020 ).

The inset plot shows the intra-day dynamics of the number of exposed people in week 4, from which we observe the high intensity of infections from 7:00 to 22:00. The sudden increases in morning and evening peaks for weekdays (Day 22-26) highlights the transmissibility through PT. During the weekends (Day 27

and 28), the number of exposed people shows a decreasing trend, which implies lower transmission rates on weekends. Figure 13 : Epidemic process in status quo scenario (whole population). The inset plot shows the zoom-in of the number of exposed people from Day 22 (Monday) to Day 28 (Sunday).

Motivated by current epidemic control strategies worldwide in PT systems, especially the control policies of COVID-19 exemplified in Appendix A, we can hardly find the general criteria to determine whether, when, and where to suspend the urban PT services. The time-varying weighted PEN developed in this work contributes to better understanding of the spatiotemporal impacts of various PT operation strategies for epidemic control, to facilitate the decision-making process for PT operation adjustments.

We first evaluate the impact of β I and µ. β I is related to people's preventive behavior, such as wearing masks and sanitizing hands, which results in a decrease of β I . µ is related to the hospital's medical behavior, such as increasing cure rate and developing vaccines, which leads to an increase in µ. Figure 14 shows the impact of β I and µ on equivalent R 0 . β I is scaled from 10 −2 to 10 1 and µ is scaled from 10 −1 to 10 2 . We fixed β E = 0.01β I for all testing. Figure 14b shows that the epidemics would fade out (equivalent R 0 < 1) if transmissibility was reduced to less than 10 −1 of current value. However, Figure   14c suggests that even if the cure rate is increased by 100 times, the epidemics would still happen, though the process would be lagged (with smaller R 0 ). This implies reducing transmissibility is more effective than increasing the cure rate for COVID-19. In Figure 14a , we show the joint impact of β I and µ and the critical bound of equivalent R 0 . If β I was decreased to 32% and µ was enlarged tenfold, the equivalent R 0 would be decreased to less than 1. If the cost of controlling each parameter is given, this graph can help to optimize the controlling strategies with limited costs. 

One of the typical control strategies for the epidemic is decreasing the trip occurrence rate in a city.

At the average level, this is equivalent to reducing the total contact time, and total squared contact time 1 . Figure 15 shows the impact of different control percentage of trips (i.e., reducing the percentage of total contact and squared contact time). We observe that controlling one trip (PT or local or global) cannot eliminate the epidemic. Based on current parameter settings, reducing PT trips contribute more to the control of the epidemic process than the other two. The impact of reducing trips to R 0 is generally linear unless the reduction percentage is sufficiently large. When all trips are reduced by more than 80%, the reduction rate for R 0 starts to accelerate. When all trips were decreased by 98%, the spreading process would fade out, which implies travel control can only be effective at an extreme level. This is corresponding to Wu et al. (2020) 's statement that a 50% reduction in inter-city mobility in Wuhan has a negligible effect on the COVID-19 epidemic dynamics. Figure 16 shows the influence of distributing departure time with different flexibility (from 0 to ±110 min). Note that the benchmark equivalent R 0 is 1.763 for the sample passengers situation, rather than 2.67 as the whole population situation because fewer passengers in the system can reduce the human contacts and limit the epidemic process. We observe a decline in equivalent R 0 as the departure time flexibility increases. This is because higher flexibility allows more dispersed riding on the bus; thus, fewer contacted passengers. However, the effectiveness of distributing passengers is very limited. With ±110 min flexibility, there is only a 1.6% decrease in R 0 (from 1.763 to 1.740). 

As what has been summarized in Section 1, the closure of bus routes is an in-practice implemented strategy from the transportation side to reduce people's close-contacts during the outbreak of COVID-19.

We assume that the suspension of bus service is a sign of travel restrictions on the corresponding bus routes.

While keeping the SA contact network the same, we evaluate four different strategies of closing various percentages of bus routes (from 10% to 90%): a) Close from high demand to low demand routes (H-L). b)

Close from low demand to high demand routes (L-H). c) Close by randomly picking bus routes (Random). d) Close by different local planning areas. We assume that passengers who originally take these closed bus routes will change to alternative routes if available; otherwise, they will cancel their trips. Again, given the computational burden, we evaluate this policy for sample passengers. high-demand bus routes can reduce the equivalent R 0 by 9.2%.

It is also important to look at how many passengers are affected by each strategy. The affected passengers are defined as those who cannot find alternative bus routes when the original routes are closed. As expected, for the same closing percentage of bus routes, the H-L strategy affected more passengers than the other two policies. However, we find that the H-L strategy is more effective in terms of R 0 reduction per affected passengers. If we fix the percentage of affected passengers at approximately 22% (black dashed line), the H-L strategy can reduce R 0 to 1.63, while random and L-H strategies can only reduce R 0 to 1.68 and 1.69, respectively. This may be because passengers in high demand bus routes are more influential (e.g., with a higher degree in the PEN) in the system. Therefore, to control the epidemic with fewer people getting affected, PT agencies should close buses from high to low demand routes. Figure 18a shows the percentage of reduction in R 0 resulting from the closure of bus routes by planning areas, while Figure 18b shows the corresponding percentage of affected PT passengers. Generally, the high reduction in R 0 is the result of a high number of affected passengers. Closing bus routes in the main business and residential areas in the Southern part of Singapore Island leads to higher controlling effects of epidemics than closing other areas. However, to minimize the impact on passengers' daily travel, PT agencies should first close bus routes in regions with relatively high R 0 reduction and a low number of affected passengers, such as core central business district (CBD) areas. Passengers who take buses crossing core CBD areas can easily find alternative routes. Thus, the concentrated demands at CBD areas can be distributed to other less crowded routes, which leads to a R 0 reduction and less number of affected passengers. 

Although closing bus routes can postpone the epidemic spreading, it also brings huge inconvenience to society, for example, decreasing people's accessibility to hospitals. A moderate alternative way is preserving the PT supplies but limiting the maximum bus load to reduce passengers' interaction. Figure 19 shows the impact of this policy. Since we use 100k sample data, the maximum bus load is relatively small. Hence, we test the maximum bus load from 10 to 2. Passengers who cannot board the bus due to this policy are assumed to cancel their trips (these passengers are called affected passengers). To compare with closing bus routes strategies, the x-axis is set as the percentage of affected passengers. Only the H-L strategy is plotted as it is the most effective one among all closing bus routes strategies. We find the policy of limiting the maximum bus load can only take effects when the available maximum bus load is very small. With the same percentage of affected passengers, it is not as effective as closing bus routes from high demand to low demand. However, since this policy preserves the city's mobility capacity, it can be seen as a more moderate way to control epidemics compared with directly closing bus routes. Figure 19 : Impact of limiting maximum bus load (100k sample passengers). MBL: maximum bus load. CP: close percentage of the H-L strategy 3.5.6. Impact of isolating critical passengers Ideally, a more precise pandemic policy goes into the individual-level. The government could find out those influential passengers who are potential to spread viruses considerably and get them isolated at an early stage. Individual-based policies can outperform region-based or population-level policies in effectiveness and flexibility. However, due to the computational cost, it is hard to optimize the isolation options for each individual directly. Hence, we employed an isolation method based on k-core decomposition. Different from traditional degree-based methods, k-core method shows higher impacts on the dynamics of multi-particle systems (Kitsak et al., 2010; Morone et al., 2019; Borge-Holthoefer and Moreno, 2012; Yang et al., 2017) .

A k-core of a graph G is a maximal connected subgraph of G in which all vertices have a degree of at least k. Each vertex in a k-core is connected to at least k other nodes (i.e., has a degree of at least k). A high k number represents the highly concentrated structure of the local network, which indicates the most clustered part in the whole network. Nodes in a core with larger k usually have larger degrees on average (Dorogovtsev et al., 2006) . In the context of infectious diseases, if one node in a high k-core is infected, it has, on expectation, > k 2 times chances to spread the disease to other nodes in that core in one-time step compared to an arbitrary node (if the network is under scaling law) (Serrano and Boguná, 2006) . Therefore, we designed the policy by first limiting the nodes in high k-cores, which means limiting the more influential nodes. The population in a higher k-core is always lower than that in a lower k-core. Thus, by applying this policy, we can isolate a small portion of people to limit the spreading of the disease. Figure 20 shows the impact of isolating passengers in different k-cores and the corresponding number of isolated passengers. For comparison purposes, we also evaluate a random policy. For each k-core, the random policy isolates the same number of randomly picked passengers in the system, which corresponds to implementing isolation at the population level. Since the number of passengers with a core number greater than 5 is low (in the 100k sample passengers case), the reduction in R 0 is not significant. However, isolating all 4-core passengers, which accounts for 5% of the whole population, the equivalent R 0 is reduced from 1.76 to 1.66 (5.7%), which shows higher effectiveness than any other region-based or route-based policies in Section 3.5.4. We also observe that the k-core isolating method can outperform the benchmark random isolating method. Figure 20 : Impact of isolating critical passengers (100k sample passengers). "base" and "k-core" scenarios indicate no isolation and isolation of people of core number ≥ k in the PEN, respectively.

This paper proposed a general time-varying weighted PEN to model the spreading of infectious diseases over the PT system. The social-activity contacts at both local and global level are also considered. The network is constructed using smart card data as an undirected graph, in which a node refers to a PT passenger; an edge refers to a pair of passengers staying in the same vehicle; the weight of an edge captures the CD. We employ the SEIR diagram-a general diagram to model the influenza-like disease-to model the disease dynamics using the recent global outbreak of COVID-19 as a case study. A scalable and lightweight theoretical framework is derived to capture the time-varying and heterogeneous network structures, which enables to solve the problem at the whole population level with low computational costs.

We use the PT smart card data from Singapore as a proximity to understand the general spatiotemporal dynamics of epidemic spreading over the PT networks in one month. The status-quo analysis shows that the COVID-19 infected population is expected to increase by 100 times of their initial value by the end of the month without any disease control enforcement. A series of disease control and prevention scenarios are envisioned from both public health policy and PT operations sides. From the public health side, the model sheds light on people's preventative behavior. Wearing face masks and sanitizing hands, are considered the most effective measures to control the spreading of the epidemic; however, an increased cure rate can only postpone the outbreak of the disease. From the perspective of PT operation adjustments, several policies are evaluated, including reducing trip occurrences, enlarging departure time flexibility, closing bus routes, limiting maximum bus loads, and isolating critical passengers. In general, the control of the epidemic process starts to take effect, with over 80% of all trips being canceled. The equivalent R 0 can be reduced to 1 if over 98% of trips are banned. In terms of bus operation policies, distributing departure times and limiting maximum bus loads can slightly decelerate the spreading process. Closing high-demand bus routes, especially in the main business areas, is more effective than the closure of low-demand bus routes. The most effective approach is isolating influential passengers at the early stage, in which the epidemic process can be significantly reduced with a small proportion of people being affected.

Many policy implications can be derived from the case study. On the public health side, the government should encourage people to take preventative behaviors, such as wearing face masks and sanitizing hands, to reduce the transmission probability. The travel restriction policy can take effect-with an equivalent R 0 less than 1-only at the extreme travel, such as in Hubei Province, China, where all travels were banned during the outbreaks of COVID-19 in Feb 2020.

On the PT operation side, according to our models, partial closure of bus routes and limiting the maximum bus load can postpone the spreading of epidemics. The most effective way is closing bus routes with high demands, especially those crossing the CBD areas. In practice, a (partially) shutdown of PT services is a serious decision for authorities, many related issues, such as equity and accessibility, should be considered to determine how to design the suspension of PT services in the pandemics. For prevention purpose, if possible, the government could identify the influential passengers with large core numbers based on smart card data, and suggest them to isolate themselves or reduce travels. Similarly, all entities should cancel events with a high number of participants to avoid generating large k-core contact networks.

Several limitations of this study are as follows: 1) Some parameters of the model (e.g., θ l , θ g ) are determined by the authors' assumptions, which hurts the credibility of the results. Although we end up with a reasonable R 0 , suggesting the values of these parameters are reasonable, more parameter calibration jobs should be done in the future. 2) Policy evaluations are based on some ideal assumptions. In the real world, many unexpected results can happen. For example, closing bus routes may decrease people's accessibility to the hospital, thus, decrease the cure rate. The government should think cautiously from multiple perspectives before applying any control strategies. 3) This study did not model the contacts of passengers at trains due to difficulties in identifying the vehicles that passengers belong to. Future research can incorporate a transit assignment model (e.g., Zhu et al., 2017; Mo et al., 2020) to infer passengers' boarding trains and construct the PEN by trains. Meanwhile, due to the large space in a train, the variation of transmission probability due to passengers' spatial distribution should also be captured. 4) Other modes of transmission than contact transmission are not considered (e.g., infectious passengers contaminating surfaces), which may result in under-estimation of transmissibility of PT systems. 5) Population infected heterogeneity is neglected in this study. In reality, the infectious probability may depend on age, gender and health conditions. The demographics distribution can be incorporated in the future.

Future works include the following: 1) Elaborate on the SA contacts based on other data sources (e.g., mobile phone data) and extend N to the whole population. Due to a lack of data, the contacts of social activities are simplified and N is assumed to be PT users in this paper. Though PT users in Singapore account for 84% of the population, future research can combine different data sources to model the SA contacts for the whole population in more detail (Wang et al., 2018) . 2) Incorporate spatial effect and model transmission probability more finely. The current transmission probability between two individuals only depends on the contact duration. The contact distance, passenger density, and distribution on a vehicle can be considered in future research. 3) Conduct case studies in cities with COVID-19 outbreaks (e.g., Wuhan, New York City) to validate the model. These case studies can calibrate the model based on ground truth data, quantify the contribution of disease transmission by PT systems, predict the epidemic spreading, and evaluate the effects of different policies. 4) Incorporate the time-varying epidemic and mobility parameters to better predict the reality. Although this study does not attempt to predict and reproduce the COVID-19 spreading, the proposed model can potentially better fit the epidemic process, given the more fine-grained framework.

However, the complexity in the real-world lies on the time-varying mobility and epidemic patterns. Future research can make the epidemic parameters (Θ) as time-dependent (Θ(t)), instead of constant, to better fit the reality.

The authors confirm contribution to the paper as follows: study conception: B. Shen. All authors reviewed the results and approved the final version of the manuscript.

The research is sponsored by the Natural Science Foundation of China (71901164) and the Natural Science In practice, since the outbreak of COVID-19 in late January 2020, a variety of epidemic control strategies in PT systems, such as the requirement of PT riders to wear face masks, the sterilization of bus and metro carriages, the adjustment of PT operation schedules, the closure of bus routes, etc., have been implemented during the outbreak of COVID-19.

In China, the requirement of wearing face masks in PT systems has been successively implemented in many provinces since late January of 2020. In addition, a variety of PT operation control strategies have also been enforced in many cities. In cities of Hubei Province, especially in Wuhan, along with the lockdown policies to control the spreading of COVID-19, almost all PT services have been shut down since Jan. 23rd and 24th, whereas the patients with severe symptoms are transported by ambulance. After the lockdown and travel restrictions of Hubei Province, the PT operation adjustment strategies implemented in other Chinese cities were largely diverse. Different from Wuxi with only 8% of arterial bus routes kept running, in Nanjiang, another major city of Jiangsu Province, the PT services were still in operation but with shortened operation hours and dispatching frequencies. In Shanghai, both inter-provincial PT services and the bus services between rural districts like Qingpu and Jinshan were closed from Jan. 27th, but most of the urban PT services remained in operation by limiting the maximum passenger loads.

Outside China, in Italy, where the reported COVID-19 cases dramatically increased in March 2020, the suspension of PT has been officially proposed in the Lombardy area, where the urban PT service

The Transport for London (TfL) is running reduced service across the network in London, closing up to 40 stations. No services of Waterloo and City line was provided since March 30, 2020. Many US cities have seen reduced services, including Boston and Washington D.C., In cities of other countries with fewer COVID-19 reported cases

Infectious diseases of humans: dynamics and control

Multiscale mobility networks and the spatial spreading of infectious diseases

Absence of influential spreaders in rumor dynamics

Identifying critical components of a public transit system for outbreak control

Modeling the spread of infection in public transit networks: A decision-support tool for outbreak planning and control

guidance for risk assessment and public health management of healthcare personnel with potential exposure in a healthcare setting to patients with

Epidemic thresholds in real networks

The role of the airline transportation network in the prediction and predictability of global epidemics

Predicting and containing epidemic risk using friendship networks

Social and sexual function following ileal pouch-anal anastomosis

Mathematical epidemiology of infectious diseases: model building, analysis and interpretation

Covid-19 real-time data

K-core organization of complex networks

How mobility patterns drive disease spread: A case study using public transit passenger card travel data

herd immunity: a rough guide

Pandemic potential of a strain of influenza a (h1n1): early findings

Universal resilience patterns in complex networks

Generalized reproduction numbers and the prediction of patterns in waterborne disease

Understanding individual human mobility patterns

Discovering the hidden community structure of public transportation networks

Gonorrhea transmission dynamics and control

Statistics brief -World metro

Comparing different approaches of epidemiological modeling

Stochastic dynamics. Modeling Infectious Diseases in Humans and Animals

Identification of influential spreaders in complex networks

Public transport utilisation: Average daily public transport ridership

The fundamental advantages of temporal networks

Reactive school closure weakens the network of social interactions and reduces the spread of influenza

Investigating physical encounters of individuals in urban metro systems with large-scale smart card data

Time-varying transmission dynamics of novel coronavirus pneumonia in china

Early transmissibility assessment of a novel coronavirus in wuhan, china. China

Modelling cholera epidemics: the role of waterways, human mobility and sanitation

Public transportation and sustainability: A review

2020. Past update on covid-19 local situation

Capacity-constrained network performance model for urban rail systems

Impact of built environment on first-and last-mile travel mode choice

The k-core as a predictor of structural collapse in mutualistic ecosystems

Epidemic processes in complex networks

Random walks and search in time-varying networks

Scaling of contact networks for epidemic spreading in urban transit systems

Novel coronavirus 2019-ncov: early estimation of epidemiological parameters and epidemic predictions

Transmission of 2019-ncov infection from an asymptomatic contact in germany

A high-resolution human contact network for infectious disease transmission

Percolation and epidemic thresholds in clustered networks

Deterministic epidemiological models at the individual level

Deterministic epidemic models on contact networks: Correlations and unbiological terms

Built environment and autonomous vehicle mode choice: A first-mile scenario in singapore

Small world and scale free model of transmission of sars

Simulation of an seir infectious disease model on the dynamic contact network of conference attendees

Efficient detection of contagious outbreaks in massive metropolitan encounter networks

Understanding metropolitan patterns of daily encounters

The ubiquity of small-world networks

Collapse of resilience patterns in generalized lotka-volterra dynamics and beyond

Inferring metapopulation propagation network for intra-city epidemic control and prevention

Airborne contagion and air hygiene. an ecological study of droplet infections. Airborne Contagion and Air Hygiene. An Ecological Study of Droplet Infections

Nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study

How far droplets can move in indoor environmentsrevisiting the wells evaporation-falling curve

Small vulnerable sets determine large network cascades in power grids

The transmissibility and control of pandemic influenza a (h1n1) virus

A probabilistic passenger-to-train assignment model based on automated data