key: cord-0168615-zoxl871o authors: Guo, Xiaotong; Wang, Shenhao title: Transit Frequency Setting Problem with Demand Uncertainty date: 2022-04-27 journal: nan DOI: nan sha: 845c652200cba7568959cef3448f67453c750b62 doc_id: 168615 cord_uid: zoxl871o
Public transit systems are the backbone of efficient and sustainable urban mobility systems in the era of urbanization. This paper proposes a transit frequency setting model for a single transit line to generate transit schedules that could better serve passengers. The proposed model optimizes both service patterns and frequencies for transit routes during a service period, and crowding levels on transit vehicles are also considered. To handle demand uncertainty when designing transit schedules, both stochastic programming and robust optimization techniques are introduced to protect transit schedules against inaccurate demand estimates. To address the computational complexity of both extended models under large-scale demand matrices, a Benders decomposition algorithm and two dimension reduction techniques are designed. The proposed models will be tested with real-world data from the Chicago Transit Authority (CTA).
The past century has witnessed one of the most dramatic evolutions in human history, urbanization, and more than half of the world's population now lives in urban areas. By 2050, over two-thirds of the world's population will live in urban areas [1]. Urban mobility, defined as moving people from one location to another within or between urban areas, is critical to the functioning of people's everyday lives, allowing people to access housing, jobs, and recreational services. On the other hand, the transportation sector is the largest contributor to greenhouse gas emissions in the United States, accounting for over 27% of the total greenhouse gas emissions [2]. Therefore, an efficient and sustainable urban mobility system is necessary to support future urban development.
Although emerging urban mobility services, e.g., ride-hailing and bike-sharing, have offered people various options when traveling, the public transit system remains the backbone of a sustainable urban mobility system, allowing large numbers of people to move across cities more efficiently. Meanwhile, the public transit system provides affordable travel options for everyone regardless of travel distances within cities. Therefore, designing a public transit system with a good level of service and operating it efficiently is critical for transit agencies.
The COVID-19 pandemic has had an enormous impact on public transit systems: national public transportation ridership stayed around 60% of the pre-pandemic level at the beginning of 2022 [3]. One of the main driving forces for the ridership drop is the flexible or remote working adopted by many employers worldwide during the pandemic. Moreover, remote working will not be merely a temporary strategy for companies, as the US is projected to have an average of 30% of paid full days worked from home in the future, compared to a 5% pre-pandemic level [4]. Remote working implies that a proportion of commute trips on transit will be lost permanently, which motivates transit agencies to redesign their transit networks and schedules. Also, transit demand has become more volatile, and predicting future demand can be challenging.
While transit networks have been developed over many years and are hard for transit agencies to change within a short period of time, changing transit schedules is straightforward. In this paper, we focus on the transit frequency setting problem, where transit schedules are optimized given a set of transit stops to serve. However, few papers on the transit frequency setting problem take uncertainty into consideration [5], even though it could have profound impacts on the level of service of transit systems.
To handle demand uncertainty for transit systems, especially during the post-COVID period, a baseline transit frequency setting model for a single transit line is first proposed in this paper, followed by the introduction of two techniques for considering uncertainty within decision-making processes: Stochastic Programming (SP) and Robust Optimization (RO). Furthermore, algorithms are proposed to efficiently solve both extended models. Overall, the contributions of this paper can be summarized as follows:
• A nominal transit frequency setting model for a single transit line is proposed and an extended model considering crowding levels is formulated.
• A stochastic transit frequency setting model is proposed and a Benders decomposition algorithm is designed to solve the problem efficiently.
• A robust transit frequency setting model is proposed. To make the model tractable given large-scale demand matrices, a model simplification technique and a stop consolidation algorithm are introduced.
The remainder of the paper is organized as follows. Section 2 reviews the relevant literature. Section 3 describes the nominal, stochastic and robust transit frequency setting models and solution algorithms. Finally, Section 4 recaps the main contributions of this work, outlines the limitations, and provides future research directions.
The design and planning of urban public transit systems consist of a series of decisions made before operating the system, which is known as the Transit Network Planning (TNP) problem. In the literature, TNP is commonly divided into sub-problems that range across strategic, tactical, and operational decisions, including Transit Network Design (TND), Frequency Setting (FS), Transit Network Timetabling (TNT), the Vehicle Scheduling Problem (VSP), the Driver Scheduling Problem (DSP), and the Driver Rostering Problem (DRP). A thorough review of TNP and its subproblems can be found in [5, 6, 7].
The transit frequency setting problem is defined as the problem of determining the number of trips for a given set of lines so as to provide a high level of service in a planning period. The transit frequency setting problem was first studied by Newell [8] using analytic models. Given a fixed number of vehicles and a given passenger arrival rate, Newell [8] produced vehicle dispatching times in order to minimize the total waiting time of all passengers. He concluded that the optimal dispatch frequency (the reciprocal of the headway) should vary approximately as the square root of the passenger arrival rate. His proposed model assumes fixed passenger demand and overlooks vehicle capacity constraints. Furth and Wilson [9] formulated the transit frequency setting problem as a non-linear program that computed the optimal headways for bus routes in order to maximize the net social benefit, consisting of ridership benefit and wait-time savings. The constraints incorporated in their model were total subsidy, maximum fleet size, and acceptable level of loading. A key assumption they made was responsive demand, modeled as a function of headway. Furthermore, a heuristic-based algorithm was designed to solve the non-linear programs.
More recently, Verbas and Mahmassani [10] extended the model proposed by Furth and Wilson [9] by incorporating service patterns into transit routes. A service pattern corresponds to a unique set of stops that need to be served by transit vehicles along a transit route. They formulated two non-linear optimization problems with different objectives: i) maximize the number of riders and wait-time savings, and ii) minimize the net cost. Nonlinear optimization solvers were directly used to solve the non-linear programs. Additionally, Verbas et al. [11] discussed the impact of demand elasticity on solutions from a transit frequency setting model similar to the models proposed by Furth and Wilson [9] and Verbas and Mahmassani [10].
They introduced three methodologies for estimating demand elasticity within transit networks and solved the transit frequency setting problem under multiple demand elasticity scenarios on a large-scale network. Although the impact of demand uncertainty is discussed in their paper, the proposed methods cannot generate optimal schedules that consider demand uncertainty explicitly.
One could argue that one of the modeling contributions of formulations based on Furth and Wilson's [9] model is the introduction of responsive demand. However, the authors claim that it is more reasonable to consider a fixed demand matrix when solving the transit frequency setting problem. There are short-term and long-term objectives in the transit frequency setting problem: i) minimizing wait times for existing passengers, and ii) attracting more passengers to use transit networks. Minimizing wait times for existing passengers leads to an increase in service level, which in turn attracts more passengers to take transit. In contrast, maximizing ridership under responsive demand could lead to a waste of resources, given that demand might take weeks to respond to service changes. Also, transit schedules can be modified monthly or quarterly by solving a new transit frequency setting problem with an updated demand matrix. Therefore, minimizing wait times for existing or predicted passengers is a better objective in the authors' opinion.
Although few papers take demand uncertainty into consideration when setting transit frequencies, Li et al. [12] utilized stochastic programming techniques to solve the headway optimization problem for a single bus route considering random passenger arrivals, boarding, alighting, and vehicle travel times. A metaheuristic algorithm consisting of a stochastic simulation and a genetic algorithm was designed to solve the proposed model.
Their proposed approach was compared with three traditional headway determination models, and incorporating both demand and travel time uncertainty improved model performance. The main critique of Li et al.'s [12] work is the lack of discussion of the optimality gap, given that the solution algorithm is heuristic-based. In this paper, we will propose a stochastic transit frequency setting model with a Benders decomposition algorithm which can be used to find optimal solutions efficiently.
There are two widely used approaches for decision-making under uncertainty in the Operations Research (OR) domain: stochastic programming [13] and robust optimization [14]. For the stochastic programming approach, the most traditional method is Sample Average Approximation (SAA), where the true distributions of uncertain parameters are approximated by empirical distributions obtained from the data [15]. On the other hand, robust optimization and its data-driven variants [16] are another option for handling uncertain parameters effectively. The underlying idea of robust optimization is to specify a range for an uncertain parameter, namely an uncertainty set, and optimize over the worst-case realizations within the bounded uncertainty set. The solution method for robust optimization problems involves generating a deterministic equivalent, called the robust counterpart. A practical guide to robust optimization can be found in [17].
Urban mobility systems have various sources of uncertainty brought by human behaviors and environmental impacts (e.g., weather). Considering uncertainty when designing and operating urban mobility systems is important and necessary. There are several applications of SP or RO techniques to urban mobility problems. For transit systems, Yan et al. [18] proposed a robust framework for solving the bus transit network design problem considering stochastic travel times. Mo et al.
[19] utilized the robust optimization technique to solve the individual path recommendation problem under rail disruptions considering demand uncertainty. For shared mobility systems, Guo et al. [20] formulated a robust matching-integrated vehicle rebalancing model to balance vacant vehicles in ride-hailing operations given demand uncertainty. Guo et al. [21] extended the matching-integrated vehicle rebalancing model proposed by Guo et al. [20] by introducing the predictive prescriptions approach [22] to handle demand uncertainty, which is an advanced approach for handling data uncertainty based on the stochastic optimization framework.
We consider the problem of setting optimal transit frequencies, for both rail and bus services, over an urban transit line with a sequence of $N$ stops $S$. Without loss of generality, we assume each bi-directional transit line is treated as two separate transit lines with distinct sets of stops. For an urban transit line, there exists a set of service patterns $P$, where each pattern $p \in P$ consists of a subset of stops $S_p \subseteq S$. Common examples of patterns are short-turnings and limited-stop lines in bus operations. Let $V$ represent the set of vehicle types which can be operated on the transit line. For instance, $V$ = {standard bus, articulated bus, minibus} includes three types of buses, and $V$ = {four-car train, six-car train, eight-car train} consists of three types of trains with different numbers of carriages. For each vehicle type $v \in V$, the number of seats is $C_v$ and the maximum vehicle capacity is $\bar{C}_v$. Furthermore, we discretize the full planning period into time periods $t = 1, \ldots, T$, where each time interval $t$ has an identical length $\Delta$. Let $x^{p,v}_t \in \{0, 1\}$ indicate whether a vehicle of type $v \in V$ operating on a pattern $p \in P$ departs from the start of pattern $p$ at the beginning of time interval $t$. In real-world transit operations, a line cannot have too many operating patterns.
Therefore, we impose a sparsity constraint on operating patterns. We introduce an auxiliary decision variable $y_p$, $\forall p \in P$, where $y_p = 1$ indicates that pattern $p$ can be operated on the transit line. Let $\bar{P}$ represent the maximum number of patterns operated on a single transit line; the sparsity constraint can then be formulated as $\sum_{p \in P} y_p \le \bar{P}$ (1).
Let $c_{p,v}$ stand for the cost parameter associated with operating a vehicle of type $v$ on a pattern $p$. The budget for scheduling transit services over the transit line is represented by $B$. The set of feasible schedules is defined by Equation (2), whose feasibility constraints ensure that the total scheduled transit services do not exceed the budget $B$ and that only one type of vehicle can be operated on each pattern during each time interval $t$. $\sum_{v \in V} x^{p,v}_t = 0$ implies that no service with pattern $p$ is scheduled during time $t$. It is worth noting that a general budget constraint is imposed in Equation (2) and can be modified to incorporate more complicated cases. For instance, the budget constraint can be adapted to the constraints in Equation (3) if we have a limited number of vehicles for each vehicle type, where $B_v$ is the number of available vehicles for each vehicle type $v$ and the cost parameter , d, t) . Then, we have the following Integer Linear Programming (ILP) formulation for setting the optimal frequency for an urban transit line:
The objective function (4a) minimizes the total generalized journey time for passengers who take transit services and the penalty for unsatisfied passenger flows. $\gamma$ is a weight parameter controlling the relative importance of waiting time and in-vehicle travel time. $\gamma = 0$ leads to a problem which only minimizes passengers' waiting time, and $\gamma = 1$ generates a problem which minimizes passengers' journey time. $M$ stands for a large number which dominates the objective function (4a), indicating that passenger flows must be served in the transit system.
Constraints (4b) impose that all passengers in a passenger flow either board vehicles or remain unsatisfied. Constraints (4c) guarantee that passenger loads on vehicles do not exceed the vehicle capacity. Constraints (4d) and (4e) ensure that the decision variables $\lambda$ and $\eta$ are non-negative.
The ILP with the crowding extension is denoted by (P−C). Beyond the objective of problem (P), a crowding penalty for transit vehicles is added to the objective function in (5a), which leads to a transit schedule and passenger boarding choices that minimize crowding levels. When $\omega = 0$, problem (5) optimizes transit schedules that minimize the total generalized journey time for passengers, given that passengers board the first available transit vehicle. When $\omega > 0$, we assume passengers can wait for later transit vehicles in order to reduce crowding levels, which contradicts passengers' behavior in reality. However, our model is able to provide a system-optimum perspective, which shows the trade-off between passengers' total waiting time and crowding levels in transit vehicles. Updated capacity constraints are shown in Constraints (5b). Constraints (5c) ensure that a vehicle can only be crowded if it is operated in the system. Constraints (5d) specify that the decision variable $z$ is binary.
(P−C) $\min_{x \in X, \ldots}$ ... $L^{p,v,s}_\tau \le C_v x^{p,v}_\tau + (\bar{C}_v - C_v)\, z^{p,v,\ldots}$
The demand matrix $u^{o,d}_t$ in problem (P) is critical, and existing literature assumes a constant demand matrix $u$ estimated from historical data. In this paper, we first introduce the stochastic programming (SP) model into the basic frequency setting problem (P) to design transit schedules that are protected against demand uncertainty. Given a set of demand scenarios $E$, the corresponding demand matrix $u_e$ for a demand scenario $e \in E$ has probability $p_e$.
By introducing demand scenarios into the frequency setting problem, we adjust the boarding decision variables for passengers to $\lambda_e = (\lambda^{o,d,p,v}_{t,\tau,e})$ for each demand scenario $e \in E$, where $\lambda^{o,d,p,v}_{t,\tau,e} \in \mathbb{R}_+$ represents the number of passengers in the passenger flow $(o, d, t)$ who board a vehicle of type $v$ that departs from the start of pattern $p$ at time $\tau$ under demand scenario $e$. Similarly, the auxiliary variables $\eta$ are extended to $\eta_e = (\eta^{o,d}_{t,e})$ for each demand scenario $e \in E$. The SP formulation of the transit line frequency setting problem is then denoted by (SP).
Problem (SP) is a stochastic programming extension of the basic optimization problem (P), in which we minimize the expected total generalized journey time and the penalties induced by unsatisfied demand across all demand scenarios. The number of variables and constraints grows linearly with the number of demand scenarios $|E|$. The ILP problem (SP) can be intractable when the number of scenarios is large. To address the computational bottleneck, we introduce the Benders decomposition approach to decompose problem (SP) into a two-stage optimization problem and solve large-scale problems more efficiently.
By applying the Benders decomposition approach to problem (SP), the master problem of the $k$-th iteration of the Benders decomposition algorithm can be formulated in terms of $(\hat{\alpha}_e, \hat{\beta}_e)$ and $(\bar{\alpha}_e, \bar{\beta}_e)$, which indicate optimal solutions and extreme rays of the dual subproblem of demand scenario $e$, respectively.
The set of optimal solutions up to the $k$-th iteration of the Benders decomposition algorithm is represented as $\hat{U}$, with a corresponding set for the extreme rays. The recoverable steps of the algorithm are:
    Terminate with infeasible stochastic transit frequency setting problem (SP)
    for demand scenario $e \in E$ do
        Solve sub-problem (SP−e) and derive the optimal solution $(\lambda^k_e, \eta^k_e)$ and dual optimal solution $(\hat{\alpha}^k_e, \hat{\beta}^k_e)$; if the sub-problem is infeasible, derive the dual extreme ray $(\bar{\alpha}^k_e, \bar{\beta}^k_e)$
        if sub-problem (SP−e) is optimal then
        if sub-problem (SP−e) is unbounded then
            Terminate with unbounded stochastic transit frequency setting problem (SP)
    if $\forall e \in E$, $\theta^k_e = f(\lambda^k_e, \eta^k_e)$ then
        return Optimal solutions $\{x^k, \lambda^k_e, \forall e \in E, \eta^k_e, \forall e \in E\}$
By applying the Benders decomposition algorithm, one small-scale ILP and $|E|$ medium-scale LPs are solved at each iteration, compared to solving a large-scale ILP directly. The proposed algorithm will produce an optimal solution within a finite number of iterations. However, convergence is not guaranteed to be fast for every instance. To further accelerate the proposed algorithm, we will integrate Benders cuts into ILP solvers; details will be discussed in the experiments section.
Besides using SP to handle demand uncertainty when setting transit frequencies, robust optimization (RO) [14] is another approach commonly used in the literature for decision-making under uncertainty. Compared to SP, where the generated transit schedules are optimal on average across demand scenarios, RO produces transit schedules which are optimized against the worst-case demand scenario. The motivation for introducing RO into transit frequency setting is intuitive: transit operators would prefer that no passengers suffer from excessive wait times under any demand scenario. To construct a robust transit schedule optimization model, we define an uncertainty set around the uncertain demand parameter $u^{o,d}_t$.
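Returning to the Benders scheme for the stochastic model, the loop of "solve master, solve one subproblem per scenario, add violated cuts" can be illustrated on a deliberately tiny instance. The sketch below is not the paper's model: it assumes a single decision `x` (number of trips), one aggregate demand per scenario, a per-trip `capacity`, and a recourse LP `min penalty*eta s.t. lam + eta = demand, lam <= capacity*x` whose dual optimum is available in closed form. Because the unsatisfied-demand variable makes every subproblem feasible, only optimality cuts appear. All function names and parameters are hypothetical, and the master is solved by enumeration rather than by an ILP solver.

```python
def dual_subproblem(x, demand, capacity, penalty):
    """Closed-form dual optimum of the toy recourse LP
        min penalty*eta  s.t.  lam + eta = demand,  lam <= capacity*x,  lam, eta >= 0.
    Returns (alpha, beta): multipliers of the demand row and the capacity row."""
    if demand > capacity * x:
        return penalty, penalty      # unmet demand is priced at the penalty
    return 0.0, 0.0                  # capacity is sufficient, recourse cost is zero


def solve_master(cuts, probs, trip_cost, max_trips):
    """Toy master problem: enumerate integer x and evaluate theta_e from the cuts."""
    best = None
    for x in range(max_trips + 1):
        # theta_e is the tightest lower bound implied by the cuts of scenario e
        theta = [max([0.0] + [const - slope * x for const, slope in cuts[e]])
                 for e in range(len(probs))]
        value = trip_cost * x + sum(p * t for p, t in zip(probs, theta))
        if best is None or value < best[0]:
            best = (value, x, theta)
    return best


def benders(scenarios, capacity, trip_cost, penalty, max_trips, tol=1e-9):
    """scenarios: list of (probability, demand). Returns (x, objective value)."""
    probs = [p for p, _ in scenarios]
    cuts = [[] for _ in scenarios]   # cuts[e] holds (const, slope): theta_e >= const - slope*x
    while True:
        value, x, theta = solve_master(cuts, probs, trip_cost, max_trips)
        cut_added = False
        for e, (_, demand) in enumerate(scenarios):
            alpha, beta = dual_subproblem(x, demand, capacity, penalty)
            recourse = alpha * demand - beta * capacity * x
            if recourse > theta[e] + tol:            # master underestimates scenario e
                cuts[e].append((alpha * demand, beta * capacity))
                cut_added = True
        if not cut_added:                            # theta_e matches every subproblem value
            return x, value


if __name__ == "__main__":
    print(benders([(0.5, 30), (0.5, 70)], capacity=40, trip_cost=100,
                  penalty=10, max_trips=5))
```

In a realistic setting the closed-form dual would be replaced by an LP solve per scenario and the enumeration by an ILP solver, which is where the per-iteration cost described above comes from.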
The uncertainty set specifies a range for the uncertain demand $u^{o,d}_t$, where demand can change to any level within the range. Transit schedules are then generated using RO techniques with respect to the worst-case demand scenario in the uncertainty set.
First, we define uncertainty sets for the uncertain parameter $u$. In this paper, we use the budget uncertainty set proposed by Bertsimas and Sim [24], which is widely used in the literature, to quantify the demand uncertainty in the transit frequency setting problem. Let $\mu^{o,d}_t$ and $\sigma^{o,d}_t$ denote the mean and standard deviation of the demand of passenger flow $(o, d, t)$ derived from historical data, respectively. The budget uncertainty set is parameterized by $\Gamma$, which controls the level of uncertainty. The budget uncertainty set implies that the demand can deviate from its historical average by at most one standard deviation, and that the total absolute deviation over all passenger flows is upper-bounded by $\Gamma$. Defining an uncertain parameter $\zeta \in \mathbb{R}^{|F|}$ and expressing $u^{o,d}_t$ in terms of $\zeta$ gives a reformulated uncertainty set $U(\Gamma)$ over $\zeta$.
With the defined uncertainty set over the demand vector $u$, we propose the robust transit line frequency setting problem (RO). Constraints (13b) in problem (RO) are equality constraints with uncertain parameters, which often restrict the feasible region drastically or even lead to infeasibility [17]. Therefore, we eliminate the variables $\eta^{o,d}_t$. The resulting problem is then reformulated using a lemma from [25] (Lemma 1), which gives a condition under which a constraint is satisfied by $x$ if and only if there exists an auxiliary variable $y$ such that $(x, y)$ satisfies an equivalent system of constraints. By applying Lemma 1 to constraints (13b) and linearizing the problem, we can derive the robust counterpart (RC) of problem (RO′), whose recoverable closing constraints are:
$L^{p,v,s}_\tau \le \bar{C}_v x^{p,v}_\tau \quad \forall p \in P, \forall v \in V, \forall s \in S_p, \forall \tau = 1, \ldots, T$; (14l)
$\lambda^{o,d,p,v}_{t,\tau} \ge 0 \quad \forall (o, d, t) \in F, \forall p \in P, \forall v \in V, \forall \tau = 1, \ldots, T$.
Constraints (14b)-(14f) are the robust counterpart corresponding to constraints (13b), while constraints (14g)-(14k) are the robust counterpart corresponding to constraints (13d).
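To make the role of $\Gamma$ in the budget uncertainty set more concrete, the sketch below computes the worst-case total demand over a set of passenger flows. It assumes the usual Bertsimas-Sim normalization, namely $u_i = \mu_i + \sigma_i \zeta_i$ with $|\zeta_i| \le 1$ and $\sum_i |\zeta_i| \le \Gamma$; this exact form is an assumption here, chosen to match the verbal description of the set. Under that assumption the maximum of a linear function over the set has a well-known greedy solution: spend the budget on the largest coefficients first. The function names are hypothetical.

```python
import math


def worst_case_deviation(weights, gamma):
    """Maximize sum_i weights[i] * zeta_i subject to |zeta_i| <= 1 and sum_i |zeta_i| <= gamma.

    Greedy solution: give full deviation to the floor(gamma) largest |weights|,
    and the fractional remainder of the budget to the next largest one.
    Assumes gamma >= 0."""
    magnitudes = sorted((abs(w) for w in weights), reverse=True)
    full = min(int(math.floor(gamma)), len(magnitudes))
    total = sum(magnitudes[:full])
    if full < len(magnitudes):
        total += (gamma - full) * magnitudes[full]
    return total


def worst_case_total_demand(mu, sigma, gamma):
    """Largest total demand reachable inside the (assumed) budget uncertainty set."""
    return sum(mu) + worst_case_deviation(sigma, gamma)


if __name__ == "__main__":
    mu = [10.0, 4.0, 6.0]      # illustrative mean demands of three passenger flows
    sigma = [5.0, 2.0, 3.0]    # illustrative standard deviations
    for gamma in (0.0, 1.5, 3.0):
        print(gamma, worst_case_total_demand(mu, sigma, gamma))
```

With $\Gamma = 0$ the set collapses to the nominal demand, and with $\Gamma$ equal to the number of flows every flow can sit one standard deviation above its mean, so $\Gamma$ interpolates between the nominal and the fully conservative case.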
Compared to problem (RO′), the robust counterpart (RC) introduces $(|F|^2 + 2|F| + 1)$ new auxiliary non-negative continuous variables and $(|F|^2 + 2|F|)$ new inequality constraints. When the number of distinct passenger flows $|F|$ is not large (e.g., below 1,000), the robust counterpart (RC) can be directly solved by off-the-shelf ILP solvers. However, problem (RC) can be intractable when $|F|$ is large (e.g., above 10,000). In the next section, we discuss the typical problem size for the transit frequency setting problem in a single-line context and propose methods to handle large-scale transit frequency setting problems.
In this section, we propose methods for improving the tractability of the robust transit frequency setting problem given a large-scale demand matrix. The methods reduce the size of the demand matrix and make the robust counterpart (RC) tractable for off-the-shelf ILP solvers.
Take a bus line and a rail line of the Chicago Transit Authority (CTA) as examples. The inbound direction of the CTA Blue Line includes 33 stations in total, which leads to 528 distinct OD pairs for passengers. When solving the transit frequency setting problem over a one-hour time interval with 12 decision time periods of length $\Delta = 5$ minutes, the number of passenger flows is $|F| = 6,336$. Formulating the robust counterpart (RC) introduces 40,157,569 new continuous variables, which is a large-scale problem but might still be solvable by a powerful machine within days. On the other hand, the northbound direction of the CTA Route 49 bus contains 82 stops overall, which gives 1,176 distinct OD pairs for passengers. Under the same setting as the Blue Line, there will be 14,112 unique passenger flows and the robust counterpart (RC) introduces 199,176,769 new continuous variables. A problem of this size is intractable because there is not enough memory to load the ILP.
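The size arithmetic above can be checked with a few lines of code. This is only a sketch of the counting argument: it assumes that every ordered pair of an upstream stop and a later downstream stop on a one-directional line is an OD pair, and it uses the variable count $|F|^2 + 2|F| + 1$ stated for the robust counterpart. The function names are hypothetical.

```python
def od_pairs(n_stops):
    """Number of (origin, destination) pairs with the destination downstream of the origin."""
    return n_stops * (n_stops - 1) // 2


def passenger_flows(n_od_pairs, n_periods):
    """Number of passenger flows (o, d, t): one per OD pair and decision time period."""
    return n_od_pairs * n_periods


def rc_new_variables(n_flows):
    """New continuous variables introduced by the robust counterpart: |F|^2 + 2|F| + 1."""
    return n_flows ** 2 + 2 * n_flows + 1


if __name__ == "__main__":
    # Blue Line example from the text: 33 stations, 12 periods of 5 minutes.
    pairs = od_pairs(33)
    flows = passenger_flows(pairs, 12)
    print(pairs, flows, rc_new_variables(flows))
```

Since $|F|^2 + 2|F| + 1 = (|F| + 1)^2$, the number of new variables grows quadratically in the number of passenger flows, which is why shrinking the demand matrix has such a large effect on tractability.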
These two instances imply that large-scale demand matrices commonly exist in practice, and methods need to be designed to reduce the size of demand matrices in robust transit frequency setting problems.
The first method for reducing demand matrices comes from the following observation: transit demand matrices are sparse, and only a subset of passenger flows are actually used by passengers. Passengers using transit services have clear spatial and temporal patterns, which lead to the sparsity of demand matrices. When applying this reduction (Proposition 1) to the robust problem with the average demand $\mu$, however, the problem dimensionality can still be large. Considering the demand data from one month, a passenger flow $(o, d, t)$ has to be incorporated in $\bar{F}$ if it has demand on at least one day. We use a probabilistic scenario to better explain this issue. If a passenger flow $(o, d, t)$ has a 90% probability of having zero demand on a given day, the probability of not having a positive mean demand over 30 days is $0.9^{30} = 4.24\%$. When considering a month of demand data, the probability of excluding the passenger flow $(o, d, t)$ from the problem shrinks from 90% to 4.24%, indicating that Proposition 1 is not effective for the robust problem when considering demand data from multiple days.
To further reduce the number of passenger flows in transit frequency setting problems, we propose a stop consolidation algorithm to combine stops into stop groups and construct new demand matrices based on stop groups. The proposed stop consolidation algorithm is described in Algorithm 2. The underlying idea is to calculate an activity score for each stop and combine the pair of consecutive stops with the lowest combined activity score. The activity score for stop $s$ is defined as a weighted sum of the boarding and alighting demand at $s$, where $\omega_1, \omega_2$ stand for parameters controlling the weights of boarding activities and alighting activities, respectively. In the transit frequency setting problem, wait times are the main focus, and they are driven by boarding activities.
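The consolidation idea just described (score every stop, then repeatedly merge the adjacent pair with the lowest combined score) can be sketched as follows. Since the full algorithm listing is not reproduced here, several details are assumptions made for illustration: the activity score is taken as `w_board` times boardings plus `w_alight` times alightings, demand is given as an OD dictionary aggregated over time, scores are recomputed from the original stop-level demand after each merge, and trips that start and end inside the same stop group are dropped from the consolidated matrix. All names are hypothetical.

```python
def activity_scores(groups, demand, w_board=1.0, w_alight=1.0):
    """Score of each stop group: weighted boardings plus weighted alightings."""
    group_of = {stop: g for g, stops in enumerate(groups) for stop in stops}
    scores = [0.0] * len(groups)
    for (origin, dest), u in demand.items():
        scores[group_of[origin]] += w_board * u
        scores[group_of[dest]] += w_alight * u
    return scores


def consolidate_stops(stops, demand, k_target, w_board=1.0, w_alight=1.0):
    """Merge consecutive stops until only k_target stop groups remain.

    stops:  stops in service order along the line
    demand: dict mapping (origin_stop, destination_stop) to demand
    Returns (groups, consolidated_demand), where consolidated_demand is keyed by
    (origin_group_index, destination_group_index)."""
    groups = [[s] for s in stops]
    while len(groups) > max(k_target, 1):
        scores = activity_scores(groups, demand, w_board, w_alight)
        # adjacent pair with the lowest combined activity score (first one on ties)
        i = min(range(len(groups) - 1), key=lambda j: scores[j] + scores[j + 1])
        groups[i:i + 2] = [groups[i] + groups[i + 1]]
    group_of = {stop: g for g, stops_in_g in enumerate(groups) for stop in stops_in_g}
    consolidated = {}
    for (origin, dest), u in demand.items():
        go, gd = group_of[origin], group_of[dest]
        if go != gd:  # assumption: trips inside a single group are dropped
            consolidated[(go, gd)] = consolidated.get((go, gd), 0.0) + u
    return groups, consolidated


if __name__ == "__main__":
    line = ["A", "B", "C", "D"]
    od = {("A", "D"): 10, ("A", "B"): 5, ("B", "C"): 1, ("C", "D"): 2}
    print(consolidate_stops(line, od, k_target=3))
```

Raising `w_board` relative to `w_alight` protects stops with many boardings from being merged, which matches the emphasis on wait times in the frequency setting problem.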
Parameters $\omega_1$ and $\omega_2$ are designed to reflect the relative importance of boarding and alighting activities. The activity score is the total demand related to stop $s$. A low activity score indicates that fewer passengers board and alight at stop $s$. The proposed stop consolidation algorithm reduces the number of stops while keeping the impact on the original problem as small as possible.
Input: Set of stops $S$, desired stop number $k^*$, demand matrix $u$
Output: Updated set of stops $\bar{S}$, updated demand matrix $\bar{u}$
Overall, the two proposed methods help to solve transit frequency setting problems with large-scale demand matrices. The first approach maintains optimality, whereas the stop consolidation algorithm is a heuristic method which could lead to sub-optimal solutions.
In this paper, an ILP is formulated to solve the transit frequency setting problem, and an extended model considering crowding levels on transit vehicles is proposed as well. To handle demand uncertainty within the problem, a stochastic transit frequency setting model and a robust transit frequency setting model are introduced. A Benders decomposition algorithm is designed to solve the stochastic transit frequency setting model efficiently. To solve the robust transit frequency setting model under large-scale demand matrices, two techniques for reducing problem dimensions are described: i) solving a reduced-form model, and ii) implementing a heuristic-based stop consolidation method. The experiments for testing model performance on real-world instances are under development. The draft paper will be updated once the experiments section is finished.
Urbanization
Sources of greenhouse gas emissions
American Public Transportation Association
Why working from home will stick
Planning, operation, and control of bus transport systems: A literature review
A review of urban transportation network design problems
Public Transit Planning and Operation
Dispatching Policies for a Transportation Route
Setting Frequencies on Bus Routes: Theory and Practice
Optimal allocation of service frequencies over transit network routes and time periods
Stretching resources: sensitivity of optimal bus frequency allocation to stop-level demand elasticities
Expected value model for optimizing the multiple bus headways
Introduction to Stochastic Programming, ser. Springer Series in Operations Research and Financial Engineering
Robust optimization, ser. Princeton series in applied mathematics
The sample average approximation method for stochastic discrete optimization
Data-driven robust optimization
A practical guide to robust optimization
Robust optimization model of bus transit network design with stochastic travel time
Robust Path Recommendations During Public Transit Disruptions Under Demand Uncertainty
Robust matching-integrated vehicle rebalancing in ride-hailing system with uncertain demand
Data-driven vehicle rebalancing with predictive prescriptions in the ride-hailing system
From predictive to prescriptive analytics
The price of robustness
Robust and adaptive optimization. Belmont, Massachusetts: Dynamic Ideas LLC
The authors would like to thank the Chicago Transit Authority (CTA) for making data available for this research.
t,τ ∀(o, d, t) ∈ F, ∀ζ ∈ U(Γ). (12)
Substituting Constraints (12) into the objective function (13a) and introducing a dummy variable ω transforms problem (RO) into a problem formulation without equality constraints: