Bicycle Commuting in Melbourne During the 2000s Energy Crisis: A Semiparametric Analysis of Intraday Volumes Melbourne Business School From the SelectedWorks of Michael Stanley Smith 2011 Bicycle Commuting in Melbourne During the 2000s Energy Crisis: A Semiparametric Analysis of Intraday Volumes Michael S Smith, Melbourne Business School Goeran Kauermann Available at: https://works.bepress.com/michael_smith/29/ https://mbs.edu/home https://works.bepress.com/michael_smith/ https://works.bepress.com/michael_smith/29/ Bicycle Commuting in Melbourne During the 2000s Energy Crisis: A Semiparametric Analysis of Intraday Volumes Michael S. Smitha,� and Göran Kauermannb a Melbourne Business School, University of Melbourne b Department of Business Administration and Economics, University of Bielefeld 8th July 2011 To Appear in the Journal of Transportation Research Part B: Methodological � Corresponding Author; address for correspondence: Professor Michael Smith, Melbourne Busi- ness School, University of Melbourne, 200 Leicester Street, Carlton, VIC 3053, Australia. Email: mike.smith@mbs.edu 1 Abstract Cycling is attracting renewed attention as a mode of transport in western urban environ- ments, yet the determinants of usage are poorly understood. In this paper we investigate some of these using intraday bicycle volumes collected via induction loops located at ten bike paths in the city of Melbourne, Australia, between December 2005 and June 2008. The data are hourly counts at each location, with temporal and spatial disaggregation allowing for the impact of meteorology to be measured accurately for the first time. Moreover, during this period petrol prices varied dramatically and the data also provide a unique opportunity to assess the cross-price elasticity of demand for cycling. Over-dispersed Poisson regression models are used to model volumes at each location and at each hour of the day. Seasonality and the impact of weather conditions are modelled as semiparametric and estimated using recently developed multivariate penalized spline methodology. Unlike previous studies that use aggregate data, the empirical results show a substantial meteorological and seasonal com- ponent to usage. They also suggest there was substitution into cycling as a mode of transport in response to increases in petrol prices, particularly during peak commuting periods and by commuters originating in wealthy and inner city neighbourhoods. Last, we extend the approach to a multivariate longitudinal count data model using a Gaussian copula estimated by Bayesian data augmentation. We find first order serial dependence in the hourly volumes and a ‘return trip’ effect in daily bicycle commutes. Key Words: Cross-Price Elasticity; Discrete Copula Model; Generalized Mixed Models; Mode of Transport; Multivariate Penalized Spline Smoothing; Count Data Model. 1 Introduction Cycling offers many advantages as a mode of transport in western society. These include zero marginal emissions of greenhouse gases and pollutants, reduced traffic congestion and numerous health benefits, including reduced rates of obesity; see Pucher, Dill & Handy (2010) for extensive evidence on the health benefits of cycling. All of these are major contempo- rary issues, and it has lead to an increase in interest in the determinants of bicycle usage; for example, see Rietveld & Daniel (2004), Wardman, Tight & Page (2007) Hunt & Abra- ham (2007) and Heinen et al. (2010). To date, studies in the transportation literature have almost exclusively focused on the analysis of survey or experimental data which allow for the study of individual level factors on the decision to cycle; for example, see Nankervis (1999), Stinson & Bhat (2004), Shannon et al. (2006) and Xing et al. (2010). Analysis of data on cycling volumes that are aggregate with respect to individuals, but disaggregate over time and location, has not been attempted for two reasons. The first is a lack of appropriate data, while the second is the difficulty in controlling for the complex impact of weather conditions. This paper addresses both problems by exploiting a new and unique source of data, and by employing multivariate penalized spline smoothing to account for the impact of meteorology and seasonality. We analyse hourly cycling volumes at ten locations on dedicated bicycle paths in the city of Melbourne in Australia. These locations have been selected to provide an accurate picture of inner-city bicyle commutes and are collected automatically using recently installed induction loops placed under each path; in this sense the data are both comprehensive and highly accurate.1 We examine hourly cyclist counts from December 2005 to June 2008, which corresponds to a period in which oil prices experienced extreme levels of volatility and finishes just before the peak of US$147.29 per barrel on 11 July 2008.2 As a result, during this period 1These counters have been installed by VicRoads and they believe these count the number of cyclists who pass over them with over 95% accuracy. 2This is the all time peak at the time of writing. 1 average Melbourne unleaded petrol pump prices varied between a minimum of 108.5 cents per litre to 161.0 cents per litre. Therefore, this data provides a unique opportunity to assess the cross-elasticity of the demand for cycling with respect to petrol prices. We model hourly counts using over-dispersed Poisson regression models. Each hour of the day is modelled separately, thereby controlling for the strong diurnal variation in all aspects of the model. Such a separation is also common in the intraday modelling of demand for flow commodities, such as electricity (Fiebig, Bartels & Aigner 1991; Cottet & Smith 2003). The mean of each Poisson model contains linear effects for trend, day type and the logarithm of a measure of Melbourne petrol prices, so that the corresponding cross-elasticity is constant. The effect of seasonality and intraday weather conditions are accounted for as semiparametric effects. While transport researchers suspect weather and seasonality affect the propensity to cycle in quite different ways than other modes of transport (Heinen et al. 2010), to date there is only limited evidence to this effect. For example, Nankervis (1999) and Shannon et al. (2006) find minor and no weather effects, respectively, using limited datasets on Australian univer- sity students. In our study, by matching hourly observations from local weather stations with those from the induction loops we find that both immediate weather conditions and season are major determinants of bicycle commutes. In each Poisson model the impact of weather is modelled as an unknown multivariate function of six meteorological variables. We follow an approach advocated in the environmental (Shively & Sager 1999) and econometric (Pana- giotelis & Smith 2008) modelling literatures and simplify the effect as additive in univariate main effects and all pairwise bivariate interactions. This results in 21 unknown univariate and bivariate functions which are estimated jointly using penalized spline smoothing. Here, each function is modelled using a high-dimensional basis representation with unknown coef- ficients that are penalized with a quadratic penalty to ensure a smooth fit. Such an approach has a long history in the statistical literature, see Wahba (1978), Wecker & Ansley (1983) and O’Sullivan (1988) for early examples, but has seen much recent attention and extension; see the overviews by Ruppert, Wand & Carroll (2003; 2009) and Wood (2006). 2 We follow Wahba (1978), Wong & Kohn (1996), Wood (2003), Wand (2003) and Lang & Brezger (2004) and interpret the penalties as Gaussian prior distributions for the basis coefficients, so that the Poisson model can be interpreted as a generalized linear mixed model (GLMM). Scalar ‘smoothing parameters’ control the level of smoothing of each function, and these can be estimated from the data using a marginalized penalized likelihood derived from a Laplace approximation to the full penalized likelihood. The use of the Laplace approximation in such contexts is justified asymptotically in Kauermann, Krivobokova & Fahrmeir (2009) and Rue, Martino & Chopin (2009). The use of approximations to a full likelihood function is increasingly popular when the full likelihood function is computationally intractable. For example, in the transportation literature Ferdous et al. (2010) approximate the likelihood of a multivariate ordered probit model as a composition of bivariate ordered probit model likelihoods, which is easily maximized to provide a consistent estimator. Following Wager, Vaida & Kauermann (2007) we show how the marginalized likelihood can be used to com- pute a marginal Akaike Information Criteria (AIC). Using this criteria we adopt a stepwise algorithm to identify spurious components in the additive decomposition of the meteoro- logical effect. The end result is a flexible, smooth and parsimonious representation of the multivariate effect of meteorology on cycling volumes that can be estimated joint with other aspects of the Poisson model. Only by carefully controling for the complex nonlinear impact of weather and season can reliable estimates of the cross-elasticities of demand with respect to petrol prices be obtained. We find that both over-dispersion and intraday serial dependence are present in the hourly volumes. To account for the latter we construct a 14-dimensional copula model for the longitudinal vector of hourly volumes during the day. Copula models are multivariate models that can be constructed from arbitrary marginal models, with any dependence captured by a copula function. Copula modeling is increasingly popular throughout the physical and social sciences, and has been employed previously in transport studies by Bhat & Eluru (2009) and Sener et al. (2010). In our study the margins are the over-dispersed Poisson regression 3 models, while a Gaussian copula function is used to account for the intraday dependence. The Gaussian copula function is used here because it allows for both positive and negative dependencies between counts at all hours, and can be used in higher dimensions unlike many alternatives. The end result is a multivariate longitudinal Poisson model for dependent counts, where the semiparametric exogenous effects have forms that vary over each hour. As highlighted by Danaher & Smith (2011) it is not widely appreciated that many existing latent variable based count models can be viewed as special cases of a copula model. For example, Song (2000) expresses the multivariate probit model as a Gaussian copula. Similarly, the correlated count model of Herriges et al. (2008) is an example of a Gaussian copula model. As discussed in Danaher & Smith (2011) because the margins are discrete-valued, es- timation of the dependence parameters of the copula using direct maximum likelihood is computationally impractical for a 14-dimensional Gaussian copula. We therefore follow Pitt, Chan & Kohn (2006) and Danaher & Smith (2011) and estimate the Gaussian copula us- ing data augmentation and Bayesian Markov chain Monte Carlo (MCMC). The approach of estimating the marginal models first, and then the copula parameters in a second step, is widely employed in the copula literature; see Cherubini, Luciano & Vecchiato (2004) and Silva & Lopez (2008) for discussions. Estimates of pairwise Spearman correlations reveal strong first order serial dependence between the hourly volumes, along with additional de- pendence between volumes during the morning and evening commuting peaks. The latter is a ‘return trip’ effect evident in the counts, but caused at the individual level where people who commute by bicycle to work are more likely to return by the same mode of transport than would otherwise be the case. To show the flexibility of the copula model we compute the resulting distribution of total daily volumes for any given day, and also the forecasts of evening volumes given those observed in the morning. The paper continues by outlining the data, and then the over-dispersed Poisson model. Multivariate penalized splines are discussed, along with their interpretation as a generalized linear mixed model. We show how to estimate the smoothing parameters using the marginal- 4 ized likelihood, which is derived in greater detail in Appendix A. Using the marginalized likelihood we also define the marginalized AIC criteria. The intraday analysis of the effects of weather, season and petrol prices on bicycle commutes follow, along with an outline of the copula model and the rich intraday serial dependence structure uncovered. Appendix B provides a summary of the Bayesian MCMC scheme used to estimate the Gaussian copula models, while a discussion concludes. 2 Data 2.1 Bicycle Usage in Melbourne The cycling data we examine were obtained from VicRoads, which is a branch of the state government of Victoria. As well as responsibility for the management of the state road net- work, the organisation oversees other transport initiatives.3 These include the development and implementation of cycling programs throughout Victoria, with the aim of increasing participation in cycling as part of an integrated transport network (Austroads Inc., 2005). Melbourne is the capital of Victoria, with a population of 3.81 million people in 2007, which makes up 73.1% of the state total.4 The city is located at the mouth of the Yarra river with a flat geography which is well-suited to cycling, and possesses an extensive bicycle network. Nevertheless, cycling is still only a minor mode of transport in Melbourne, comprising only 1.5% of total trips to work in 2006 (Bonham & Suh 2008), and the city has a high level of car dependency and a growing traffic congestion problem (Clarke & Hawkins 2005). From 2005 VicRoads began to install inductive loop counters on off-road bicycle paths at key locations surrounding the Melbourne central business district (CBD). These count bicycle volumes with a high degree of accuracy at an intraday level. We examine hourly counts of cyclists at the ten locations where loops were first installed and where the data is most complete. Table 1 lists these locations, along with the orientation of the location 3For further information visit their extensive website at www.vicroads.vic.gov.au. 4These are Australian Bureau of Statistics estimates at 30 June 2007; ABS, ‘Population by Age and Sex, Regions of Australia’ (ABS cat. no. 3235.0). 5 in relation to the CBD and the direct line distance between the loop and the General Post Office (GPO) building located in the centre of the CBD. The number of bicycle trips is low both between 19:00 – 06:00 and during weekends, and we focus here only on weekday trips between 06:00 – 19:00 on working days, which are dominated by daily commutes. Table 1 lists mean hourly counts during these hours, and also broken down by the morning peak period 06:00 – 09:00 and evening peak period 16:00 – 19:00. Also listed are the number of observations and time frame over which the data are available.5 —–Table 1 about here—– 2.2 Spatial Variation Melbourne has a radial transport network (Clarke & Hawkins 2005), and the loops have been located to count commuting trips to and from the CBD and immediate locale. There is a strong segmentation of the city suburbs into four quadrants radiating from the CBD, which roughly correspond to the four directions of the compass. This division is based upon topology, patterns of urban development and local government boundaries, and is associated with substantial differences in the socio-economic status of residents. The southern and eastern suburbs are more affluent than the northern and western suburbs, with average incomes of residents up to 50.5% higher (ABS 2005).6 Each loop is located on an off-road cycle path which radiates between the CBD and from suburbs in one of the four quadrants, as listed in Table 1. Using this information, we create several aggregate counts based upon spatial location. The first two are based upon the direction of origin (the south and east suburbs, SE, and the northern suburbs, N), while a third is for inner city loops (IC) closest to the CBD. In creating these counts, we exclude 5We exclude the annual ‘Ride to Work Day’ and the days during the 2008 Melbourne Commonwealth Games. 6The 2005 average total income of wage earners for the Melbourne ABS statistical regions are: Western Melb. $42,358; Moreland City (Inner North) $40,907; Northern Middle Melb. $42,985; Boroondara City (Inner East) $61,555; Eastern Middle Melb. $45,731; Southern Melb. $51,695. 6 the three loops with the highest proportion of missing data (numbers 1, 4 and 8), as well as omitting days where any of the component loops in the cumulative counts were missing. Last, we also create a city-wide count (ALL) at the seven loops with the most complete data. Summaries for these four cumulative counts are also provided in Table 1. 2.3 Seasonality and Time of Day Figure 1 contains boxplots of the hourly city-wide counts (ALL) broken down by both season and time of day. The four panels correspond to the southern hemisphere seasons of Spring (September to November), Summer (December to February), Autumn (March to May) and Winter (June to August). Seasonal variation is apparent, with lower volumes being observed in winter compared to summer. Counts vary strongly according to the time of day, and peak during the morning (07:00 – 09:00) and evening (16:00 – 19:00) commuting periods. —–Figures 1 and 2 about here.—– 2.4 Meteorology and Cost of Fuel Even a cursory examination of this data reveal that meteorological conditions are a key determinant of bicycle commutes. For example, Figure 2 plots the hourly counts at the St Georges Road loop on the 21 November in both 2006 and 2007. Both days were working weekdays, yet counts in 2007 were substantially lower than in 2006. This is likely due to meteorological variation, with it being cool and raining between 07:00 and 19:00 on this day in 2007, yet hot and dry on 2006. The relationship between weather conditions and the propensity to cycle is indiscernible in previous studies that use either individual level survey data (Shannon et al. 2006; Hunt & Abraham 2007), or data that is highly aggregated with respect to time or location (Nankervis 1999; Rietveld & Daniel 2004). Hourly data was obtained on meteorological variables from the Australian Bureau of Mete- orology observed at the three Melbourne metropolitan weather stations located closest to the 7 loops.7 The variables considered were temperature (TEMP), humidity (HMD), windspeed (WIND) and rainfall (RAIN). To account for a small number of missing observations, and minor spatial variation, we employ average measurements across the three stations. The manner in which meteorological conditions affect the decision to cycle to work (rather than switch to other modes of transport) is complex, and to help capture this we introduce two additional variables. The first is a measure of recent rainfall (RRAIN), which is the to- tal rainfall in the current and two previous hourly periods, while the second is the daily maximum temperature (MTEMP). The petrol price data we examine are average weekly pump prices of unleaded petrol ob- served at retail outlets in the Melbourne metropolitan region.8 Unleaded petrol is by far the largest category of fuel sold in Australia, and has a price that is almost perfectly correlated with that of other fuels, such as diesel, LPG and higher octane unleaded petrol. Figure 3 plots the average real weekly petrol price in July 2008 Australian dollars9 (PETROL) and the total weekday counts at all loops. The linear correlation between real petrol prices and total daily counts is statistically insignificant, with any relationship being hard to determine because of the very high variance in counts. Last, we note here that while the automobile is by far the most popular mode of transport in Melbourne, the city also has an extensive integrated public transport system with a single ticketing system. However, the real price of public transport usage remained constant during the period 10 and we therefore do not include this in our model. —–Figure 3 about here—– 7The stations were Essendon Airport, Viewbank and Moorabbin Airport. Melbourne CBD was not used because its measurements are known to be unreliable due to the surrounding high rise buildings. 8This data are constructed by the Australian monitoring organization FuelTrac; see www.fueltrac.com.au. 9The price deflator used was the Australian Bureau of Statistics quarterly consumer price index for all groups. 10Prices were increased inline with inflation three times during the period: on 1 January 2006, 3 June 2007 and 1 January 2008. 8 3 Modelling Usage 3.1 Poisson Regression Model We model counts separately at each hour of the day for each loop, or cumulative count. Denoting yi as the ith observation of an hourly count, for i = 1, . . . ,n days, we employ an over-dispersed Poisson regression model with moments E(yi|ηi) = μi = exp(ηi) , Var(yi|ηi) = μiφ, (3.1) and φ ≥ 1 an unknown dispersion parameter (Wedderburn 1974). Estimation is undertaken using the Fisher scores (see McCullagh & Nelder 1989; p.40) ỹi = ηi + (yi − μi)/μi , for i = 1, . . . ,n, (3.2) where E(ỹi) = ηi and Var(ỹi) = φ/μi. Following many previous authors we compute inference using quasi-maximum likelihood (QMLE) assuming normality for ỹi, so that ỹi ∼ N(ηi,φ/μi). (3.3) Fisher scoring proceeds iteratively by (i) calculating ỹ = {ỹ1, . . . , ỹn} and (ii) fitting η = {η1, . . . ,ηn} using the Gaussian likelihood at equation (3.3) for given ỹ; see Wolfinger & O’Connell (1993) for a discussion. We model the mean as ηi = z ′ iα + s(ti) + m(xi) , (3.4) where zi is a vector of linear effects with coefficients α, s(ti) is a smooth seasonal effect with respect to the time of year 0 ≤ ti < 1, and m(xi) is an unknown multivariate function of the six meteorological variables xi = (xi1, . . . ,xi6). This is an extension of the generalized 9 linear model (Wood 2006; Chapter 3) to include flexible semiparametric components s and m. In the linear component we include an intercept, linear time trend, five day type dummy variables for Tuesday to Friday and for school holidays. We also include log(PETROL), so that the cross-price elasticity of the mean ∂μi ∂PETROLi × PETROLi μi = αP is constant, resulting in a total of 8 linear terms. While it is possible to also model the time trend as semiparametric, we do not do so because it would only be identifiable from the seasonal component by the smoothness parameter of s. Similarly, the elasticity can also be modelled as semiparametric, but we do not do so here because it is both a weak effect, and is harder to interpret if nonlinear. The effect of weather upon counts is complex and highly nonlinear. To simplify the problem we assume additivity up to bivariate interactions between meteorological variables, so that m(xi) = 6∑ j=1 mj(xij) + 6∑ j=1 ∑ l>j mjl(xij,xil) , (3.5) where mj and mjl are univariate and bivariate smooth functions, respectively. Similar ad- ditivity assumptions have been used to model the effect of meteorological variables upon ozone levels (Shively & Sager 1999) and intraday electricity demand (Panagiotelis & Smith 2008). The additive decomposition at equation (3.5) involves the estimation of 21 unknown functions and we identify a more parsimonious representation in a data-based fashion. Let I = {1, 2, . . . , 6} be the index set of the meteorological variables. We denote with S ⊂ I the set of indices where mj are nonlinear functions, and with B = {(i,j); i �= j, i ∈ S, j ∈ S} the set of non-zero bivariate functions. Further denoting the model M = {S, B}, we define 10 a parsimonious representation of equation (3.5) as m(xi; M) = ∑ j∈I xijβj + ∑ j∈S mj(xij) + ∑ (j,l)∈B mjl(xij,xil). (3.6) 3.2 Penalized Splines We discuss determination of M in Section 3.4, but outline here how the unknown semi- parametric components in equation (3.6) can each be expressed as penalized splines for given M. We represent each unknown function as a linear combination of thin plate spline ba- sis terms (Wahba 1990, pp.30–34), which also corresponds to the popular cubic smoothing spline basis for the univariate functions. We also use the same basis for the seasonal effect s in equation (3.4), but with periodicity enforced on the basis terms so that s(0) = s(1). The functions are therefore s(ti) = b̃s(ti) ′ṽs , mj(xij) = b̃j(xij) ′ṽj , and mjl(xij,xil) = b̃jl(xij,xil) ′ṽjl , (3.7) for j ∈ S and (j, l) ∈ B. The vectors b̃s, b̃j, b̃jl are bases evaluated at the ith observation, and ṽs, ṽj, ṽjl are coefficient column vectors for which we employ quadratic penalty terms. For the univariate functions, this penalty is equivalent to the integrated squared second order derivative of the function (Wahba 1978; Green & Silverman 1994, p.13). For the bivariate functions we penalize the sum of the integrated squared elements of the Hessian matrix (Wood 2003). The knots of the thin plate basis terms are located at all unique observations of the indepen- dent variables, so that there can be up to n terms in the basis. To reduce the computational burden we follow Hastie (1996) and Wood (2003) and employ low rank smoothing. For each function, this involves a spectral decomposition of the design matrix for the basis decompo- sition. The k eigenvectors corresponding to the k largest eigenvalues are used as a reduced 11 rank basis, so that we redefine the functions in equation (3.7) as s(ti) = bs(ti) ′vs , mj(xij) = bj(xij) ′vj , and mjl(xij,xil) = xijxilβjl + bjl(xij,xil) ′vjl . (3.8) Here, βjl is an unpenalized linear interaction coefficient and vs,vj,vjl are basis coefficient k-vectors for the reduced rank basis terms bs,bj,bjl. The reduced rank basis coefficients have quadratic penalties v′sDsvs, v ′ jDjvj and v ′ jlDjlvjl, where Ds,Dj and Djl are diagonal matrices containing the k largest eigenvalues for the spectral decomposition of each basis. Such an approach is similar to that advocated by Eilers & Marx (1996), Ruppert, Wand & Carroll (2003), Lang & Brezger (2004) and others who directly select k basis terms and an associated quadratic penalty with non-diagonal matrices. We follow Wood (2006, p.161) and set k = 10 for the univariate functions and k = 30 for the bivariate functions, and there is substantial evidence that selecting k larger has little influence on the fit; see Ruppert (2002) or Kauermann & Opsomer (2011) for discussions. Let λs,λj,λjl be smoothing parameters for functions s,mj,mjl, respectively. For model M, let θM denote both the basis term coefficients and linear terms for all components in equations (3.4), (3.6) and (3.8), and λM be the set of all smoothing parameters. Then the penalized quasi-log-likelihood takes the form lp(θM,λM) = l(θM) − 1 2 ∑ j∈S λjv ′ jDjvj − 1 2 ∑ (j,l)∈B λjlv ′ jlDjlvjl − 1 2 λsv ′ sDsvs , (3.9) where l(θM) = ∑n i=1 li(θM) is the unpenalized quasi-log-likelihood from equation (3.3) for a given model M. Conditional upon the smoothing parameters λM, maximization of equa- tion (3.9) with respect to θM can be undertaken via Fisher scoring as discussed in Section 3.1. The estimation of λM and selection of M is discussed below. 12 3.3 Generalized Linear Mixed Model Following Wahba (1978), Wong & Kohn (1996), Wood (2003) and others we interpret the penalties in equation (3.9) as Gaussian priors for the reduced rank basis term coefficients, so that vs ∼ N(0,λ−1s D−1s ), vj ∼ N(0,λ−1j D−1j ) and vjl ∼ N(0,λ−1jl D−1jl ) , (3.10) thereby reformulating the problem as a GLMM. Estimation proceeds by maximizing the log- arithm of the marginal likelihood lm(α,βM,λM; M) obtained by integrating out the reduced rank basis coefficients. This integration can be undertaken based on the Laplace approxima- tion as suggested by Breslow & Clayton (1993) and outlined in more detail in Appendix A. Asymptotic results supporting this approach in the penalized spline smoothing context can be found in Kauermann, Krivobokova & Fahrmeir (2009) and Rue, Martino & Chopin (2009). Maximization of the Laplace approximation of lm with respect to {α,βM,λM} is pursued as part (ii) of the Fisher scoring algorithm outlined in Section 3.1 using the standard steps for a linear mixed model (LMM) listed in Harville (1977); also see Appendix A for more details. Re-calculation of ỹ1, . . . , ỹn and refitting the LMM provide the essential steps of the Fisher scoring algorithm. The smoothing parameters λM are reciprocals of prior variances which are estimated following optimisation. The procedure is implemented numerically in regular statistical software, where we make use of the ‘gamm’ procedure provided in the package ‘mgcv’ in the statistical language ‘R’.11 Finally, the semiparametric function estimates can be computed via penalized least squares estimation of the basis term coefficients in equa- tion (3.9). The benefits for writing the model in the mixed model framework are twofold. First, we directly obtain estimates for the smoothing parameters λM, thereby enabling com- putation of the function estimates. Second, we can also use the mixed model framework for model selection as discussed below. 11Implementation of GLMMs and smoothing in R is discussed by Wood (2006). 13 3.4 Model Selection The index set M = {S, B} determines the functional form of the meteorological effect m in equation (3.6), and we determine this from the data. We use the AIC (Akaike 1974) computed using the marginal likelihood; a criterion that is labelled marginal AIC by Vaida & Blanchard (2005) and Wager, Vaida & Kauermann (2007), and has been recently extended to GLMMs by Lavergne, Martinez & Trottier (2008). For model M, let lm(α̂, β̂M, λ̂M; M) be the maximum of the logarithm of the (Laplace approximation) to the marginal likelihood at step (ii) of the Fisher scoring algorithm. Then we define the marginal AIC as mAIC(M) = −2lm ( α̂, β̂M, λ̂M; M ) + 2q , (3.11) where q is the number of components (linear or nonlinear) in the model at equation (3.6). We could calculate mAIC for all possible models M, but we constrain the search strategy as follows. First, for numerical stability we do not consider models with bivariate effects in the pairs (RAIN,RRAIN) and (TEMP,MTEMP) because they are highly correlated; the latter particularly during afternoon hours. Second, to reduce the computational requirements, we use a stepwise algorithm to traverse the space of possible models. We start from the model S = {1, . . . , 6} and B = ∅, and in the forward selection step we successively include components in B that reduce mAIC. In the backward selection step we successively reduce the smooth components in set S (but only those which are not also a component in B) if this reduces mAIC. 4 Empirical Analysis 4.1 Determinants of Bicycle Usage We estimate the Poisson model for counts observed at all sites, and also for the four cumulative counts. Substantial over-dispersion was found throughout, particularly during 14 the evening peak period. For example, Table 2 reports the estimates of φ for ALL, which vary from a minimum of 3.4 at 06:00 to a maximum of 13.7 at 17:00. Table 2 also reports the estimated time trends for the four cumulative counts. There is a strong underlying upwards trend in usage throughout the period, particularly during the morning and evening commuting periods and for trips originating from the northern and inner city suburbs. —–Table 2 about here—– Table 3 summarises the models identified with the maximum mAIC values for the cumu- lative count ALL at each time of the day. For each univariate meteorological variable we denote the linear effects with ‘L’ and nonlinear effects with ‘S’; non-zero bivariate interaction effects are denoted with ‘B’. A substantial degree of parsimony is identified, with only a few of the bivariate interaction effects non-zero, and many of the univariate components linear. Similar parsimonious representations are also determined for the other count variables. During the early morning peak (06:00 – 08:00) the effect of meteorology is well represented as additive in the six univariate components, with a nonlinear MTEMP effect throughout. Figure 4 plots the estimated meteorological effects for these three morning periods. The most important variables (as measured by function range) are MTEMP, RAIN, RRAIN and WIND, with the effects largely consistent during the three hours. Interestingly, it is not current air temperature, but daily maximum temperature, which usually occurs later between 14:00 and 17:00 in Melbourne, that proves important. In Section 5.2 we also show that there is strong dependence between cycling volumes during the morning and evening peak commuting periods. This is likely to be induced by strong dependence between mode of transport choice for the commuting trips to and from the workplace. Individuals therefore appear forward-looking in their decision-making and consider the forecast daily maximum temperature before selecting cycling as a mode of transport for the morning commute to work. A maximum temperature between 25 and 35oC is optimum, with less bicycle trips undertaken when MTEMP is outside this range. Higher levels of recent rainfall RRAIN 15 results in fewer cyclists; a similar effect is found with windspeed. Confidence bands are also given for the estimates at 06:00, which are similar to those for other times of the day. Estimates of the functional forms of RAIN and RRAIN have wide confidence intervals because of the small proportion of non-zero observations (9.9% and 17.8%, respectively). During the evening peak period (16:00 – 19:00) the effects of current air temperature TEMP and recent rain RRAIN are both nonlinear and more pronounced than during the morning hours. This is likely because individuals are more backward-looking in their choice of mode of transport for the return commute. Figure 5 plots the nonlinear effects for the main evening commuting hour of 17:00, including the seasonal component. Recent rain has a substantial negative impact on propensity to cycle, whereas current rain has a linear coefficient with a p-value of 0.421; again, suggesting individuals are more backward-looking in their decision- making process. Windspeed is still important, but is now linear with a negative coefficient and p-value of 0.000. Overall, while the impact of meteorology on bicycle commuting is complex, there are some main observations. Cyclists are deterred by maximum temperatures outside the comfortable region of 25–35oC, high windspeeds and rainfall. However, in the morning they anticipate the weather conditions for the rest of the day because once they have made the decision to cycle to work, they are more likely to return by this mode of transport. In the evening, they are more backwards-looking and make the decision to cycle partially based upon the weather conditions revealed during the day. —–Table 3 and Figures 4 and 5 about here—– 4.2 Cross-Price Elasticities Table 4 provides the estimated elasticities α̂P for all loops at all times of the day. Four loops stand out as having significantly positive elasticities during the high volume commuting periods. These are loops 3, 7, 8 and 9, which are all inner city loops and include two of the 16 three highest volume loops in the data. This suggests that higher petrol prices are associated with an increase in bicycle trips recorded at these loops, presumably as commuters substitute from driving to cycling. We note that loop 10 has negative elasticities during the commuting period. This loop is located on the beachfront 6.5km away from the CBD and is likely to have the highest proportion of recreational trips. These are likely to have quite different determinants of usage than the commuting trips that are predominantly counted by the other loops. To examine further the spatial variation in elasticities, Table 5 provides the estimated elasticities for the cumulative counts at different times of the day. There is some evidence that commuters from the less affluent northern suburbs (N) are switching into cycling as a mode of transport as petrol prices rise, particularly in the early morning. However, there is stronger evidence that this is occuring during both the morning and evening commuting peaks with cyclists from wealthier suburbs (SE). The largest spatial variation in elasticities is associated with the distance of the loop from the city centre, with positive elasticities at all hours of the day for inner city (IC) loops that are significant throughout commuting hours. Presumably this is because cycling is a more attractive alterative to driving for trips to and from inner city suburbs and the CBD. This is consistent with Xing et al. (2010) who find shorter trip distance is a key factor for higher rates of adoption of bicycle transportation in six small US cities. On a city-wide basis there is a meaningful petrol price effect with all elasticities being positive for the cumulative count ALL, which are significant during the commuting periods. To benchmark these results, we also estimate a log-linear model with the same mean struture and in the same manner, but without Fisher scoring. The estimated elasticities are also found in Table 5 and they coincide with those from the Poisson model. —–Tables 4 and 5 about here—– 17 4.3 Diagnostics To assess the fitted models we employ the standardised Pearson residuals ̂i = (yi − μ̂i)/ √ μ̂iφ̂ for i = 1, . . . ,n, where μ̂i and φ̂ are the point estimates from the best model M identified using mAIC. Following equation (3.3) the residuals should be approximately standard normal. Quantile plots suggest this to be the case and Table 6 contains p-values for the Kolmogorov-Smirnoff test of normality of the residuals for all cumulative counts at all times of day. The hypothesis that the residuals are standard normal can only be rejected at the 1% significance level in one of the 100 circumstances, suggesting the over-dispersed Poisson model is well-calibrated to the data. To quantify the proportion of variation explained we use the statistic R2full = 1 − n∑ i=1 ̂2i / n∑ i=1 ̂20i , where ̂20i are the standardised Pearson residuals in the null model with intercept only. To assess the relative influence of different components we calculate the R2 statistic using partial residuals on the numerator; that is, the residuals obtained from a fit when excluding specific components from ηi. The resulting full and partial R 2 statistics are shown in Figure 6 for the four cumulative counts where we exclude (i) trend and season; (ii) day type; (iii) petrol price; and (iv) meteorological components. The model explains around 80-90% of the variation in counts during peak hours, and 60-70% of variation during the midday period. The meteo- rological effects dominate, particularly during the midday period, followed by the seasonal and day type components. Petrol prices are the weakest of the four effects, and it would be difficult to determine its impact at an intraday level without controlling for the other effects. —–Table 6 and Figure 6 about here—– 18 5 Intraday Serial Dependence 5.1 Copula Model To account for intraday serial dependence in the counts we construct a multivariate model with m = 14 dimensions, each corresponding to an hourly count. We employ a Gaussian copula to capture dependence between the counts, and use the estimated hourly Poisson re- gression models as the marginal distributions. For hour j and observation i, an over-dispersed Poisson regression model with moments specified at equation (3.1) has a closed form prob- ability mass function Pr(Yij = yij) = fij(yij; μij,φj) given in Cameron & Trivedi (1986). 12 The joint distribution function of all 14 counts is F(yi1, . . . ,yim) = Φm(Φ −1 1 (ui1), . . . , Φ −1 m (uim); C) = Φm(y � i1, . . . ,y � im; C) . Here, Φm is the distribution function of an m-dimensional Gaussian distribution with zero mean and correlation matrix C, while Φ1 is the distribution function of a standard uni- variate normal. We stress here that adopting a Gaussian copula model does not mean the counts are normally distributed; instead they follow a multivariate discrete distribution where the matrix C captures the dependence between the margins. As discussed in Danaher & Smith (2011), because the marginal distributions are discrete, y�ij is only known up to the bounds Φ−11 (Fij(yij − 1; μij,φj)) < y�ij < Φ−11 (Fij(yij; μij,φj)) . (5.1) The bounds can be computed using the estimates of the marginal parameters μij,φj, count data yij and the marginal distribution functions Fij(yij; μij,φj) = ∑yij w=0 fij(w; μij,φj). 12Cameron & Trivedi (1986) define a specific Poisson-Gamma mixture model which they call ‘negative binomial type 1’ which has the moments of an over-dispersed Poisson model. This is not to be confused with the regular negative binomial model that Cameron & Trivedi call ‘negative binomial type 2’. 19 Estimation of a Gaussian copula model with discrete margins is known to be a challenging problem (Song 2000). Direct maximum likelihood is computationally infeasible because evaluating the log-likelihood function is an O(2m) operation, which is impractical for m = 14. However, Pitt, Chan & Kohn (2006) and Danaher & Smith (2011) outline Bayesian analyses based on MCMC sampling schemes that treat y� = {y�ij; i = 1, . . . ,n,j = 1, . . . ,m} as latent and generate their values explicitly, along with the correlation matrix C. Such an approach is widely called ‘Bayesian data augmentation’, and is popular for the estimation of multivariate probit models (Chib & Greenberg 1998). We summarise the method in Appendix B, but refer the reader to Danaher & Smith (2011) for a complete discussion. 5.2 Empirical Analysis We estimate the Gaussian copula for the cumulative count ALL in Table 1 using the Bayesian method discussed above. Because the copula model specifies the complete dis- tribution F , a wide range of output can be obtained from the fitted parametric distribu- tion to assess the level and impact of intraday serial dependence in the counts. Danaher & Smith (2011) outline how to compute pairwise Spearman’s correlations ρsj,k = 12E(uijuik)−3 between counts yij and yik, where the expectation is evaluated with respect to the posterior distribution of C and y�. Figure 7(a) plots the Spearman correlations between all pairwise combinations of the hourly counts. Positive dependence ρsj,j+1 can be seen between adjacent hours j and j + 1. This first order serial dependence is likely due to the omission from the model for μij of additional determinants of usage that are themselves serially dependent. Interestingly, there is also positive dependence between trips in the morning and evening peaks periods. For example, the correlation between the peak hours of 08:00 (j = 3) and 17:00 (j = 12) is ρs3,12 = 0.43. This is evidence of a ‘return trip’ effect, where individuals who have already commuted to work by bicycle in the morning are more likely than otherwise would be the case to also return by bicycle, rather than switch to another mode of transport. To illustrate the extent of this return trip effect, we compute the expected count during 20 the evening peak at 17:00, conditional upon that in the morning peak at 08:00. That is, E(y12,i|y3,i) = ∫ E(y12,i|y3,i,y�,C)π(y�,C|data)d(C,y�), where (y�,C) is integrated out with respect to the posterior distribution π(y�,C|data). 13 Figure 7(b) plots this for 1 and 2 April 2008, and on both days there is a strong positive relationship between morning and expected evening counts. The observed total daily counts for 1 and 2 April 2008 are plotted as vertical lines on Figure 7(d). Also plotted is the distribution of total daily counts ytotali = ∑14 j=1 yij from the fitted copula model for both days, and the actual observations are within the support of the distribution. On 1 April the total count was was high at 9649, but was much less on 2 April at a low of 4871. The reason is the advent of bad weather on 2 April, particularly with the strongest winds observed during the entire sample period. The average windspeed between 12:00 and 16:00 was 50.1km/h and was 38.4km/h during the entire 14 hours. Last, Figure 7(c) plots the posterior mean of the partial correlation matrix arising from C. These are the partial correlations of the latent variables y�i = (y � i1, . . . ,y � im) and again suggests strong first order intraday serial dependence. 6 Discussion This paper presents an empirical study of bicycle commuting in a major western urban environment that is typical of many of today’s car-dependent western cities. The data are unique in that they are both spatially disaggregate and, in particular, observed intraday. Previous studies are either based on survey data or highly aggregated with respect to time and/or location, which makes it hard to account for the impact of meteorology. This can result in the misnomer that weather conditions are a minor factor in the decision to cycle (Nankervis 1999; Hunt & Abraham 2007; Wardman et al. 2007), whereas our empirical work finds otherwise. 13This is evaluated in a Monte Carlo fashion by simulating 100 iterates from the copula model at each sweep of the sampling scheme, which is fast for a Gaussian copula; see Danaher & Smith (2011). 21 Our analysis confirms a strong diurnal variation in all aspects of the model, with sharp differences between the morning peak, evening peak and intermediate hours. We find that meteorological conditions are the strongest determinant of bicycle volumes throughout most of the day, followed by season and day type. The form of the effect is a complex mul- tivariate function of six key meteorological variables, which are represented here using a parsimonious penalized spline decomposition. There is evidence that morning commuters are more forward-looking with regards to weather conditions, and evening commuters more backward-looking. The extreme volatility in petrol prices between 2005 and 2008 also provides something akin to a natural experiment, allowing estimation of the cross-elasticities. In response to high petrol prices we find significant substitution into cycling as a mode of transport in the inner city, wealthy neighbourhoods and in the city overall, particularly during the peak commuting periods. As far as we are aware, the results in this study provide the first non-anecdotal evidence of such a phenomena in a major western urban environment. This relationship would be difficult to ascertain without controlling for season and weather conditions in a flexible fashion at an intraday resolution. The results suggest that policies aimed at encouraging bicycle commutes have a higher chance of success when the price of petrol is high, and in an inner city environment, where commuting trips are of shorter distance. Using a Gaussian copula the model is extended to a multivariate model for longitudinal count data. Unsurprisingly, there is strong serial dependence in the hourly counts. More interestingly, even though the data is aggregate with respect to individuals, the results sug- gest a return trip effect at the individual level. If someone commutes to work in the morning by bicycle, then the late afternoon commute is more likely to also be by bicycle than would otherwise be the case. By fitting a copula model a range of additional inference is also avail- able. This includes intraday forecasts of expected evening peak volumes, conditional upon those observed in the morning peak. Such forecasts have potential in transport management systems. 22 Last, we reiterate here that a two-stage estimator is used in Section 5. The first stage employs an approximate marginalized likelihood approach for estimating complex semipara- metric count data models for each hour. The second stage is a Bayesian method to estimate a Gaussian copula to provide a 14-dimensional multivariate count data model. Because no strong priors are adopted, estimates based on the posterior distribution are very similar to those based only on the likelihood function, the latter of which cannot be employed directly because it is computationally intractable. Our approach is therefore analogous to two-stage maximum likelihood, which is a popular approach for the estimation of copula models (Cheru- bini et al. 2004). Fully Bayesian estimation, such as that employed in Herriges et al. (2008) of a linear demand system using multivariate count data, is difficult because of the com- plexity of also undertaking multivariate smoothing. To achieve this, it would be necessary to embed Bayesian multivariate smoothing methodology, such as that suggested by Wong & Kohn (1996), Smith & Kohn (1997), Lang & Brezger (2004) or Panagiotelis & Smith (2008), within the discrete-margined Gaussian copula framework, which is difficult computationally. Acknowledgments The work of Michael Smith was partially supported by Australian Research Council grant DP0985505, while Göran Kauermann acknowledges support provided by the Deutsche Forschungs- gemeinschaft (DFG, project “Funktionale Modelle bei zeitstrukturierten Daten”). The au- thors would particularly like to thank VicRoads for providing the bicycle usage data. They would also like to thank Andrew John, Doug Dow and participants at a Melbourne Business School seminar for useful comments. 23 A Appendix In this appendix we provide further details on how to compute the marginal likelihood and estimate the GLMM in Section 3.3. We write the log-likelihood contributions li(θM) in equation (3.9) in terms of the deviance (see McCullagh & Nelder, 1989, p. 197), so that li(θM) = −d(yi,μi)/(2φ), where d(y,μ) = −2 ∫ μ y (y − u)/u du and μi is the mean in equa- tion (3.1). For a given model M, we denote the concatenated design matrix of the nonlinear terms from the reduced rank bases as BM, and the corresponding concatenated basis co- efficients as vM. The prior vM ∼ N(0,D−1M ), where DM is a block diagonal matrix with diagonal matrices λsDs, λjDj, j ∈ S, and λjlDjl, (j, l) ∈ B upon the leading diagonal. The logarithm of the marginal likelihood is lm(α,βM,λM; M) = log ∫ exp { − 1 2φ ∑ i d(yi,μi) } ψ(vM,D −1 M)dvM , (A.1) where ψ(x,V ) denotes the multivariate normal density with zero mean, variance matrix V evaluated at point x. The integral in equation (A.1) is not tractable analytically, which has led to the use of the Laplace approximation; see Breslow & Clayton (1993). Rue et al. (2009) show that this Laplace approximation works well when computing posterior inference, while Kauermann et al. (2009) demonstrate the effectiveness of the Laplace approximation in penalized spline smoothing. Applying the Laplace approximation yields lm(α,βM,λM; M) ≈ − 1 2 log |I + B′MM�BMD−1M | − 1 2φ ∑ i d(yi,μ � i ) − 1 2 v�M ′DMv � M . (A.2) Here, M� = diag(μ�1, . . . ,μ � n), μ � i = exp(z ′ iα + x̃ ′ iβM + b ′ M,iv � M), x̃i are the linear terms from the basis expansions, b′M,i is the ith row of BM and v � M is the maximizer of the last two components in equation (A.2). The deviance is approximated using the squared Pearson residuals (yi −μ�i )2/μ�i as suggested in Pierce & Schafer (1986), see also McCullagh & Nelder (1989, p. 197). Note that (yi − μ�i )/μ�i = ỹi − log(μ�i ), with ỹi the Fisher score defined in 24 equation (3.2). The Pearson residuals can be used to further approximate the (maximized) logarithm of the marginal likelihood (see Breslow & Clayton, 1993 or Wolfinger & O’Connell, 1993) as lm(α̂, β̂M,λM; M) ≈ − 1 2 log |VM| − 1 2 (ỹ − Zα̂ − Xβ̂M)′V −1M (ỹ − Zα̂ − Xβ̂M) , (A.3) where VM = (M�−1 +BMDMB′M), Z is a design matrix with ith row zi and X is the design matrix for the linear terms in the basis decomposition of functions s and m. Maximization of equation (A.3) with respect to λM provides the final maximized value of the marginal likelihood. B Appendix In this section we outline the data augmentation approach used to estimate the Gaussian copula. This involves constructing a Markov chain Monte Carlo sampling scheme to eval- uate the posterior distribution of the copula parameter matrix augmented with the latent variables, π(C,y�|data). We follow Danaher & Smith (2011) and employ the decomposition C = diag(Σ)−1/2 Σ diag(Σ)−1/2 , where Σ is a positive definite matrix, and diag(Σ) a diagonal matrix comprised of the leading diagonal of Σ. We further decompose Σ−1 = R′R, with R = {rij} being an upper triangular Cholesky factor with leading diagonal elements all ones. The sampling scheme is: Sampling Scheme Step 1. For j = 1, ...,m and i = 1, . . . ,n: Generate from y�ij|{y�\y�ij},C, data. Step 2. For i = 1, . . . ,m − 1, and j > i: Generate from rij|{R\rij},y� using random walk Metropolis-Hastings. Step 3. Compute C from R. 25 The distribution in Step 1 is a constrained normal distribution, and is given in Danaher & Smith (2011), along with the Metropolis-Hastings acceptance ratio for Step 2. The sampling scheme is run for many sweeps, and after convergence, a Monte Carlo sample from the posterior distribution is obtained. From this sample parameter estimates and other posterior inference can be constructed. 26 References Akaike, H., 1974. A new look at the statistical model identification. IEEE Transactions of Automatic Control, 19 (6), 716–723. Australian Bureau of Statistics, 2005. National Regional Profile (by Statistical Local Area). viewed 1 Sept. 2009, www.abs.gov.au. Austroads Inc., 2005. The Australian National Cycling Strategy 2005-2010. viewed 22 May 2009 at www.vicroads.vic.gov.au. Bonham, J. & Suh, J., 2008. Pedalling the city: intra-urban difference in cycling for the journey-to-work. Road & Transport Research, 17 (4), 25–40. Breslow, N.E. & Clayton, D.G., 1993. Approximate Inference in Generalized Linear Mixed Model. Journal of the American Statistical Association, 88 (421), 9–25. Bhat, C.R. & Eluru, N., 2009. A copula-based approach to accomodate residential self- selection in travel behavior modeling. Transportation Research Part B, 43 (7), 749–765. Cameron, A. & Trivedi, P., 1986. Econometrics Models Based on Count Data: Comparisons and Applications of Some Estimators and Tests. Journal of Applied Econometrics, 1 (1): 29–53. Cherubini, U., Luciano, E., and Vecchiato, W., 2004. Copula Methods in Finance, Chich- ester, England: Wiley. Cottet, R. & Smith, M., 2003. Bayesian Modeling and Forecasting of Intraday Electricity Load. Journal of the American Statistcal Association, 98 (464), 839–849. Chib, S. & Greenberg, E., 1998. Analysis of multivariate probit models. Biometrika, 85 (2), 347–361. 27 Clarke, H. & Hawkins, A., 2006. Economic Framework for Melbourne Traffic Planning. Agenda, 13 (1), 63–80. Danaher, P. & Smith, M., 2011. Modeling Multivariate Distributions using Copulas: Appli- cations in Marketing (with discussion). Marketing Science, 30 (1), 4-21. Eilers, P.H.C. & Marx, B.D., 1996. Flexible smoothing with B-splines and penalties. Statis- tical Science, 11 (2), 89–121. Fiebig, D., Bartels, R. & Aigner, D., 1991. A Random Coefficient Approach to the Estimation of Residential End-Use Load Profiles. Journal of Econometrics, 50 (3), 297-327. Ferdous, N., Eluru, N., Bhat, C.R. & Meloni, I., 2010. A multivariate ordered-response model system for adults’ weekday activity episode generation by activity purpose and social context. Transportation Research Part B, 44 (8-9), 922–943. Green, D.J. & Silverman, B.W., 1994. Nonparametric Regression and generalized linear models, London: Chapman & Hall. Harville, D.A., 1977. Maximum likelihood approaches to variance component estimation and to related problems. Journal of the American Statistcal Association, 72 (358), 320–338. Hastie, T., 1996. Pseudosplines. Journal of the Royal Statistical Society, Series B, 58 (2), 379-396. Heinen, E., van Wee, B. & Maat, K., 2010. Commuting by Bicycle: An Overiew of the Literature. Transport Reviews, 30 (1), 59–96. Herriges, J., Phaneuf, D. and Tobias, J., 2008. Estimating demand systems when outcomes are correlated counts. Journal of Econometrics, 147 (2), 282–298. Hunt, J. & Abraham, J., 2007. Influences on bicycle use. Transportation, 34 (4), 453-470. 28 Kauermann, G., Krivobokova, T., & Fahrmeir, L., 2009. Some asymptotic results on gener- alized penalized spline smoothing. Journal of the Royal Statistical Society, Series B, 71 (2), 487–503. Kauermann, G. & Opsomer, J.D., 2011. Data-driven Selection of the Spline Dimension in Penalized Spline Regression. Biometrika, 98 (1), 225–230. Lang, S. & Brezger, A., 2004. Bayesian P-Splines. Journal of Computational & Graphical Statistics, 13 (1), 183-212. Lavergne, C., Martinez, M.-J. & Trottier, C., 2008. Empirical Model Selection in Generalized Linear Mixed Effects Models. Computational Statistics, 23 (1), 99–110. McCullagh, P. & Nelder, J.A., 1989. Generalized Linear Models, 2nd edition, London: Chapman and Hall. Nankervis, M., 1999. The effect of weather and climate on bicycle commuting. Transporta- tion Research Part A, 33 (6), 417–431. O’Sullivan, F., 1988. Nonparametric estimation of relative risk using splines and cross- validation. SIAM Journal on Scientific Computing, 9 (3), 531–542. Panagiotelis, A. & Smith, M., 2008. Bayesian identification, selection and estimation of semiparametric functions in high-dimensional additive models. Journal of Econometrics, 143 (2), 291–316. Pierce, D.A. & Schafer, D.W., 1986. Residuals in generalized linear models. Journal of the American Statististical Association, 81 (396), 977–986. Pitt, M., Chan, D. & Kohn, R., 2006. Efficient Bayesian Inference for Gaussian Copula Regression Models. Biometrika, 93 (3), 537–554. Pucher, J., Dill, J. & Handy, S., 2010. Infrastructure, programs and policies to increase bicycling: An international review. Preventive Medicine, 50, Supplement 1, S106–S125. 29 Rietveld, P. & Daniel, V., 2004. Determinants of bicycle use: do municipal policies matter?. Transportation Research Part A, 38 (7), 531–550. Rue, H. Martino, S. & Chopin, N., 2009. Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. Journal of the Royal Statistical Society, Series B, 71 (2), 1–35. Ruppert, D., 2002. Selecting the number of knots for penalized splines. Journal of Compu- tational & Graphical Statistics, 11 (4), 735–757. Ruppert, D., Wand, M. & Carroll, R., 2009. Semiparametric regression during 2003-2007. Electronic Journal of Statistics, 3, 1193-1256. Ruppert, D., Wand, M. & R. Carroll, R., 2003. Semiparametric Regression, Cambridge: Cambridge University Press. Sener, I. N., Eluru, B. & Bhat, C.R., 2010. On jointly analyzing the physical activity participation levels of individuals in a family unit using a multivariate copula framework. Journal of Choice Modelling, 3 (3), 1–38. Shannon, T., Giles-Corti, B., Pikora, T., Bulsara, M., Shilton, T. & Bull, F., 2006. Active commuting in a university setting: Assessing commuting habits and potential for modal change. Transport Policy, 13 (3), 240–253. Shively, T. & Sager, T., 1999. A semiparametric regression approach to adjusting for me- teorological variables in air pollution trends. Environmental Science & Technology, 33 (21), 3873–3880. Silva, R., and Lopes, H., 2008. Copulas, Marginal Distributions and Model Selection: A Bayesian Note. Statistics and Computing, 18 (3), 313–320. Smith, M. & Kohn, R., 1997. A Bayesian approach to nonparametric bivariate regression. Journal of the American Statistical Association, 92 (440), 1522–1535. 30 Song, P. X.-K., 2000. Multivariate Dispersion Model Generated from Gaussian Copula. Scandinavian Journal of Statistics, 27 (2), 305–320. Stinson, M. & Bhat, C.R., 2004. Frequency of bicycle commuting: internet-based survey analysis. Transportation Research Record, 1878, 122–130. Vaida, F. & Blanchard, S., 2005. Conditional Akaike information for mixed effects models. Biometrika, 92 (2), 351–370. VicRoads, 2001. Cycling to Work in Melbourne 1976-2001. viewed on 22 May 2009 at www.vicroads.vic.gov.au. Wager, C., Vaida, F. & Kauermann, G., 2007. Model selection for p-spline smoothing using Akaike information criteria. Australian & New Zealand Journal of Statistics, 49 (2), 173–190. Wahba, G., 1978. Improper priors, spline smoothing and the problem of guarding against model errors in regression. Journal of the Royal Statistical Society, Series B, 40 (3), 364–372. Wahba, G. 1990. Spline Models for Observational Data. CBMS-NSF Regional Conference Series in Applied Mathematics, Vol. 59, Phildelphia, SIAM. Wardman, M., Tight, M. & Page, M., 2007. Factors influencing the propensity to cycle to work. Transportation Research Part A, 41 (4), 339–350. Wand, M. 2003. Smoothing and mixed models. Computational Statistics, 18, 223–249. Wecker, W. & Ansley, C., 1983. The Signal Extraction Approach to Nonlinear Regression and Spline Smoothing. Journal of the American Statistical Association, 78 (381), 81–89. Wedderburn, R.W.M., 1974. Quasi-likelihood functions, generalized linear models, and the Gauss-Newton method. Biometrika, 61 (3), 439–447. Wolfinger, R., & O’Connell, M., 1993. Generalized linear mixed models: A pseudo-likelihood approach. Journal of Statistical Computation & Simulation, 48 (3), 233–243. 31 Wong, C. & Kohn, R., 1996. A Bayesian approach to additive semiparametric regression. Journal of Econometrics, 74 (2), 209–235. Wood, S., 2003. Thin plate regression splines. Journal of the Royal Statistical Society, Series B, 65 (1), 95–114. Wood, S. 2006. Generalized Additive Models, London: Chapman & Hall. Xing, Y., Handy, S. and Mokhtarian, P., 2010. Factors associated with proportions and miles of bicycling for transportation and recreation in six small US cities. Transportation Research Part D, 15 (2), 73–81. 32 L o o p C h a ra ct er is ti cs A v er a g e H o u rl y C o u n ts T im e F ra m e a n d A v a il a b le D a ta D is ta n ce S u b u rb 0 6 :0 0 − 1 6 :0 0 − 0 6 :0 0 − N u m b er o f S ta rt E n d P er ce n t L o o p N o In d u ct io n L o o p N a m e to G P O L o ca ti o n 0 9 :0 0 1 9 :0 0 1 9 :0 0 W o rk d a y s D a te D a te M is si n g 1 S t. G eo rg es R o a d (S u m n er R o a d ) 5 .3 k m N o rt h 8 7 .8 7 8 7 .5 2 6 0 .7 9 5 5 0 1 D ec 0 5 1 7 J u n 0 8 1 1 .6 2 A n n iv er sa ry T ra il 1 9 .0 k m E a st 1 5 .5 9 1 5 .0 8 1 4 .6 9 5 8 4 1 2 D ec 0 5 1 7 J u n 0 8 5 .3 3 M a in Y a rr a T ra il (N o rt h B a n k ) 2 .8 k m E a st 2 0 9 .1 5 2 0 1 .4 6 1 4 1 .7 7 5 7 0 1 3 D ec 0 5 1 9 J u n 0 8 7 .6 4 M a in Y a rr a T ra il (S o u th B a n k ) 2 .9 k m S o u th 9 3 .8 4 9 0 .6 2 6 4 .6 1 5 6 4 1 2 D ec 0 5 1 9 J u n 0 8 8 .7 5 C a n n in g S t. (C a rl to n ) 2 .6 k m N o rt h 1 6 4 .4 6 1 5 7 .1 4 1 1 1 .1 0 5 8 7 1 D ec 0 5 1 9 J u n 0 8 6 .1 6 U p fi el d R a il w a y L in e 4 .1 k m N o rt h 5 9 .2 8 4 9 .7 5 4 0 .2 4 5 8 5 1 2 D ec 0 5 1 9 J u n 0 8 5 .3 7 C a p it a l C it y T ra il (P ri n ce s H il l) 3 .8 k m N o rt h 7 5 .5 7 6 3 .7 9 5 0 .5 7 5 7 8 1 D ec 0 5 1 9 J u n 0 8 7 .5 8 F o o ts cr a y R o a d P a th 1 .7 k m W es t 1 3 5 .9 4 1 4 6 .4 3 9 3 .9 2 5 4 9 1 2 D ec 0 5 2 0 J u n 0 8 1 1 .1 9 T ra m 1 0 9 T ra il 2 .3 k m S o u th 6 4 .4 0 5 8 .9 8 4 5 .0 8 5 8 1 1 2 D ec 0 5 1 8 J u n 0 8 5 .8 1 0 B a y T ra il (S t. K il d a ) 6 .5 k m S o u th 6 9 .2 9 7 3 .6 2 6 0 .1 8 5 8 7 1 D ec 0 5 1 8 J u n 0 8 6 .0 C u m u la ti v e C o u n ts o v e r M u lt ip le L o o p s N N o rt h (L o o p s 5 -7 ) 3 0 0 .0 2 7 1 .4 2 0 2 .3 5 7 6 2 1 D ec 0 5 1 9 J u n 0 8 5 .8 S E S o u th & E a st (L o o p s 2 -3 ,9 ,1 0 ) 3 5 8 .6 3 4 8 .7 2 6 1 .5 5 7 0 1 3 D ec 0 5 1 7 J u n 0 8 9 .1 IC In n er C it y (L o o p s 3 ,5 -7 ,9 ) 5 7 4 .3 5 3 2 .6 3 8 9 .7 5 5 6 2 3 D ec 0 5 1 8 J u n 0 8 8 .7 A L L A ll L o ca ti o n s (L o o p s 2 ,3 ,5 -7 ,9 ,1 0 ) 6 5 9 .0 6 2 0 .5 4 6 4 .2 5 5 4 2 3 D ec 0 5 1 7 J u n 0 8 9 .1 T a b le 1 : L o ca ti o n a n d sp a ti a l ch a ra ct er is ti cs o f in d u ct iv e lo o p co u n te rs , a v er a g e w o rk in g d a y h o u rl y co u n ts , a n d th e ti m e fr a m e a n d d a ta a v a il a b il it y fo r w o rk in g d a y s. S u m m a ri es a re g iv en fo r a ll te n lo o p s, a n d a ls o fo r cu m u la ti v e co u n ts o v er m u lt ip le lo o p s. T h e p er ce n ta g e o f m is si n g is re la ti v e to th e le n g th o f d a ta , a n d a n o b se rv a ti o n is co n si d er ed m is si n g fo r a cu m u la ti v e co u n t w h en o n e is m is si n g a t a n y co m p o n en t lo o p . 33 H o u r 6 7 8 9 1 0 1 1 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 E st im a te d T re n d C o e ffi c ie n ts fo r C u m u la ti v e C o u n ts S E 0 .0 9 2 * * 0 .0 8 6 * * 0 .0 8 1 * * 0 .0 3 1 * * 0 .0 0 8 -0 .0 2 -0 .0 5 * -0 .0 8 * * -0 .1 3 * * -0 .1 1 * * -0 .0 2 0 .0 9 3 * * 0 .0 1 9 -0 .0 2 N 0 .1 6 3 * * 0 .1 7 7 * * 0 .1 5 8 * * 0 .1 1 1 * * 0 .0 8 8 * * 0 .0 9 4 * * 0 .0 5 2 * * 0 .0 5 5 * * 0 .0 5 6 * * 0 .1 3 8 * * 0 .1 1 1 * * 0 .1 2 9 * * 0 .1 2 4 * * 0 .1 2 8 * * IC 0 .1 5 7 * * 0 .1 3 9 * * 0 .1 2 5 * * 0 .0 8 3 * * 0 .0 7 6 * * 0 .0 4 8 * 0 .0 1 8 -0 .0 2 -0 .0 5 0 .0 2 8 0 .0 4 4 * 0 .1 1 2 * * 0 .0 8 3 * * 0 .0 8 2 * * A L L 0 .1 1 8 * * 0 .1 1 7 * * 0 .1 2 0 * * 0 .0 6 9 * * 0 .0 4 4 * * 0 .0 1 9 -0 .0 1 -0 .0 3 * -0 .0 6 * * 0 .0 0 3 0 .0 3 1 * 0 .1 0 5 * * 0 .0 6 1 * * -0 .6 9 * E st im a te d O v e r- D is p e rs io n P a ra m e te r φ fo r A L L 3 .5 0 8 .2 0 9 .4 5 4 .5 3 7 .5 6 8 .9 1 8 .7 4 8 .6 1 1 1 .7 1 2 .0 1 2 .4 1 3 .6 1 1 .3 6 .2 4 T a b le 2 : E st im a te d li n ea r tr en d s (a n n u a li ze d ) fo r th e lo g a ri th m o f th e m ea n co u n t η i = lo g (E (y i) ) d efi n ed in S ec ti o n 3 .1 . V a lu es si g n ifi ca n t a t th e 5 % le v el a re d en o te d w it h ‘∗’ , w h il e th o se si g n ifi ca n t a ft er a B o n fe rr o n i a d ju st m en t fo r th e 1 4 h o u rs o f th e d a y a re d en o te d w it h ‘∗∗ ’. A ls o re p o rt ed a re th e es ti m a te s o f th e o v er -d is p er si o n p a ra m et er φ fo r th e cu m u la ti v e co u n t A L L . 34 Hour 6 7 8 9 10 11 12 13 14 15 16 17 18 19 Univariate Effects TEMP L L L L S S L L L L S S S S MTEMP S S S S S S S S S S L S S S RAIN S S S S S L S L L S L L L L RRAIN L L L L L S L L L L S S S S HMD L L L L L S S S L L L S L L WIND L S S S S S S S S S L L L S Bivariate Interaction Effects TEMP & WIND . . . . . . . . . . . . . B MTEMP & RAIN . . . . B . . . . . . . . . MTEMP & HMD . . . . . B . . . . . . . . MTEMP & WIND . . . B B B . . B B . . . . RAIN & WIND . . . . . . B . . . . . . . HMD & WIND . . . . . . B B . . . . . . Table 3: Meteorological effects selected for the cumulative count ALL. Univariate effects labelled with ‘L’ or ‘S’ were identified as linear or nonlinear, respectively. Bivariate interac- tion effects identified as zero are labelled with ‘.’, and non-zero with ‘B’. Omitted bivariate interactions were zero at all times of the day. 35 H o u r M o rn in g P ea k (0 6 :0 0 -0 9 :0 0 ) E v e n in g P ea k (1 6 :0 0 -1 9 :0 0 ) L o o p N o 6 7 8 9 1 0 1 1 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 1 0 .4 2 * 0 .0 7 0 .1 6 -0 .4 1 * 0 .0 9 0 .9 5 * * 0 .3 3 0 .1 1 0 .5 6 * 0 .8 1 * * 0 .8 1 * * 0 .3 * 0 .1 3 0 .0 9 2 0 .9 5 * 0 .2 1 -1 .1 7 * * -0 .2 9 0 .1 1 0 .1 5 0 .2 6 -0 .9 8 * -0 .4 9 -1 .3 * * -0 .1 9 0 .0 2 0 .5 3 -0 .5 3 3 0 .6 5 * * 0 .8 5 * * 0 .8 2 * * 0 .6 5 * * -0 .1 0 .5 9 0 .2 6 0 .5 0 .9 7 * 1 .2 2 * 1 .1 2 * * 0 .9 * * 0 .8 9 * * 0 .8 2 * * 4 -0 .4 7 * -0 .0 2 0 .3 2 * * -0 .2 7 0 .0 4 -0 .5 3 -0 .2 4 -0 .1 7 -0 .1 5 -0 .2 7 -0 .0 2 0 .0 3 0 .2 6 * 0 .3 9 * 5 0 .3 2 * 0 .0 7 -0 .0 2 0 .0 9 0 .1 0 .3 4 * 0 .3 8 * -0 .0 4 0 .4 6 * 0 .1 8 0 .3 8 * * 0 .0 9 0 .0 1 0 .0 8 6 0 .4 6 * 0 .0 5 -0 .1 7 0 .0 6 -0 .2 1 0 .2 6 0 .3 0 .4 -0 .1 5 -0 .4 7 * -0 .4 8 * 0 .1 2 0 .1 2 -0 .1 1 7 1 .2 5 * * 0 .3 9 * * 0 .2 4 * 0 .4 8 * * 0 .1 2 0 .1 8 0 .1 0 .2 3 0 .4 7 0 .1 3 0 .2 1 0 .0 4 -0 .4 2 * -0 .1 5 8 0 .5 1 * * 0 .5 6 * * 0 .1 6 0 .3 4 * 0 .1 0 .3 5 -0 .0 2 0 .4 4 0 .3 2 0 .9 3 * * 0 .4 * 0 .4 4 * * 0 .8 2 * * -0 .0 5 9 -0 .2 2 0 .5 6 * * 0 .2 1 * 0 .3 9 * 0 .5 2 0 .4 1 0 .5 8 * 0 .0 6 0 .5 1 * 0 .1 0 .8 7 * * 0 .5 6 * * 0 .4 4 * * 0 .4 7 1 0 -0 .8 * * -1 .2 4 * * -0 .6 8 * * -0 .1 5 0 .1 3 0 .3 4 0 .2 2 -0 .2 7 0 .2 2 0 .2 3 -0 .3 9 -0 .7 3 * * -0 .4 1 * -0 .7 6 * * T a b le 4 : E st im a te s o f α P , th e cr o ss -e la st ic it y w it h re sp ec t to p et ro l p ri ce s, u si n g th e P o is so n m o d el fo r a ll te n lo o p s a n d ti m es o f th e d a y. V a lu es si g n ifi ca n t a t th e 5 % le v el a re d en o te d w it h ‘∗’ , w h il e th o se si g n ifi ca n t a ft er a B o n fe rr o n i a d ju st m en t fo r th e 1 4 h o u rs o f th e d a y a re d en o te d w it h ‘∗∗ ’. 36 H o u r M o rn in g P ea k (0 6 :0 0 -0 9 :0 0 ) E v e n in g P ea k (1 6 :0 0 -1 9 :0 0 ) 6 7 8 9 1 0 1 1 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 C o u n t P o is so n M o d e l N 0 .6 5 * * 0 .1 6 0 .0 2 0 .1 9 * 0 .0 3 0 .2 5 0 .2 6 0 .1 4 0 .2 8 -0 .0 1 0 .1 5 0 .0 9 -0 .0 9 0 .0 1 S E 0 .1 1 0 .3 9 * * 0 .3 5 * * 0 .3 1 * 0 .0 4 0 .3 7 0 .3 9 0 .0 6 0 .4 6 0 .3 9 0 .6 6 * * 0 .5 * * 0 .5 4 * * 0 .1 9 IC 0 .4 8 * * 0 .5 5 * * 0 .3 1 * * 0 .2 9 * * 0 .0 8 0 .3 8 * 0 .3 8 * 0 .2 9 0 .5 9 * 0 .4 3 * 0 .6 2 * * 0 .4 9 * * 0 .3 9 * * 0 .3 4 * * A L L 0 .3 1 * 0 .3 3 * * 0 .2 1 * 0 .2 4 * 0 .0 4 0 .3 4 * 0 .2 8 0 .0 6 0 .4 * 0 .2 2 0 .4 5 * * 0 .3 8 * * 0 .3 * 0 .0 8 C o u n t L o g -L in ea r M o d e l N 0 .6 9 * * 0 .2 3 * 0 .0 7 0 .1 5 0 .0 4 0 .2 2 0 .3 4 * 0 .2 2 0 .3 2 * 0 .0 1 0 .1 9 0 .1 3 -0 .0 6 0 .0 2 S E 0 .1 3 0 .3 5 * * 0 .3 4 * * 0 .4 * * 0 .2 7 0 .4 1 0 .4 8 * 0 .0 9 0 .4 7 0 .2 9 0 .6 1 * * 0 .5 4 * * 0 .4 9 * * 0 .1 8 IC 0 .4 8 * * 0 .5 6 * * 0 .3 1 * * 0 .2 8 * * 0 .1 6 0 .3 3 * 0 .4 5 * 0 .3 9 * 0 .5 6 * * 0 .3 9 * 0 .5 9 * * 0 .4 9 * * 0 .3 6 * * 0 .3 2 * A L L 0 .2 6 * 0 .3 3 * * 0 .1 8 * 0 .2 5 * 0 .1 0 .3 6 * 0 .3 8 * 0 .1 6 0 .4 1 * 0 .2 3 0 .4 8 * * 0 .3 4 * * 0 .2 6 * 0 .1 4 T a b le 5 : E st im a te s o f th e cr o ss -e la st ic it y w it h re sp ec t to p et ro l p ri ce s u si n g th e P o is so n m o d el fo r th e fo u r cu m u la ti v e co u n ts . V a lu es si g n ifi ca n t a t th e 5 % le v el a re d en o te d w it h ‘∗’ , w h il e th o se si g n ifi ca n t a ft er a B o n fe rr o n i a d ju st m en t fo r th e 1 4 h o u rs o f th e d a y a re d en o te d w it h ‘∗∗ ’. 37 H o u r L o ca ti o n 6 7 8 9 1 0 1 1 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 N 0 .9 9 5 0 .7 8 9 0 .5 9 3 0 .8 7 4 0 .6 9 3 0 .8 7 4 0 .0 9 0 0 .9 8 9 0 .6 9 3 0 .8 3 4 0 .9 3 9 0 .1 2 0 0 .0 7 8 0 .7 4 2 S E 0 .6 2 9 0 .8 6 6 0 .9 7 6 0 .8 2 4 0 .1 3 0 0 .5 3 0 0 .1 9 5 0 .1 3 0 0 .1 5 0 0 .0 0 6 0 .0 1 9 0 .2 2 2 0 .5 3 0 0 .7 7 8 IC 0 .9 3 2 0 .6 2 6 0 .3 9 1 0 .8 6 4 0 .1 4 8 0 .6 7 7 0 .0 1 8 0 .4 3 4 0 .0 8 3 0 .0 2 2 0 .0 5 2 0 .0 7 1 0 .0 8 3 0 .9 0 1 A L L 0 .9 0 0 0 .5 7 3 0 .5 2 4 0 .8 2 0 0 .6 2 4 0 .9 0 0 0 .1 1 0 0 .6 7 4 0 .1 6 7 0 .2 1 7 0 .1 4 6 0 .1 4 6 0 .1 2 7 0 .8 6 2 T a b le 6 : T h e p -v a lu es fo r K o lm o g o ro v -S m ir n o ff te st o f n o rm a li ty fo r th e st a n d a ri ze d P ea rs o n re si d u a ls . 38 6 7 8 9 1 0 1 1 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 050010001500 H o u r counts H o u rl y C o u n ts , S p ri n g 6 7 8 9 1 0 1 1 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 050010001500 H o u r counts H o u rl y C o u n ts , S u m m e r 6 7 8 9 1 0 1 1 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 050010001500 H o u r counts H o u rl y C o u n ts , A u tu m n 6 7 8 9 1 0 1 1 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 050010001500 H o u r counts H o u rl y C o u n ts , W in te r Figure 1: Hourly city-wide count (ALL) broken down by both time of day and southern hemisphere season. 39 05:00 07:00 09:00 11:00 13:00 15:00 17:00 19:00 0 20 40 60 80 100 120 140 160 180 Time of Day H o u rl y V o lu m e Tues 21 Nov 06 Wed 21 Nov 07 Figure 2: Plot of hourly volumes at the St Georges Road inductive loop between 06:00 and 19:00. The blue line depicts the counts on 21 November 2006, while the red line depicts the counts on 21 November 2007. 40 ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ●● ● ●● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●●●● ● ● ● ● ● ●●● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ● ●● ●● ● ● ● ● ●● ● ● ● ● ● ● 0 4 0 0 0 8 0 0 0 co u n ts 2006 2007 2008 Total Bike Counts (weekdays) 1 .2 1 .4 1 .6 p ri ce in A U D 2006 2007 2008 Average Weekly ULP Price Figure 3: Total daily counts across all loops and average metro petrol prices in Melbourne between December 2005 and June 2008. 41 0 5 10 15 20 25 30 − 0 .5 0 .0 0 .5 TEMP (degree centigrade) m (T E M P ) (a) 10 15 20 25 30 35 40 − 0 .5 0 .0 0 .5 MTEMP (degree centigrade) m (M T E M P ) (b) 0.0 0.2 0.4 0.6 0.8 1.0 − 0 .5 0 .0 0 .5 RAIN (mm/hour) m (R A IN ) (c) 0.0 0.2 0.4 0.6 0.8 1.0 − 2 .0 − 1 .0 0 .0 0 .5 1 .0 RRAIN (mm/hour) m (R R A IN ) (d) 20 40 60 80 100 − 0 .5 0 .0 0 .5 HMD (%) m (H M D ) (e) 0 10 20 30 40 − 0 .5 0 .0 0 .5 WIND (km/h) m (W IN D ) (f) Figure 4: Estimated meteorological effects for the cumulative count ALL during the morning peak period. The estimates are for 06:00 (solid line), 07:00 (dashed lines) and 08:00 (dotted lines). The shaded areas are 95% confidence bands for the estimates at 06:00. 42 0 5 0 1 5 0 2 5 0 3 5 0 −0.4−0.20.00.20.4 d a y o f ye a r s(day of year) (a ) 1 0 1 5 2 0 2 5 3 0 3 5 4 0 −0.8−0.40.00.2 T E M P ( d e g re e s ce n tig ra d e ) m(TEMP) (b ) 1 0 1 5 2 0 2 5 3 0 3 5 4 0 −0.8−0.40.00.2 M T E M P ( d e g re e s ce n tig ra d e ) m(MTEMP) (c ) 0 .0 0 .5 1 .0 1 .5 −0.8−0.40.00.2 R R A IN ( m m /h ) m(RRAIN) (d ) 2 0 4 0 6 0 8 0 −0.8−0.40.00.2 H M D ( % ) m(HMD) (e ) Figure 5: Estimated nonlinear effects for the cumulative count ALL at 17:00, along with 95% confidence bands for the function estimates. The distribution of the independent variables are indicated on the horizontal axes. 43 6 8 10 12 14 16 18 0 .0 0 .2 0 .4 0 .6 0 .8 1 .0 Hour R .s q u a re Location: N season wday petrol meteorological 6 8 10 12 14 16 18 0 .0 0 .2 0 .4 0 .6 0 .8 1 .0 Hour R .s q u a re Location: SE season wday petrol meteorological 6 8 10 12 14 16 18 0 .0 0 .2 0 .4 0 .6 0 .8 1 .0 Hour R .s q u a re Location: IC season wday petrol meteorological 6 8 10 12 14 16 18 0 .0 0 .2 0 .4 0 .6 0 .8 1 .0 Hour R .s q u a re Location: ALL season wday petrol meteorological Figure 6: R2 plots for the four cumulative counts N, SE, IC and ALL against time of day. The statistic R2full is plotted as hollow circles, while the partial R 2 values are given as different line types for models excluding (i) seasonal component s (bold); (ii) day of week dummies (long dashed); (iii) meteorological effects m (dot-dashed); and, petrol price term αP (small dots). 44 Hourly Count H o u rl y C o u n t (c) 6 7 8 9 10 11 12 13 14 15 16 17 18 19 6 7 8 9 10 11 12 13 14 15 16 17 18 19 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 Hourly Count H o u rl y C o u n t (a) 6 7 8 9 10 11 12 13 14 15 16 17 18 19 6 7 8 9 10 11 12 13 14 15 16 17 18 19 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 0.55 4000 6000 8000 10000 1 2 3 4 5 6 7 x 10 −4 Total Daily Count P ro b a b ili ty (d) 1200 1400 1600 1800 700 800 900 1000 1100 1200 1300 1400 1500 1600 (b) E (Y 1 2 |Y 3 ) Y 3 2 April 2008 1 April 2008 2 April 2008 1 April 2008 Figure 7: Bayesian posterior inference from the fitted Gaussian copula model, with Poisson regression margins, for the 14 hourly observations of the cumulative count ALL. Panel (a): matrix of pairwise Spearman correlations ρsj,k. Panel (b): expected evening peak count at 17:00 (j = 12), conditional upon the morning peak count at 08:00 (j = 3). Panel (c): partial correlations from the copula parameter matrix C. Panel (d): distribution from the copula model of total daily counts, ytotali , on April 1 and 2, 2008. 45 Melbourne Business School From the SelectedWorks of Michael Stanley Smith 2011 Bicycle Commuting in Melbourne During the 2000s Energy Crisis: A Semiparametric Analysis of Intraday Volumes bike.dvi