key: cord-0065820-wkdmfuf9
authors: Balla, Bhavani Shankar; Sahu, Prasanta K.; Pani, Agnivesh
title: Are Freight Production Models Transferable between Urban and Suburban Areas? Guiding Model Transfer in Geographically Sprawling Indian Cities
date: 2021-07-15
journal: J
DOI: 10.1007/s40030-021-00556-7
sha: 4822d54bca627b61f88a3c5dbe0abe4f1fb619ba
doc_id: 65820
cord_uid: wkdmfuf9

Investigating the spatial transferability of freight generation (FG) models is an important research need, because it enables previously estimated model parameters to be used in new application contexts with or without local data. By understanding how to transfer models (and to what extent), planning agencies in large countries like India can save freight survey costs in regions where they lack institutional capacity and resources. Because most Indian cities are geographically sprawling, an important research question is whether models developed for urban areas can provide accurate estimates of freight activity in suburban areas, or vice versa. This paper addresses the question in two ways: (i) it compares the relative effectiveness of transfer depending upon the direction of transfer and (ii) it assesses which models can be transferred and which cannot. Data collected from seven cities are used for this study. A set of freight production (FP) models is developed from these data to understand the differential influence of geographical location and industry segment on the model coefficients. The estimated FP models show that suburban establishments exhibit significantly higher FP rates than other establishments. Subsequently, transferability direction and accuracy are determined using standard metrics such as transfer R², relative aggregate transfer error and transfer index. The transferability findings provide actionable insights into the development of FP models in regions with data constraints, which is of great value in an era of declining budgets for travel surveys.


Keywords: Spatial transferability · Model updating · Freight production · Regression · Establishment-based freight survey · Transfer R²

Large-scale establishment-based freight data are crucial for forecasting freight demand. However, developing countries like India lack an established commodity flow survey (CFS) practice despite the presence of multiple megacities and the pressing freight transport planning needs of logistics-driven economies. Freight demand forecasting models are thus of national and regional importance for planning facilities that give consumers access to a wide variety of goods and services without traveling long distances [1, 2]. In particular, solid estimates of freight activity are critically important for understanding why, how and where freight activity takes place and what kind of infrastructure and policies are needed to respond effectively to the growing logistical requirements of businesses and households [1, 3]. Nonetheless, the freight model system is data intensive, and planning agencies allocate a significant amount of resources, in terms of cost and time, to data collection programs [4, 5, 6].

Spatial transferability of models could be one way for metropolitan planning organizations (MPOs) to reduce the significant share of financial resources committed to data collection programs for developing freight modeling systems [7]. Although investigations of model transferability are not new, research on it is limited to a handful of freight demand studies [7-10]. Some notable studies address the transferability of passenger trip generation and mode choice models [8-11]. Estimated model parameters can be used in an application context together with a local dataset of relatively small sample size [11], which allows MPOs to plan facilities within time and budget constraints. These savings in cost and time are beneficial for a freight demand modeling system, which requires significant data from an establishment-based freight survey (EBFS) or CFS. In the absence of a CFS, and given that freight data sources are inaccessible owing to ownership and privacy concerns in India and several other South Asian countries, an EBFS is a practical option for obtaining data on freight activities [12]. However, because of rigorous surveyor training, low complete-response rates and similar factors, an EBFS tends to overrun its schedule and incurs a higher unit cost per observation to reach the sample size required to represent the population [12, 13]. These issues highlight the importance of investigating freight demand model transferability in an era where model estimation and diagnostic studies are evolving [1, 4, 14-16].

In the last two decades, freight movements in India have increased considerably due to rapid urbanization. The nationwide freight movement (FM) was 2000 BTKM (billion ton-kilometers) during 2011-12 and is estimated to grow at an annual rate of 9.7% until 2031-32, which will lead FM to surpass 13,000 BTKM. Over the same period, overall passenger traffic is projected to rise by approximately 15% annually, from 10,375 BPKM (billion passenger-kilometers) in 2011-12 to 168,875 BPKM in 2031-32 [17]. The increase in truck activity often conflicts with commuter traffic, resulting in higher travel times and congestion on the urban road network. This conflict causes longer queues, network disruptions, environmental damage, poor air quality and degradation of the overall quality of life in urban areas.

As per the 2011 census, urban areas account for 31% of India's population, and cities contribute 63% of the nation's gross domestic product (GDP). Due to rapid urbanization, the urban population is predicted to increase to 40% by 2030 and to contribute 75% of the national GDP [18]. With the launch of the 'Smart City Mission' [19] to improve the livability index, economic activity is expected to accelerate, with a subsequent increase in truck trips. The Government of India has also launched the 'Make in India' program to improve the ease of doing business index on a global level. This program influenced the manufacturing sector, which recorded growth of 7.9% in gross value added (GVA) during 2016-17; GVA growth was 5.9% in 2017-18 and was estimated to be 8.1% in 2018-19 [20]. The government's flexible approach in several policies for small and medium scale enterprises (SMEs) has been attracting investors from the local to the national level to develop new industries in peri-urban and suburban regions, as land is scarce within the urban regions. The economic expansion programs initiated by the Government of India will boost the economy; however, they will also increase trucking activity at both the firm and network level. Events like the ongoing COVID-19 pandemic have already added truck-kilometers in all segments (first mile, long haul and last mile) of freight travel. The freight movement was reported to be 65.4 billion tons in April 2020, when a nationwide lockdown was declared by the Government of India, and increased to 110.110 billion tons in November 2020 [21]. The behavioral shift in consumer purchase preferences will keep adding truck-kilometers, and more trucking will eventually intensify its negative externalities, such as congestion, noise and air pollution, in urban and suburban regions.
To mitigate these external effects and augment the existing infrastructure, planning bodies require more representative freight demand models in both urban and suburban contexts. However, the absence of a CFS or EBFS in India makes it burdensome for planners and policy makers to develop a model system at the national level. Also, because the model system is data intensive, it is often difficult to allocate financial resources for a data collection program within budgetary constraints.

A study is needed to investigate the spatial transferability of freight generation models from one region to another, in order to understand the degree of transferability between regions in the same or different states. By understanding the extent of transferability, planning agencies in developing countries like India can reduce the costs of freight surveys in regions where they lack institutional capacity and resources. Due to the geographically sprawling nature of Indian cities, an important research question is the extent of transferability when an estimated urban model is transferred to a suburban region to estimate freight activity, or vice versa. This paper addresses the question in two ways: (i) it compares the relative effectiveness of transfer depending upon the direction of transfer and (ii) it assesses which models can be transferred and which cannot. We developed models in an estimation context, separately for urban and suburban regions, using data from 432 establishments collected through the EBFS in seven cities in Kerala, a southern coastal state in India. An 'a posteriori' segment classification scheme from previous research [12] was used to estimate the classified models. The estimated models were assessed for transferability to determine the model transfer direction and accuracy levels. The next section discusses the most relevant literature, followed by a brief outline of the current research.

The modeling of freight generation involves two concepts: freight generation (FG) and freight trip generation (FTG) [22]. FG focuses on the weight of goods, and FTG describes the number of truck trips. An overview of previous studies reveals the use of various explanatory variables to explain FG/FTG, such as gross floor area [3, 23], number of employees [3, 11, 15], number of clients [24], sales [25] and type of commodity [26]. Beagan et al. [27] developed several FG and FTG models based on employment, area of the building/establishment and type of land use, using freight-related data collected in the USA. Sánchez-Díaz [3] developed a set of employment-based and area-based FG and FTG models using data collected from establishments in the City of Gothenburg, Sweden. Alho & de Abreu e Silva [16] used data collected from 604 retail establishments in the City of Lisbon, Portugal to develop a set of disaggregate FG models using different modeling approaches.

In the context of developing economies, Pani et al. [15] developed city-specific disaggregate FG models and validated the explanatory power of years in business in addition to the traditional business variables, gross floor area (GFA) and number of employees (NE). Sahu & Pani [23] developed two sets of disaggregate FG models: (a) 52 practice-oriented FG models using the ordinary least squares (OLS) regression technique and (b) multiple classification analysis tables and nomograms, and examined geographical disparities in FG at a regional level. The disaggregate FG models developed in these studies focus on forecasting FG at a regional level, and such models are useful for planning inter-city freight transportation. Thus, there is a concerning lack of focus on the extent to which the location of an establishment, i.e., whether it is in an urban or suburban area, acts as a modifier of the relationship between FG and establishment characteristics (as described by business variables) in the context of developing economies like India.

In an age of constrained survey resources and ever-growing demand for disaggregated travel information, examining the spatial transferability of models has become a pillar of travel demand analysis [28]. An overview of previous studies [29-36] shows that various measures can be used to assess the transferability of estimated models. For instance, Atherton & Ben-Akiva [37] used a transferability test statistic to assess the transferability of a work-trip modal-split model. To enhance transferability, they also used a Bayesian updating process. Building on their study, updating techniques such as the transfer-scale approach and the combined transfer estimator were developed to improve the prediction accuracy of transferred models [38]. Koppelman & Wilmot [39] used several measures, such as transfer index (TI), transfer ρ², root mean square error (RMSE), aggregate prediction statistic and relative aggregate transfer error (RATE), to assess the transferability of disaggregate choice models.

The transfer ρ² is used to test the goodness of fit of the estimation context model when it is transferred to the application context [40]. To assess the transferability of an ordered response model, Agyemang-Duah & Hall [10] employed weighted root mean square error (WRMSE), transfer pseudo-R², aggregate prediction statistic and root of sum of residual error. They further updated the parameters of the estimated model using a scaling updating technique. McArthur et al. [41] used the relative number of wrong predictions and standardized root mean square error to analyze the transferability of parameters in gravity models.

Transfer R² was used by Wilmot [42] to assess the extent of transferability of linear regression models. The key results obtained from some of the past studies are summarized in Table 1. From these studies, it can be inferred that sample size has a significant impact on model transferability: models estimated from small samples have low TI values.

In the US, transferability was appraised for different locales, and the extent of transferability was found to be greater between regions within a state than across states [9]. A spatial analysis of shopping trip generation in the Toronto metropolitan area showed that a directly transferred ordered response model performed well in predicting the application context [10]. Some key past studies [8, 47, 48] suggested that models can be transferred between regions with contextual similarities in terms of socio-demographic factors such as household income, auto-ownership, household structure, key employment types/industries, employment pattern, commuting pattern and degree of urbanization [7, 49]. The Institute of Transportation Engineers manual [50] uses suburban passenger trip data as the estimation context and transfers the trip generation model to urban areas as the application context for predicting trips. In one of the very few freight studies, freight trip generation models were assessed for transferability using RMSE as a statistical measure [11]. The authors also used a synthetic correction technique to improve transferability. However, previous research efforts on FG models have neither estimated nor assessed transferability by exploring the urban-suburban divide in FG by commercial establishments.

The present research first develops separate sets of disaggregate freight production (FP) models for urban and suburban establishments and extends the evolving body of literature on freight demand modeling for developing economies. These models are estimated by the OLS regression technique using two business size variables, number of employees (NE) and gross floor area (GFA), as predictors. Model estimation and transferability assessment were done for two cases: (i) all urban and suburban establishments and (ii) 'a posteriori' segmentation [12] of urban and suburban establishments. Single-variable models were estimated without intercept [23] for both cases. The transferability of the FP models was first assessed using the naïve transfer approach, and then for the FP models updated using combined transfer estimation. The metrics used for the transferability assessment of naïve and updated models were (i) transfer R² (TR²), (ii) transfer index (TI), (iii) weighted root mean square error (WRMSE) and (iv) relative aggregate transfer error (RATE).

A total of 54,170 establishments exist across the seven cities, namely Cochin, Calicut, Malappuram, Kannur, Palakkad, Thrissur and Kottayam, based on the economic census [51]. For the EBFS, the minimum required sample size [52] for the study area is found to be 382, assuming a confidence level of 95%, a response distribution of 50% and a margin of error of 5%. The EBFS was conducted through face-to-face interviews with shippers (manufacturing units, assembling companies and raw material production sites) in Kerala, and a total of 432 completed responses were obtained from several industries, containing information on the weekly FP, NE and GFA of each establishment. The reliability of the data was cross-checked item by item using publicly accessible information from various sources such as Commercial Tax Departments and local search websites. Measurement tools in Google Earth were used to verify the gross floor area of each establishment. NE and GFA were considered to model 'Weekly FP.' NE includes the total number of skilled and unskilled employees. GFA is the total covered area inside the building envelope of the establishment. The industrial sectors included in the survey were classified into multiple classes defined by the International Standard Industrial Classification (ISIC) system. We further used a posteriori 
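The minimum sample size quoted above can be reproduced with a short calculation. The sketch below assumes Cochran's formula with a finite population correction; the source only cites [52] for the method, so the formula choice and the function name `required_sample_size` are assumptions made for illustration.

```python
import math


def required_sample_size(population, z=1.96, p=0.5, margin=0.05):
    """Minimum sample size, assuming Cochran's formula with finite population correction.

    z      : z-score for the confidence level (1.96 for 95%)
    p      : assumed response distribution (0.5 is the most conservative choice)
    margin : margin of error
    """
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Correct for the finite number of establishments in the study area.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


if __name__ == "__main__":
    # 54,170 establishments across the seven cities, as reported in the text.
    print(required_sample_size(54170))  # 382
```

Under these assumptions the result matches the 382 reported in the text, and the 432 completed responses exceed it.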

The establishments are classified as urban or suburban based on their geographical location. In this study, the urban area follows the definition used by the Government of India for the 2011 census. According to the Census of India 2011, urban areas are places governed by a municipal corporation, cantonment board or notified town area committee. Furthermore, places with a minimum population of 5000, a population density of more than 400 persons per km², and more than 75 percent of the men engaged in non-agricultural activities are regarded as urban areas [53]. The establishments in the urban area are grouped as urban establishments, and the remainder are grouped as suburban establishments. Of the 432 establishments, 327 are urban and 105 are suburban.
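The classification rule described above can be written as a small predicate. This is only an illustrative sketch of the census criteria as stated in the text; the function name and parameter names are hypothetical, and the study itself assigned establishments from their mapped locations.

```python
def is_urban(statutory_body, population, density_per_km2, male_nonagri_share):
    """Return True if a place counts as urban under the criteria quoted in the text.

    statutory_body     : True if governed by a municipal corporation, cantonment
                         board or notified town area committee
    population         : total population of the place
    density_per_km2    : persons per square kilometre
    male_nonagri_share : fraction (0 to 1) of men engaged in non-agricultural work
    """
    if statutory_body:
        return True
    # Otherwise all three demographic conditions must hold together.
    return (
        population >= 5000
        and density_per_km2 > 400
        and male_nonagri_share > 0.75
    )


def classify_establishment(place_is_urban):
    """Establishments outside urban places are grouped as suburban."""
    return "urban" if place_is_urban else "suburban"
```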

The descriptive statistics of FP and the establishment characteristics (GFA and NE) for both urban and suburban areas are presented in Table 2. Mean FP values differ between the urban and suburban regions: the mean FP of suburban establishments is higher than that of urban establishments. The means of GFA and NE are also higher for suburban establishments. Among the classified industries, the highest average GFA (900.1 m²) is found among urban establishments and the highest average NE (35) among suburban establishments; both belong to segment 2. The highest average FP, in contrast, is observed in segment 1 of the suburban establishments. Pearson correlation coefficients between FP and the explanatory variables (GFA and NE) vary between 0.767 and 0.849, which indicates a strong linear relationship for both urban and suburban establishments. The scatter plots of weekly FP against GFA and NE for urban and suburban establishments, shown in Fig. 1, likewise exhibit a very strong positive correlation. These findings suggest that a linear model is appropriate for estimating weekly FP.

Linear regression is the most commonly employed statistical technique for developing predictive models that quantify FG [1, 3, 15, 23]. Practitioners prefer regression-based models for their ability to describe the causal relationship and for their statistical robustness [54]. In this study, the FP models are estimated using the OLS regression technique with a single business size variable (GFA or NE). For conceptual reasons, the models are developed without an intercept: no economic activity is expected when employment or gross floor area is zero [23]. One of the largest investigations of FG models is in agreement with this practice [1]. The model structure used in this study is given in Eq. (1).

$$FP_{X_i} = \beta_i X_i + \varepsilon_{X_i} \qquad (1)$$

where $FP_{X_i}$ = freight production, in tons/week, by an establishment, predicted using business size variable $X_i$, $\forall i$; $i$ = 1 and 2 (GFA and NE); $\beta_i$ = OLS estimator for the slope of the regression line; $\varepsilon_{X_i}$ = stochastic error term such that $E(\varepsilon_{X_i}) = 0$.
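For a single-variable model without intercept, the OLS slope has the closed form Σxy / Σx². The sketch below illustrates that estimate and its standard error in plain Python; the function name and the sample numbers are made up for illustration and are not taken from the survey data.

```python
import math


def fit_through_origin(x, y):
    """OLS fit of y = beta * x (no intercept).

    Returns (beta, se_beta), where se_beta is the standard error of the slope
    computed with n - 1 degrees of freedom (one estimated parameter).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    beta = sxy / sxx
    # Residual variance around the fitted line through the origin.
    sse = sum((yi - beta * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = sse / (len(x) - 1)
    se_beta = math.sqrt(sigma2 / sxx)
    return beta, se_beta


if __name__ == "__main__":
    # Hypothetical establishments: GFA in units of 100 m^2, weekly FP in tons.
    gfa = [2.0, 5.0, 8.0, 12.0]
    weekly_fp = [4.1, 9.2, 15.5, 22.0]
    slope, se = fit_through_origin(gfa, weekly_fp)
    print(f"FP rate: {slope:.2f} tons per 100 m^2 (s.e. {se:.3f})")
```

Read this way, the slope is directly the FP rate per unit of business size, which is how the rates in the following paragraphs are reported.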

The model estimation results are given in Tables 2 and 3. These tables report the sample size (n), coefficient of determination (R²), standard error (S.E), standard deviation (S.D) and t-statistics. All the developed FP models of urban and suburban establishments are significant at the 99.9% confidence level, and the R² values lie between 0.588 and 0.721. The FP models for the three industrial segments likewise show that all parameters are significant at the 99.9% confidence level, with goodness of fit (R²) varying between 0.656 and 0.907.

The FP rates of the urban and suburban establishments can be read from the FP models presented in Table 3. Area-based FP rates for the urban and suburban establishments are 1.87 tons/100 m² and 2.36 tons/100 m², respectively. The corresponding employment-based FP rates are 0.45 tons/employee and 0.61 tons/employee. From these FP rates, it can be inferred that weekly FP rates in the suburban regions are higher than those in the urban regions. The greater availability of affordable land and the employment of more people at lower wages may be a logical explanation for the higher FP rates in these regions. This explanation is consistent with the theory of production functions in neoclassical economics, which describes the output produced (quantity of products) as dependent on a set of inputs (land, employment, capital) used in production.

In Table 4 , FP models for different industrial segments are presented. Area-based FP rates for the urban establishments range from 1.31 tons/100 m 2 (segment 3) and 3.52 tons/100 m 2 (segment 1). Whereas employment-based FP rates in these establishments vary from 0.32 tons/employee (segment 3) to 1.80 tons/employee (segment 1). The FP rates for the suburban establishments range from 1.53 tons/100 m 2 to 4.11 tons/100 m 2 and 0.4 tons/employee to 1.87 tons/employee for area-based and employment-based FP rates, respectively. When the industrial establishments are compared irrespective of their geographical location, the higher FP rates are noticed in the low value density industries (segment 1), which include ISIC 16: Wood, wood products, furniture and fixtures and ISIC 24-25: Basic metal, alloy, metal products. The possible reason behind the higher FP rates is that these industries save more money on overhead charges (handling, packing and transportation of freight), and the investment of these savings in production leads to more output (FP rates). However, from the developed models, it can be concluded that FP rates of the suburban establishments are more in the area-based and employment-based models on comparison with the urban FP models. Also, on comparing the models' predictive ability based on the standard error, it is evident that the urban models are more suitable for predicting FP than that of suburban models.

Transfer R² (TR²), transfer index (TI), weighted root mean square error (WRMSE) and relative aggregate transfer error (RATE) are the statistical metrics used for transferability assessment. The first metric, TR², measures the proportion of variation of the dependent variable in the application context data that is captured by the transferred model [39, 55]. The maximum possible value of TR² is 1, which indicates that the estimation context model (transferred model) is entirely transferable. A value of zero indicates that the transferred model has no explanatory power in the application context. A negative value of TR² indicates that the results from the transferred model are inferior to simply using the mean of the application context data [42]. The second metric, TI, defined as TR² divided by R², measures how well a transferred model performs on the application context data relative to the application context model, and it typically varies between 0 and 1 [42, 55]. Like TR², a TI value of 1 indicates perfect transferability, and a value of zero means that the transferred model describes nothing of the application context. A negative value of TI is undesirable, as it indicates misleading outcomes [9, 42]. A TI greater than 1 indicates that the transferred model is superior to the local model [9]. The third metric, weighted RMSE (WRMSE), is used as an index of the relative error of the model that is transferred from the estimation context to the application context, evaluated on local data (application context data) [39]. The fourth metric, RATE, is the ratio between the transfer WRMSE and the local WRMSE [39]. The equations of these metrics are as follows:
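Written out from the verbal definitions above, the four metrics can be sketched as below. TR², TI and RATE follow directly from the text; the prediction-weighted form of WRMSE is an assumption based on the relative error measure REM and on the usage in Koppelman & Wilmot [39], and the original typeset equations may be arranged differently.

```latex
\begin{align}
TR^2 &= 1 - \frac{\sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^2}{\sum_{i=1}^{n} \left(Y_i - \bar{Y}\right)^2} \\
TI &= \frac{TR^2}{R^2} \\
WRMSE &= \sqrt{\frac{\sum_{i=1}^{n} \hat{Y}_i \, REM_i^2}{\sum_{i=1}^{n} \hat{Y}_i}},
\qquad REM_i = \frac{\hat{Y}_i - Y_i}{\hat{Y}_i} \\
RATE &= \frac{WRMSE_t}{WRMSE_a}
\end{align}
```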

where $\hat{Y}_i$ = predicted values of the dependent variable produced by the transferred model (estimation context) operating on the independent variable values in the application context data; $\bar{Y}$ = mean of the dependent variable values in the application context data; $Y_i$ = observed dependent variable values in the application context data; $R^2$ = coefficient of determination of the linear regression model fitted to the application context data; $WRMSE_t$ and $WRMSE_a$ = WRMSE of the transferred model and of the application context model, respectively, both calculated using application context data; relative error measurement, $REM_i = (\hat{Y}_i - Y_i)/\hat{Y}_i$; $n$ = number of observations in the linear regression model of the application context.
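A minimal sketch of how these four metrics could be computed for a naïve transfer is given below. The function names are illustrative, and the WRMSE weighting (each squared relative error weighted by the predicted value) is an assumption consistent with the REM definition above rather than a confirmed transcription of the paper's equation.

```python
import math


def transfer_r2(y_obs, y_pred):
    """TR^2: 1 - SSE of the transferred model / total variation around the local mean."""
    mean_y = sum(y_obs) / len(y_obs)
    sse = sum((y - yh) ** 2 for y, yh in zip(y_obs, y_pred))
    sst = sum((y - mean_y) ** 2 for y in y_obs)
    return 1.0 - sse / sst


def transfer_index(tr2, local_r2):
    """TI: transfer R^2 relative to the R^2 of the locally estimated model."""
    return tr2 / local_r2


def wrmse(y_obs, y_pred):
    """Prediction-weighted RMSE of relative errors (assumed form, see lead-in)."""
    num = sum(yh * ((yh - y) / yh) ** 2 for y, yh in zip(y_obs, y_pred))
    return math.sqrt(num / sum(y_pred))


def rate(y_obs, y_pred_transferred, y_pred_local):
    """RATE: WRMSE of the transferred model over WRMSE of the local model."""
    return wrmse(y_obs, y_pred_transferred) / wrmse(y_obs, y_pred_local)


def naive_transfer_predictions(beta_estimation, x_application):
    """Apply an estimation-context slope unchanged to application-context data."""
    return [beta_estimation * x for x in x_application]
```

All four functions take application context observations as `y_obs`; only the source of the predictions changes between the transferred and the local model.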

There are various methods for updating a transferred model. In this study, the combined transfer estimation technique is used to update the coefficients. This approach allows the transfer bias ($\delta = \beta_t - \beta_a$) to be non-zero [38, 56], which makes combined transfer estimation a reliable approach.
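For a model with a single coefficient, the combined transfer estimator cited above [38, 56] reduces to a precision-weighted average in which the variance of the transferred coefficient is inflated by the squared transfer bias. The scalar form below is a sketch under the assumption that the two dispersion terms are the standard errors of the respective slope estimates; the paper's own displayed equation may be written differently.

```latex
\beta' =
\frac{\dfrac{\beta_t}{\sigma_t^2 + \delta^2} + \dfrac{\beta_a}{\sigma_a^2}}
     {\dfrac{1}{\sigma_t^2 + \delta^2} + \dfrac{1}{\sigma_a^2}},
\qquad \delta = \beta_t - \beta_a
```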

where $\beta'$ is the updated model parameter; $\beta_t$ and $\beta_a$ are the parameters of the estimation context model and the application context model, respectively; $\sigma_t$ and $\sigma_a$ are the corresponding standard deviations in the estimation and application contexts; and $\delta$ is the difference between the parameters of the estimation context model and the application context model.
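The updating step can be sketched in a few lines for the single-coefficient case. The weighting below assumes the scalar combined transfer estimator, in which the estimation-context variance is inflated by the squared transfer bias; the function name is hypothetical, and the standard errors in the usage example are invented for illustration (only the two slopes, 1.87 and 2.36 tons/100 m², come from the text).

```python
def combined_transfer_estimate(beta_t, se_t, beta_a, se_a):
    """Scalar combined transfer estimate (assumed form).

    beta_t, se_t : slope and its standard error from the estimation context
    beta_a, se_a : slope and its standard error from the application context
    """
    delta = beta_t - beta_a                      # transfer bias
    w_t = 1.0 / (se_t ** 2 + delta ** 2)         # weight of the transferred slope
    w_a = 1.0 / (se_a ** 2)                      # weight of the local slope
    return (w_t * beta_t + w_a * beta_a) / (w_t + w_a)


if __name__ == "__main__":
    # Urban slope transferred to the suburban context; standard errors are hypothetical.
    updated = combined_transfer_estimate(beta_t=1.87, se_t=0.10, beta_a=2.36, se_a=0.20)
    print(f"updated area-based FP rate: {updated:.3f} tons per 100 m^2")
```

Because the weights are positive, the updated coefficient always lies between the two input slopes, and a larger transfer bias pulls it toward the local estimate.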

The FP models of urban and suburban establishments within Kerala are assessed for transferability. The assessment results for naïve and updated FP models with GFA and NE as explanatory variables are presented in Tables 4 and 5, respectively. In these tables, the naïve and updated models are compared to assess how much parameter updating improves transferability. The results suggest that the updated models are better than the naïve models in all cases, in accordance with previous studies [9]. Because the updated models perform better, most of the discussion in this study concerns them.

When transferability is assessed from urban to suburban, most of the area-based urban FP models can be transferred to the suburban context. Similar observations hold when employment-based urban FP models are transferred to the suburban context. Only in the case of 'segment 1' is the urban model not transferable to the suburban context. TR² and TI for segment 1 are negative, suggesting that the transferred models perform worse than locally estimated models. Such a negative value arises when the transferred model predicts behavior contrary to that observed [42]. When the reverse assessment is done (suburban to urban), the TI values of the updated models range from -1.245 to 0.053 (area-based and employment-based together). This shows that even though the urban models are transferable to the suburban context, the suburban models are not transferable to the urban context. Transferability is therefore not symmetric between two regions within the state, a finding consistent with a previous study [57]. It is also noticed that the area-based models are more transferable than the employment-based models. A possible reason is that, compared to area, employment represents freight activity less well, owing to variations in employment among industries and the replacement of manpower with automation. In addition, some industrial segment models are not transferable. A possible reason is that the sample size used in developing these models is too small; small data sets give poor transferability results [46].

In this study, WRMSE and RATE are used to assess the aggregate-level prediction of the transferred model. After the model coefficients are updated, the RATE values improve and generally come closer to 1. A RATE value close to 1 suggests that the aggregate prediction error of the transferred model on local data equals the error of the locally estimated model. When area-based updated urban models are transferred to the suburban context, RATE values range between 1.153 and 1.379. These values indicate that the prediction error of the area-based urban models in the suburban context lies between 15.3% and 37.9%. When the employment-based models are transferred from urban to suburban, the error ranges between 3.7% and 28.2%. From the transferability assessment results, it is evident that the area-based urban models are more transferable than the employment-based urban models. Also, when 'a posteriori' segmentation FP models are compared for transferability, all urban models are transferable except the low value density industry (segment 1) models. For a better perspective on the transferability assessment results, TI and RATE values are considered together. As TI depends on TR² and RATE depends on WRMSE, it is appropriate to consider these two metrics. The absolute transfer error is calculated from RATE as the absolute value of (1 - RATE), expressed as a percentage. When RATE equals 1, the absolute transfer error is zero and the transferability is said to be perfect. TI is then plotted against the absolute transfer error to visualize the transferability. Figures 2 and 3 show these plots for all establishments and for the a posteriori segmentation, respectively.
In the graphs, only the assessment metrics of updated models are considered, as the transferability results of these models are better than those of the naïve models. The summary of the transferability assessment results is given in Fig. 4.
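The conversion from RATE to absolute transfer error described above can be checked with a few lines of code. This is an illustrative sketch: the function names are hypothetical, and the ratio form of RATE (WRMSE of the transferred model divided by WRMSE of the local model) is an assumption based on the statement that RATE depends on WRMSE.

```python
def rate_from_wrmse(wrmse_transferred: float, wrmse_local: float) -> float:
    """Assumed form of RATE: error of the transferred model relative to the local model."""
    return wrmse_transferred / wrmse_local

def absolute_transfer_error(rate: float) -> float:
    """Absolute transfer error in percent, |1 - RATE| * 100, as described in the text."""
    return abs(1.0 - rate) * 100.0

# RATE bounds reported for area-based updated urban models applied to the suburban context,
# plus RATE = 1 as the perfect-transfer reference case.
for rate in (1.153, 1.379, 1.0):
    print(f"RATE = {rate:.3f} -> absolute transfer error = {absolute_transfer_error(rate):.1f}%")
```

Because the error is measured as a distance from 1, a RATE of 0.9 and a RATE of 1.1 would both correspond to a 10% absolute transfer error.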

Joint context estimation is a useful approach to assess transferability by estimating a joint model (base model). The base model is developed by combining the data from both the estimation context and the application context [46]. In this approach, 'difference parameters' are estimated to capture the differences between the parameters of the estimation and application contexts. A simple t-test on these 'difference parameters' helps in understanding whether or not the base model is transferable [8, 58]. In this study, single-variable base models (area-based and employment-based) are first estimated using the data from the urban and suburban regions. Next, for each selected region, a dummy variable for that region is interacted with the variable in the base model to form 'difference variables.' From the t-statistics of these 'difference variables,' the following observations are made: (i) the area-based base model is transferable to both urban and suburban contexts, and (ii) the employment-based base model is transferable only to the suburban context. The t-test indicates whether a model is transferable but not the extent of transferability. These base models are therefore further assessed using TI to find the extent of transferability in different regional contexts and different industrial segments. In addition, the TI values obtained from combined transfer estimation and joint context estimation are compared to understand whether pooling the data improves transferability.
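The difference-variable test can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' estimation code: it uses synthetic data, a no-intercept pooled model y = β·x + δ·(d·x) + ε with d = 1 for the application region, and the large-sample critical value of about 1.96 for a 5% two-sided test. The function name and all numeric values are hypothetical.

```python
import numpy as np

def difference_parameter_t(x, y, is_application):
    """Fit y = beta*x + delta*(d*x) by OLS without intercept; return coefficients and t-statistics."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    d = np.asarray(is_application, dtype=float)
    X = np.column_stack([x, d * x])           # base variable and 'difference variable'
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]                 # residual degrees of freedom
    sigma2 = (resid @ resid) / dof            # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)     # coefficient covariance matrix
    return coef, coef / np.sqrt(np.diag(cov))

# Synthetic pooled sample: 100 'urban' (d = 0) and 100 'suburban' (d = 1) establishments.
rng = np.random.default_rng(0)
x = rng.uniform(100.0, 1000.0, size=200)
suburban = np.repeat([0.0, 1.0], 100)
noise = rng.normal(0.0, 5.0, size=200)
y_same = 0.05 * x + noise                      # both regions share one production rate
y_diff = (0.05 + 0.03 * suburban) * x + noise  # suburban rate is higher

for label, y in (("same rate", y_same), ("different rate", y_diff)):
    coef, t = difference_parameter_t(x, y, suburban)
    print(f"{label}: delta = {coef[1]:.4f}, t = {t[1]:.2f}")
```

A difference parameter whose |t| stays below the critical value gives no evidence against transferring the base model to that region, while a large |t| indicates that the regional coefficients differ.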

Tables 6 and 7 present the TI values obtained from the combined transfer estimation and joint context estimation. Comparing the base models with the updated models shows that the extent of transferability of the area-based base model (excluding industrial segments) to the urban context has improved. In all other cases, the extent of transferability of the models updated using the combined transfer estimation technique is greater. The geographical dissimilarity of urban and suburban regions in terms of population density may be a possible reason for the lack of improvement in TI values on transfer of the base model. From these assessment results, it can be interpreted that the updated models are more transferable than the base models.

This paper contributes to existing knowledge of freight demand generation in India by establishing statistically significant relations between FP and business size variables. For this purpose, the study uses data from 432 establishments in Kerala, India, obtained through an EBFS. The establishments are categorized as urban or suburban based on their geographical location to assess the transferability of freight demand models. Urban and suburban establishments are further segmented 'a posteriori' to interpret the influence of homogeneous industries. The OLS regression method is used to develop single-variable models without intercept. The models indicate that FP rates in suburban regions are significantly higher than those in urban regions, irrespective of 'a posteriori' segmentation. Among the 'a posteriori' ... The financial burden of freight data collection on metropolitan planning organizations (MPOs) calls for spatially transferable freight demand models. Transferability assessment is carried out using both urban and suburban models to analyze the model performance as well as the direction of transferability. Transferability performance is assessed using TR², TI, WRMSE and RATE. The models are first transferred directly (naïve transfer); the naïve models are then updated by adjusting their coefficients using the combined transfer estimation technique. The transferability assessment is done for both naïve and updated models, and the updated models offer better transferability performance. Most of the urban models are transferable to the suburban context. However, when the suburban models are transferred to the urban context, only a few models are transferable. These results indicate that transferability between urban and suburban regions is asymmetric.
The exclusion of spatial factors such as population density, road density, road intersections, and the distance of an establishment from the seaport and city center from the models can be a possible reason for the asymmetric transferability between the urban and suburban regions. Also, the suburban models may not be transferable because the sample size used for developing these models is small. Finally, among the urban models, area-based models are more transferable than employment-based models. The medium and high value density industry (segments 2 and 3) urban models are transferable to the suburban context, and among them, the medium value density industry (segment 2) models are more transferable. Among the suburban models, only the area-based model of the high value density industry (segment 3) is transferable. The joint context estimation technique is also used to check whether or not the pooled data models give better transferability results. We did not notice higher TI values in this case as compared to the combined transfer estimation technique. In general, given the rapid growth of freight traffic and the shortage of freight data in suburban regions, the spatial transfer of urban models to the suburban context can, to a certain extent, compensate for the lack of freight demand models in suburban regions.
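The two estimation steps summarized above, fitting a single-variable model without intercept and then updating a transferred coefficient, can be sketched as follows. This is an illustration under assumptions rather than the paper's implementation: the update uses the scalar form of the combined transfer estimator commonly attributed to Ben-Akiva and Bolduc, in which the transferred coefficient's variance is inflated by the squared transfer bias. The function names and numeric values are hypothetical.

```python
import numpy as np

def fit_no_intercept(x, y):
    """OLS for y = beta * x (no intercept); returns beta and its estimated variance."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sxx = np.sum(x * x)
    beta = np.sum(x * y) / sxx
    resid = y - beta * x
    sigma2 = (resid @ resid) / (len(y) - 1)  # one estimated parameter
    return beta, sigma2 / sxx

def combined_transfer(beta_t, var_t, beta_l, var_l):
    """Assumed scalar combined transfer estimator: precision-weighted average in which
    the transferred coefficient is penalized by the squared transfer bias."""
    bias = beta_t - beta_l
    w_t = 1.0 / (var_t + bias ** 2)  # weight of the transferred (estimation-context) coefficient
    w_l = 1.0 / var_l                # weight of the small-sample local coefficient
    return (w_t * beta_t + w_l * beta_l) / (w_t + w_l)

# Hypothetical coefficients: a transferred urban rate and a local suburban rate.
updated = combined_transfer(beta_t=1.0, var_t=0.01, beta_l=3.0, var_l=0.04)
print(f"updated coefficient = {updated:.3f}")
```

In this form the updated coefficient always lies between the transferred and local values, and a large transfer bias pulls it toward the local estimate.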

From this study, a number of inferences have emerged with significant implications for planning and policy making. The first finding is that the degree of spatial transferability varies widely among industrial classes and depends on the following factors: (i) the business variables employed for measuring FP, (ii) the type of community, and (iii) the commodity value density of industry sectors. The first factor suggests that the area-based models show better transferability than the employment-based models. The second factor suggests that transfer is possible from an urban community to a suburban community and not vice versa. Agencies looking to transfer FP models may do well if the transfer is made from regions with high population density to regions with low population density. The last factor suggests that the transferability of models is better in medium and high value industries. The transferability assessment of the FP models in this study opens the possibility for transportation agencies, planners and practitioners to identify the direction and the extent of transferability. The research findings are expected to help save financial resources for data collection exercises and to support the development of freight model systems within budgetary constraints. However, like any other study, this study has certain limitations, which pave the way for further research.

The findings of this study may not be generalizable, as the data pertain to only one state in India. However, the methodology is generic and can be used to investigate the direction of freight model transferability in cities in other Indian states. The findings of this research are most useful for planning agencies in Kerala or in similar coastal states in India or elsewhere. The results will assist them in data collection programs and, subsequently, in developing comprehensive mobility plans. This research is limited to intrastate model transferability assessment, and interstate model transferability requires further investigation. Small sample size is also a limitation of this study. Models built with small data sets produce poor or moderate transferability findings, which underlines the importance of using sufficient samples in building models. Generalizing this study requires data with larger sample sizes from several cities across different states in India. It is also important to identify the factors that cause transferability to be asymmetric and the reasons why only some industrial segment models are transferable.

Using Commodity Flow Survey Microdata and Other Establishment Data to Estimate the Generation of Freight, Freight Trips, and Service Trips: Guidebook

Urban freight models

Modeling urban freight generation: A study of commercial establishments' freight needs

Planning, designing and conducting establishment-based freight surveys: A synthesis of the literature, case-study examples and recommendations for best practices in future surveys

How can urban goods movements be surveyed in a megacity? The case of the Paris region

Modelling non-response in establishment-based freight surveys: a sampling tool for statewide freight data collection in middle-income countries

Latent segmentation-based approach to investigating spatial transferability of activity-travel models

Spatial transferability of travel forecasting models: a review and synthesis

Spatial transferability of person-level daily activity generation and time use models

Spatial transferability of an ordered response model of trip generation

Transferability of freight trip generation models

Comparative assessment of industrial classification systems for modeling freight production and freight trip production

A nationwide web-based freight data collection

Estimating freight flows for metropolitan area highway networks using secondary data sources

Modelling urban freight generation: a case study of seven cities in Kerala

Modeling retail establishments' freight trip generation: a comparison of methodologies to predict total weekly deliveries

Executive Summary. New Delhi

Ministry of Urban Development (2015) Smart City: Mission Statement & Guidelines

Ministry of Housing and Urban Affairs (2016) Smart Cities Mission

Ministry of Heavy Industries and Public Enterprises

Trip length distributions in commodity-based and trip-based freight demand modeling: Investigation of relationships

Freight generation and geographical effects: modelling freight needs of establishments in developing economies and analyzing their geographical disparities

Building a model of freight generation with a commodity flow survey

Freight-trip generation model

Freight generation models: Comparative analysis of regression models and multiple classification analysis

Quick Response Freight Manual

Assessing the spatial transferability of freight (Trip) generation models across and within states of India: empirical evidence and implications for benefit transfer

Examining the determinants of freight transport emissions using a fleet segmentation approach

A multi-objective genetic algorithm approach to design optimal zoning systems for freight transportation planning

Data collection and modeling of restaurants' freight trip generation for Indian cities

Designing freight traffic analysis zones for metropolitan areas: identification of optimal scale for macro-level freight travel analysis

Effects of business age and size on freight demand: decomposition analysis of Indian establishments

Expenditure-based segmentation of freight travel markets: Identifying the determinants of freight transport expenditure for developing marketing strategies

Designing zoning systems for freight transportation planning: a GIS-based approach for automated zone design using public data sources

Assessing the extent of modifiable areal unit problem in modelling freight (trip) generation: relationship between zone design and model estimation results

Transferability and updating of disaggregate travel demand models

Approaches to model transferability and updating: the combined transfer estimator

Transferability analysis of disaggregate choice models

Intra-metropolitan transferability of mode choice models

The spatial transferability of parameters in a gravity model of commuting flows

Evidence on transferability of trip-generation models

Spatial transferability and updating analysis of mode choice models in developing countries

Evaluation of transfer methods for spatial travel demand models

Survey and empirical evaluation of nonhomogeneous arrival process models with taxi data

Spatial transferability of tour-based time-of-day choice models: Empirical assessment

Sequential logit dynamic travel demand model and its transferability

Choice experiments, site similarity and benefits transfer

Long-Distance and Rural Travel Transferable Parameters for Statewide Travel Forecasting Models

Trip and parking generation rates for different housing types: effects of compact development

Provisional Population Totals Paper 2 of 2011: India (Vol II

Transferability of standardized regression model applied to person-based trip generation

The combined estimator approach to model transferability and updating

Modeling the effect of land use on person miles traveled by using geographically weighted regression

Making advanced travel forecasting models affordable through model transferability


Conflict of interest On behalf of all authors, the corresponding author states that there is no conflict of interest.