key: cord-0057139-ng27oa37 authors: Daniel, Cynthia Baby; Mathew, Samson; Saravanan, Subbarayan title: Network constrained and classified spatial pattern analysis of healthcare facilities and their relationship with the road structure: a case study of Thiruvananthapuram city date: 2021-03-08 journal: Spat DOI: 10.1007/s41324-021-00385-7 sha: 7044e3c4cc1669d7c0ad37ad62b9de28a3a262ee doc_id: 57139 cord_uid: ng27oa37
Equity in the utilization of medical facilities depends heavily on the spatial distribution of healthcare amenities. This study examines the geographic distribution of the various categories of healthcare facilities in the city of Thiruvananthapuram, South India, using centrographic analysis comprising the Mean Center, Standard Deviational Ellipse and Average Nearest Neighbour; the distribution is found to be uneven. Network-based point pattern analysis is also performed, and the cross K-function is used to determine the distribution of medical shops relative to the healthcare services. A weighted node approach is used to calculate the centrality indices, with every node weighted by its degree in a Geographical Information System (GIS). A multiple centrality assessment model consisting of betweenness, closeness and straightness centrality is used to compute the weighted road centrality on local and global scales. Kernel Density Estimation is applied to convert the centrality values and the vector points to a common raster framework. Correlation analysis was performed to assess the role of network centrality in the location of healthcare facilities in the study area. Correlation values are greatest among category 2 hospitals; straightness centrality and betweenness centrality favour the location of category 1 hospitals and homeopathy hospitals, respectively. The results confirm that network topology influences the location of medical facilities in the study area.
India is one of the fast-growing developing countries, steadily gathering pace in its journey towards becoming a developed nation. A sound healthcare system is a sine qua non for a healthy population and for the overall upliftment of a nation on a global scale. Equity in the healthcare sector may be defined as persons in need of medical care receiving equivalent treatment, regardless of their socio-economic status or household income [1]. An adverse effect of the uneven distribution of healthcare facilities is reduced access to health benefits for certain geographical areas or weaker sections of society. Healthcare in India is operated by both the public and private sectors; the former is run by the Central and State governments and offers services to the population at subsidised rates.
Kerala has been at the forefront among the Indian states in terms of healthcare performance. It has shown considerable expertise in controlling the Nipah and coronavirus outbreaks and was appreciated the world over for its efforts in dealing with them [2, 3]. The literacy, birth, death and infant mortality rates of Kerala have consistently been better than the national average since the early years of the last century, as per the 2011 Census of India [4] and the reports of NITI Aayog [5] published by the Government of India. Is the spatial distribution of medical systems and facilities in Kerala even, or is it affected by other factors such as road topology? This question has great relevance in this scenario. Hence, a detailed and structured study of the geographical distribution pattern of healthcare facilities is required, together with an investigation of the factors governing that pattern. This study investigates the role of road topology in the distribution pattern of healthcare facilities in the capital city of Kerala.
An in-depth study of how they are interrelated is required to instigate reformation and redevelopment works and to enable proper spatial allocation. Such studies act as a precursor to spatial allocation and spatial accessibility studies. Spatial pattern analysis has advanced a step further with network constrained pattern analysis, as most urban activities take place alongside road networks. Traditional point pattern analysis refers to the analysis of patterns of point entities in space, whereas network constrained pattern analysis refers to the analysis of patterns of point entities with reference to the road network. The former is limited to Euclidean space and can give misleading results, because urban activities are distributed along the network on either side. In this study, network constrained pattern analysis is performed using the SANET toolbox for ArcGIS. Network point pattern analysis remains a relatively unexplored area of research, and this study applies both traditional point pattern analysis and network constrained pattern analysis.
Multiple centrality assessment is a concept that emerged in social network analysis and is now widely used in GIS-based transportation studies [6] [7] [8]. It is an application tool that uses basic mathematics to analyse networks and spaces represented in a GIS environment. Centrality is primarily quantified in terms of betweenness, closeness and straightness centrality. The betweenness centrality of a node denotes the frequency with which the node is traversed by the shortest paths between all pairs of nodes in the network. Closeness centrality, on the other hand, reflects how far a given node is from every other node or junction. Nodes with high closeness values, that is, short distances to the other nodes, are likely to be more accessible [9]. Straightness centrality captures the efficiency of communication between nodes in a network.
In short, the betweenness, closeness and straightness measures respectively represent the intermediacy, accessibility and directness of a network.
The improved sophistication of GIS tools has contributed positively to the field of health and health geography. Applications of GIS in the health sector range from mapping and information systems to interactive interface development and the computation of accessibility to healthcare facilities [10, 11]. Fecht et al. [12] developed a GIS-based user-controlled simulator to monitor the effect of environmental factors on human health. Nugroho Joshua et al. [13] developed a web portal for the health department of Bali to monitor the health status of the residents of Jembrana Regency.
A road network can be characterised by its two basic geometric elements: nodes (junctions) and edges (road segments). Road networks evolve over time, influencing the urban activities and settlements in and around them, which in turn leads to changes in the road structure. Thus, roads are basic urban entities that shape the evolution of a city and its urban activities. There is a rapidly growing body of literature on the relationship between street centrality and land use. Many researchers have examined how street centrality affects land-use patterns or urban activities, mainly retail shops. Wang et al. [14] studied the spatial distribution pattern of retail stores in Changchun, China and its relationship with the centrality indices of the road network. The location of retail stores, the influence of the Central Business District (CBD) and the relationship with street centrality have been explored in numerous studies. Porta et al. [15, 16] determined the spatial relationship between road structure and economic activities in Italy and found that they were highly correlated.
The relationship between land-use types and road centrality was investigated for the cities of Wuhan [17] and Stockholm [18]. Literature focusing on the relationship between road centrality and healthcare facilities is scarce [19]. Moreover, a weighted street centrality approach has not been explored.
Thiruvananthapuram, formerly known as Trivandrum, is the capital city of the southern state of Kerala in India. It is also the headquarters of the district of Thiruvananthapuram. The scenic seaside city occupies a permanent place in the mainstream of the socio-economic, religious and cultural heritage of India. Thiruvananthapuram district lies between north latitudes 8°17′ and 8°54′ and east longitudes 76°41′ and 77°17′. The sprawling metropolis on the western coast of India is bordered by the Western Ghats to the east and the Arabian Sea to the west. The city corporation is spread over 214.86 km² and comprises 100 wards. In addition to being the administrative capital of Kerala and host to many government offices, it is one of the major IT and academic hubs of India. Thiruvananthapuram district has a total population of 9,57,730 as per the Census of India 2011 [4].
The district of Thiruvananthapuram has a well-knit road network. A major portion of the available bus services is offered by the Kerala Government's state road transport corporation, along with a comparatively small number of private operators. The divisional headquarters of the Southern zone of the Indian Railways is in Thiruvananthapuram. Air travel is facilitated by the Thiruvananthapuram International Airport at Chakai, located 6.7 kilometers from East Fort. Figure 1 shows the location map of Kerala in India, the district map of the state of Kerala and the ward map of Thiruvananthapuram city. The location of the CBD is also marked.
A GIS database of the healthcare facilities was developed first. The locations of the major hospitals were obtained through a GPS survey using a Garmin eTrex Vista HCx handheld GPS. The list was then expanded using Google Maps. Each hospital has associated attributes, namely the name, the geographical coordinates, the category of the hospital, the type of medical system, whether it is a government or private hospital, etc. Google Maps cannot be considered a complete database; nevertheless, this approach was chosen because of the difficulty of carrying out an extensive GPS survey. Minor home-based clinics have been ignored in this study. The classification of the hospitals is presented in Fig. 2.
Healthcare is provided by both government and private agencies, with the private sector dominating the count. The healthcare system in Kerala is generally classified, by type of medical system, into three groups: allopathy, ayurveda and homeopathy. Allopathic hospitals constitute the largest part, followed by ayurvedic and homeopathic hospitals respectively. An additional group consisting of other systems such as naturopathy, siddha, unani etc. has also been included in this study. Allopathy refers to the modern system of curing using drugs. The Ayurveda system of medicine has its roots in the Vedic culture of India and is believed to have originated over 5000 years ago. Kerala is famous all over the world for
Another classification adopted in this study is based on the different levels of care, namely the primary, secondary, tertiary and quaternary levels. The primary level, or category 1, comprises the top hospitals in the city, both government and privately owned institutions including the research centres, with roughly more than 100 beds. Category 2 comprises the hospitals with fewer than 100 beds. Category 3 includes the public health centres (PHC), eye clinics, dental clinics etc.
Category 4 comprises the dispensaries, which are mostly government-owned. In decreasing order of the number of hospitals, the categories rank 3, 2, 1 and 4. Medical shops were also mapped to find the locational relationship between the hospitals and the medical shops. The road network is based on the map prepared and used by the National Transportation Planning and Research Centre (NATPAC), a transport research centre in Thiruvananthapuram city. The location coordinates of the healthcare facilities obtained through GPS and Google Maps were fed into ArcGIS 10.2 to create a layer of the healthcare facilities.
Data preparation is a stage of foremost importance, as the accuracy of the analysis depends directly on the correctness of the data. First, the road network is topologically corrected by removing dangling nodes, pseudo nodes and undershoots. It is then represented by the basic components of a directed graph, i.e., edges and vertices. This is made possible by modelling the shapefile as a network dataset using the Network Analyst toolbox of ArcGIS. The roads are modelled in their primal rather than dual representation, because the former preserves the spatial layout and geographical properties. All roads, including arterial, subarterial and collector streets, are included in the analysis.
Pattern analysis denotes the use of quantitative techniques for describing and analysing the distribution pattern of spatial features in a two-dimensional space and can be regarded as a precursor to structured data analysis [20]. An exploratory analysis is carried out to examine the spatial distribution of the various classes of healthcare facilities. Specifically, the analysis examines whether the distribution pattern is clustered or random, whether the facilities tend to follow a specific direction and whether they are clustered around any specific point.
Two analyses are primarily performed under centrographic analysis: the mean center and the standard deviational (directional) ellipse. The mean center is the geographic center of the features, a point computed from the averages of the x and y coordinates of the features under consideration. The standard deviational ellipse portrays the spatial spread and directional trend of the features. It is a polygon centred on the mean center, with the standard distances along the two axes as its major and minor axes. The standard distance (SD), given by Eq. 1 [21], quantifies the spatial dispersion of the points around the mean center:
$$SD = \sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{X})^2}{n}+\frac{\sum_{i=1}^{n}(y_i-\bar{Y})^2}{n}} \tag{1}$$
where $x_i$ and $y_i$ denote the coordinates of point i, $(\bar{X}, \bar{Y})$ represents the coordinates of the mean center and n represents the total count of point features. Theta is the angle of rotation (Eq. 2) [22]:
$$\tan\theta = \frac{A+B}{C},\quad A=\sum_{i=1}^{n}\tilde{x}_i^2-\sum_{i=1}^{n}\tilde{y}_i^2,\quad B=\sqrt{A^2+4\left(\sum_{i=1}^{n}\tilde{x}_i\tilde{y}_i\right)^2},\quad C=2\sum_{i=1}^{n}\tilde{x}_i\tilde{y}_i \tag{2}$$
where $\tilde{x}_i = x_i-\bar{X}$ and $\tilde{y}_i = y_i-\bar{Y}$ are the deviations of point i from the mean center. The rotation denotes the orientation of the ellipse, i.e., the rotation of its long axis measured clockwise from noon (north).
The average nearest neighbour (ANN) ratio makes it possible to assess the distribution pattern of entities, as represented by Eq. 3 [20]. It uses the distance between each point entity and its closest neighbouring entity to determine whether the point pattern is random, clustered or dispersed. The ANN ratio is the ratio of the observed average distance ($d_{obs}$) to the expected average distance ($d_{exp}$), where $d_{exp}$ is calculated for a hypothetical random distribution with the same number (n) of features within the same total area (A):
$$ANN = \frac{d_{obs}}{d_{exp}},\qquad d_{obs}=\frac{1}{n}\sum_{i=1}^{n} d_i,\qquad d_{exp}=\frac{0.5}{\sqrt{n/A}} \tag{3}$$
where $d_i$ is the distance from point i to its nearest neighbour. An ANN ratio of less than 1 reveals a clustered distribution; a ratio exceeding 1 indicates that the pattern is more dispersed than random. A value of 1 indicates a random distribution. The z score (Eq. 4-5) [23] indicates the statistical significance of the identified pattern, turning this analysis from descriptive to inferential.
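The centrographic and nearest-neighbour measures described above can be illustrated with a minimal Python sketch. It is not the ArcGIS implementation used in the study; the function names and the brute-force nearest-neighbour search are assumptions made for illustration, and the constants 0.5 and 0.26136 follow the ArcGIS documentation of the Average Nearest Neighbor tool [23].

```python
import math

def mean_center(points):
    """Average of the x and y coordinates of the point features."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def standard_distance(points):
    """Square root of the mean squared deviation from the mean center."""
    cx, cy = mean_center(points)
    n = len(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)

def average_nearest_neighbour(points, area):
    """Return (ANN ratio, z score) for points in a study area of the given size.

    Brute-force O(n^2) search; adequate for a few hundred facilities.
    """
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(min(math.hypot(xi - xj, yi - yj)
                           for j, (xj, yj) in enumerate(points) if j != i))
    d_obs = sum(nearest) / n                       # observed mean distance
    d_exp = 0.5 / math.sqrt(n / area)              # expected under randomness
    std_error = 0.26136 / math.sqrt(n * n / area)  # standard error of d_obs
    return d_obs / d_exp, (d_obs - d_exp) / std_error
```

For four points on the corners of a 2 × 2 square inside a study area of 16 square units, the observed mean distance is 2 and the expected distance is 1, so the ratio of 2 points to a dispersed pattern.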
Higher absolute values of the z score lie towards the tails of the normal distribution, so the null hypothesis of Complete Spatial Randomness (CSR) can be rejected, i.e., the pattern is significantly clustered or dispersed.
$$z = \frac{d_{obs}-d_{exp}}{SE} \tag{4}$$
$$SE = \frac{0.26136}{\sqrt{n^2/A}} \tag{5}$$
The auto K-function tests the CSR hypothesis in terms of the number of points, in a given set of points, whose shortest-path distance from a given point is less than a parametric shortest-path distance [24]. Consider a set of n points placed on a network; let $n(t \mid p_i)$ be the number of points that lie within shortest-path distance t of point $p_i$, and let $\rho$ denote the density of points on the network. The K-function is then given by Eq. 6 [24]:
$$K(t) = \frac{1}{\rho}\cdot\frac{1}{n}\sum_{i=1}^{n} n(t \mid p_i) \tag{6}$$
The cross K-function is used to test the complete spatial randomness (CSR) hypothesis for two sets of points placed on a given road network. The CSR hypothesis assumes that the points are distributed identically and homogeneously along the network; in other words, the configuration of category A points does not considerably affect the distribution of category B points. Monte Carlo simulation is frequently used to examine the distribution pattern of point events that are constrained by the street network [19, 24, 25]. Whether these points are uniformly and independently dispersed over the road network is analysed by comparing the observed K-function values with the envelope obtained from the simulated CSR point patterns. If the observed K-function lies within the CSR envelope, the points follow a random distribution. If the observed K-function lies above the CSR upper bound, the points are clustered. Conversely, if the observed K-function lies below the CSR lower bound, the points show a dispersed pattern. Additionally, the cross K-function depicts the spatial interdependence of two different categories of points, say A and B.
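To make the definition of the network auto K-function concrete, the following Python sketch counts, for every point, the other points reachable within shortest-path distance t and scales the mean count by the inverse point density. It is an illustration only and does not reproduce SANET or its Monte Carlo envelopes. The assumptions are: the network is an undirected adjacency dictionary, every point event has been snapped to a distinct network node, and the density is taken as the number of points divided by the total network length.

```python
import heapq

def shortest_path_lengths(graph, source):
    """Dijkstra shortest-path distances from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, length in graph[u]:
            nd = d + length
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def network_k(graph, point_nodes, t):
    """Network K-function at distance t for points located on distinct nodes.

    graph: {node: [(neighbour, edge_length), ...]}, each undirected edge
    listed in both directions (assumed input format).
    """
    total_length = sum(l for u in graph for _, l in graph[u]) / 2.0
    n = len(point_nodes)
    density = n / total_length
    count = 0
    for p in point_nodes:
        dist = shortest_path_lengths(graph, p)
        count += sum(1 for q in point_nodes
                     if q != p and dist.get(q, float("inf")) <= t)
    return (count / n) / density
```

In practice the observed curve would be evaluated over a range of t values and compared with curves obtained from repeated random placements of the same number of points on the network.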
The cross K-function is used to investigate whether one point set influences the other. It can be inferred that category A clusters around category B if the observed curve lies above the upper bound curve. If the observed curve lies below the lower bound curve, the CSR hypothesis can likewise be rejected, i.e., the point sets tend to be dispersed from one another [25]. The results obtained from SANET, installed as an additional plug-in in ArcGIS, are processed in R to obtain the K-function curves.
Three measures of centrality are considered in this study, namely betweenness, closeness and straightness centrality, given by Eqs. 8-10 [26]. These are recognized as the most important measures of centrality [18]. The street centrality approach gives weightage to a node based on its ease of access or the frequency with which it is traversed. Nodes can further be weighted by their degree. For a directed graph G, the degree of a node is the sum of its in-degree ($i_{deg}$) and out-degree ($o_{deg}$), where the in-degree is the number of incoming links attached to the node and the out-degree is the number of outgoing links attached to the node, as given by Eq. 7 [27]:
$$\deg(v) = i_{deg}(v) + o_{deg}(v) \tag{7}$$
For an undirected graph, the total degree of a node is the number of links connected to it. The degree of each node is computed in ArcGIS after modelling a network dataset from the shapefile, and these weights are fed into the Urban Network Analysis toolbox developed as an extension for ArcGIS by [28]. A weighted nodal representation of the network is found to give better results, as it depicts the real network more faithfully [17]. The betweenness of a node is the fraction of shortest paths between pairs of other nodes in the network that pass by that node:
$$C_{Bet}(i)^{r} = \sum_{j,k \in G-\{i\};\ d[j,k] \le r} \frac{n_{jk}(i)}{n_{jk}}\, W[j] \tag{8}$$
where $C_{Bet}(i)^{r}$ is the betweenness of junction i within a buffer of radius r, $n_{jk}(i)$ is the number of shortest paths from node j to node k that pass by node i, $n_{jk}$ is the overall number of shortest paths from j to k, and d[j, k] is the shortest-path distance between nodes j and k.
W[j] denotes the weight of the destination node j. Betweenness centrality essentially counts the number of shortest network paths that pass through a node or vertex. The fundamental drawback of this measure is its unstable performance in dynamic networks, as the removal of a node induces significant perturbations [29].
Closeness is the inverse of the cumulative distance required to travel from one node to all other nodes along the shortest paths. Closeness describes how well a vertex is integrated into the network. While this property can be used to identify local centres in the network, it is not very relevant for critical locations [30].
$$C_{Clos}(i)^{r} = \frac{1}{\sum_{j \in G-\{i\};\ d[i,j] \le r} d[i,j]\, W[j]} \tag{9}$$
where $C_{Clos}(i)^{r}$ is the closeness of node i within a search buffer of radius r, W[j] denotes the weight of the destination node j and d[i, j] is the shortest-path distance between nodes i and j.
Straightness centrality is the degree to which the shortest paths from a node to all other nodes in the system resemble straight Euclidean paths:
$$C_{Str}(i)^{r} = \sum_{j \in G-\{i\};\ d[i,j] \le r} \frac{d^{Euc}_{ij}}{d[i,j]}\, W[j] \tag{10}$$
where $d^{Euc}_{ij}$ is the straight-line or Euclidean distance between nodes i and j. This measure captures the efficiency of communication between nodes in a network; the efficiency is greater when there is less deviation from the straight-line Euclidean distance. The centrality of each road link is calculated as the average of the centralities of its two end nodes.
KDE is a spatial interpolation technique used here principally to convert both the centrality values and the point features, which are in vector format, to a raster representation to facilitate cell-by-cell correlation. KDE provides a smooth and continuous estimate of a univariate or multivariate probability density using a distance function that measures distances in Euclidean space [31]. Moreover, this method reflects the fact that it is not merely the street itself that matters; the neighbourhood is equally influential.
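The three degree-weighted centrality indices defined above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the Urban Network Analysis toolbox itself: the network is a small adjacency dictionary, node weights are supplied by the caller (for example, node degree), and the betweenness radius is applied to the distance between the two end nodes of each pair, which is one reading of the definition in the text.

```python
import heapq
import math

def shortest_paths(graph, source):
    """Dijkstra from source: (distances, number of shortest paths) per node."""
    dist, count = {source: 0.0}, {source: 1}
    heap, settled = [(0.0, source)], set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)
        for v, length in graph[u]:
            nd = d + length
            if v not in dist or nd < dist[v] - 1e-9:
                dist[v], count[v] = nd, count[u]
                heapq.heappush(heap, (nd, v))
            elif abs(nd - dist[v]) <= 1e-9:
                count[v] += count[u]  # another equally short path
    return dist, count

def centralities(graph, coords, weights, radius=math.inf):
    """Weighted betweenness, closeness and straightness for every node."""
    nodes = list(graph)
    sp = {u: shortest_paths(graph, u) for u in nodes}
    result = {}
    for i in nodes:
        dist_i, count_i = sp[i]
        reach = [j for j in nodes
                 if j != i and dist_i.get(j, math.inf) <= radius]
        cost = sum(dist_i[j] * weights[j] for j in reach)
        closeness = 1.0 / cost if cost > 0 else 0.0
        straightness = sum(math.dist(coords[i], coords[j]) / dist_i[j]
                           * weights[j] for j in reach)
        betweenness = 0.0
        for j in nodes:
            dist_j, count_j = sp[j]
            if j == i or i not in dist_j:
                continue
            for k in nodes:
                d_jk = dist_j.get(k, math.inf)
                if k in (i, j) or math.isinf(d_jk) or d_jk > radius:
                    continue
                # i lies on a shortest j-k path if d[j,i] + d[i,k] = d[j,k]
                if abs(dist_j[i] + dist_i.get(k, math.inf) - d_jk) <= 1e-9:
                    betweenness += (count_j[i] * count_i[k] / count_j[k]
                                    * weights[j])
        result[i] = {"betweenness": betweenness, "closeness": closeness,
                     "straightness": straightness}
    return result
```

Passing `radius=math.inf` corresponds to the global scale, while a finite radius restricts the computation to a local neighbourhood. An edge value can then be obtained by averaging the values of its two end nodes, as described in the text.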
KDE remains by far the most widely used technique for similar studies owing to its numerous advantages over other techniques, primarily its compatibility with most GIS packages [14, 32, 33]. It also has the advantage over some conventional statistical tools that it accounts for distance decay in line with Tobler's law [34], which states that "Everything is related to everything else but near things are more related than distant things". Additionally, KDE is a non-parametric smoothing technique that uses information defined by windows (also called kernels) to estimate densities of features, either points or lines, at given locations [35]. The kernel density estimator is defined by Eq. 11-12 [16], in which h is the selected bandwidth, n is the number of entities (points or lines) within the specified bandwidth, and $d = x_i - x_j$ is the distance between the event point $x_j$ and the grid point $x_i$. The Spatial Analyst module of ESRI ArcGIS uses an Epanechnikov or quadratic kernel function formulated by [36].
Several earlier studies have adopted KDE as a spatial interpolation technique to convert vector data points into a raster system [15, 16, 32]. A fixed bandwidth type is used in this study. The choice of bandwidth affects the results of the interpolation more than the type of interpolation technique does and hence demands more attention. Bandwidths of 100 m, 200 m, 300 m, 500 m, 1000 m and 1500 m, which are widely used in urban planning and similar studies, have been used in this study. In addition, a global bandwidth ($h_{glo}$) with an infinite radius was considered as a means of accommodating all the nodes in the network. Thus, the computations were performed using six local bandwidths and one global bandwidth. KDE layers of the 9 categories of hospitals and the three centrality layers were created for each of the seven bandwidths.
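A fixed-bandwidth kernel density surface of the kind described above can be sketched as follows. The text does not reproduce the kernel formula, so this sketch assumes the quartic kernel from Silverman [36], K(u) = (3/π)(1 − u²)² for u < 1, which is only one possible reading; the grid origin, grid size and function name are likewise hypothetical. Each feature carries a weight, so the same routine can rasterise hospital points (weight 1) or centrality values attached to network locations.

```python
import math

def kernel_density(points, weights, bandwidth, cell_size,
                   xmin, ymin, ncols, nrows):
    """Fixed-bandwidth KDE evaluated at the centre of every raster cell.

    Assumes a quartic kernel; returns a list of rows (row 0 starts at ymin).
    """
    grid = [[0.0] * ncols for _ in range(nrows)]
    h2 = bandwidth ** 2
    for r in range(nrows):
        cy = ymin + (r + 0.5) * cell_size
        for c in range(ncols):
            cx = xmin + (c + 0.5) * cell_size
            total = 0.0
            for (px, py), w in zip(points, weights):
                u2 = ((cx - px) ** 2 + (cy - py) ** 2) / h2
                if u2 < 1.0:  # only features inside the bandwidth contribute
                    total += w * (3.0 / math.pi) * (1.0 - u2) ** 2
            grid[r][c] = total / h2
    return grid
```

With a 10 m cell size, `kernel_density(points, weights, 300.0, 10.0, ...)` would give one of the local-scale layers; repeating the call for each bandwidth yields layers that can be compared cell by cell.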
Thus, a total of 84 raster layers are created for the three centrality layers and the point layers. A cell size of 10 m × 10 m is selected for this study and is used consistently for all the KDE estimations. Table 1 lists the 84 raster layers that are created.
The mean center of the eight types of healthcare facilities is computed and represented in Fig. 3. This is performed using the Mean Center tool in the Spatial Statistics toolbox of ArcGIS. The directional parameters obtained after computing the directional ellipse are given in Table 2. The XStdDist values reported in Table 2 show that homeopathic hospitals and category 4 hospitals have the widest lateral spread, followed by ayurveda and category 3. Considering all classes of hospitals, category 2 and ayurvedic hospitals have the greatest longitudinal spread. The figure shows that the mean centers of all categories of hospitals are located far from the Central Business District and are concentrated mainly towards the geographic center. The mean centers of all the hospitals taken together and of the individual categories are spatially very close to each other and almost overlap with that of the allopathy hospitals. Category 4 and the other category show a slight deviation towards the northeast, as is evident from the greater rotation value. This shows an uneven distribution of healthcare facilities in the study area. It is also noteworthy that the major hospital in the district is the Thiruvananthapuram medical college, which is located near the geographic centre, with other hospitals distributed in and around it.
The results of the ANN analysis are summarised in Table 3. The healthcare facilities are analysed for their spatial distribution pattern, considering each category separately and all of them together. Taken together, the entities show a clustered pattern.
Considered separately, category 1, i.e., the primary level of healthcare facilities, and the Ayurvedic and homeopathic hospitals show a random distribution. Category 4, i.e., the dispensaries, and the other category of medical system show a dispersed pattern.
Network analysis of the hospitals was performed using the SANET toolbox. In Fig. 4, the observed curve is the K-function curve for the given set of points, i.e., the hospitals. The exp(Mean) curve represents the expected curve for a random spatial distribution of points. The x-axis and y-axis respectively indicate the distance range and the cumulative number of points. The observed curve (blue) lies above the upper bound curve (green) over the entire distance range. This indicates a clustering of hospitals, i.e., the distribution is not even. This result is consistent with the ANN result for all the hospitals taken together. Figure 5 shows the result of the cross K-function analysis between hospitals and medical shops in the area. The observed curve (blue) lies above the upper envelope (green) over the entire distance range, indicating that medical shops tend to cluster around healthcare centers along the road network.
Node centrality values are computed using the UNA toolbox, and edge centralities are calculated as the average of the values of the two nodes connecting them. These values are normalised for this study and shown in Figs. 6, 7 and 8. After converting the layers to a common raster format using Kernel Density Estimation, cell-by-cell correlation of the KDE values is performed for all the bandwidths. All the centrality values for all the bandwidths of the cells corresponding to the hospitals were extracted using the multiple values extraction tool in ArcGIS. Correlation can only be performed between layers of the same bandwidth. Pearson's correlation coefficient was computed using SPSS and the results are given in Table 4.
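The cell-by-cell correlation step can be illustrated with a short Python sketch of Pearson's coefficient applied to two rasters of equal shape. The study itself used SPSS on values extracted in ArcGIS; the function names here are hypothetical, and the sketch correlates every cell of the two layers, whereas the study may have restricted the comparison to cells containing hospitals.

```python
import math

def pearson_r(a, b):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def raster_correlation(raster_a, raster_b):
    """Correlate two rasters (lists of rows) cell by cell.

    Both rasters must share the same extent, cell size and bandwidth.
    """
    flat_a = [v for row in raster_a for v in row]
    flat_b = [v for row in raster_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("rasters must have the same number of cells")
    return pearson_r(flat_a, flat_b)
```

For example, a hospital density layer and a straightness centrality layer built with the same bandwidth would be passed as `raster_a` and `raster_b`.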
The following inferences can be drawn from the table. With increasing bandwidth, the correlation between centrality values and hospital densities becomes stronger. When all the hospitals are considered together, straightness centrality is the predominant factor affecting the spatial pattern of the hospitals on a global scale; neither betweenness nor closeness centrality has a significant correlation with hospital density. Category 1 hospitals display a greater correlation with betweenness centrality, with higher Pearson's coefficient values. Category 2 hospitals are significantly affected by centrality measures only on a global scale, more specifically by straightness, as indicated by the higher correlation coefficient (> 0.6). The distribution of category 3 hospitals is not influenced by the road topology, as indicated by the low correlation coefficients. Category 4 hospitals have a stronger relationship with closeness centrality, which indicates that smaller medical facilities are related to road accessibility. Allopathy and Ayurveda hospitals are related to straightness centrality, and the correlation increases with increasing bandwidth. Homeopathic hospitals are influenced by all three centrality values when greater bandwidths are considered. When all the categories are
This study can be divided into two components: the first is location analysis and the second is the effect of the road topology on the location of the healthcare facilities. Centrographic methods, i.e., the mean center and directional ellipse, were applied to the study area, followed by ANN, to find the distribution pattern of the healthcare facilities in Thiruvananthapuram city. The centrographic analysis showed that the hospitals are clustered around the geographic center, far from the Central Business District. This shows that healthcare facilities are deficient towards the outer periphery of the district.
Network-based location analysis was also carried out to find the distribution of healthcare facilities with reference to network distance, unlike the Euclidean distance considered in the former case. The cross K-function was used to find the distribution of medical stores with respect to the hospitals. The medical shops were found to cluster around the hospitals, pointing to a deficiency of hospitals in other areas of the city. Correlation analysis shows that the correlation with the centrality indices increases with increasing bandwidth. Category 1 hospitals are related to betweenness centrality, meaning that big hospitals are associated with intermediate nodes or places that are more frequently traversed. Allopathy and Ayurveda hospitals are related to straightness centrality, and the correlation increases with increasing bandwidth. Thus, road topology is a significant factor affecting the distribution of healthcare facilities in the study area. However, no specific trend is visible in the correlation analysis.
It can therefore be concluded from the study that there is an uneven distribution of medical facilities in the city of Thiruvananthapuram. This is a serious concern, which highlights the fact that spatial allocation is not uniform and hence not all people have easy access to medical facilities. The spatial allocation of hospitals needs more defined planning to ensure equity in the medical field. However, this concern has been addressed effectively in the recent past by allocating more buildings for medical treatment during the pandemic. Further indices can be incorporated to gain a more detailed knowledge of the road topology. A useful addition to this study would be the incorporation of traffic data to obtain a more realistic road network.
Equity in the utilization of healthcare services in India: evidence from national sample survey
Kerala Government honoured by American Virology Institute for Nipah Virus Containment
Kerala Wins UN Award For Outstanding Contribution Towards Control Of Non-communicable Diseases
Census of India
Centrality measures to identify traffic congestion on road networks: a case study of Sri Lanka
Multiple centrality assessment in Parma: a network analysis of paths and open spaces
Network centrality analysis of public transport systems: developing a strategic planning tool to assess passenger attraction
Performance indicators for public transit connectivity in multi-modal transportation networks
Measures of spatial accessibility to health care in a GIS environment: synthesis and a case study in the Chicago region. Environment and Planning B: Planning and Design
Measuring spatial accessibility to healthcare services with constraint of administrative boundary: a case study of Yanqing District
A GIS-based urban simulation model for environmental health analysis. Environmental Modelling and Software
E-government integration through implementation of web-based GIS on community health monitoring in Jembrana Regency
Location analysis of retail stores in Changchun, China: a street centrality perspective
Street centrality and the location of economic activities in Barcelona
Street centrality and densities of retail and services in Bologna, Italy. Environment and Planning B: Planning and Design
Relationships between street centrality and land use intensity in Wuhan
Exploring the relationship between street centrality and land use in Stockholm
Spatial distribution characteristics of healthcare facilities in Nanjing: network point pattern analysis and correlation analysis
Introduction to Geographic Information System
How Standard Distance works-Help|ArcGIS Desktop
How Directional Distribution (Standard Deviational Ellipse) works-Help|ArcGIS Desktop
How Average Nearest Neighbor works-Help|ArcGIS Desktop
SANET: a toolbox for spatial analysis on a network
Network-constrained and category-based point pattern analysis for Suguo retail stores in Nanjing
Urban network analysis
Network flows: theory, algorithms and applications. Network
Urban network analysis: a new toolbox for ArcGIS
Centrality indices. Network Analysis
Centrality measures and vulnerability of spatial networks
Using kernel density estimation to assess the spatial pattern of road density and its impact on landscape fragmentation
Street centrality and land use intensity in Baton Rouge
Identifying spatial patterns of retail stores in road network structure
Smooth pycnophylactic interpolation for geographical regions
Selection of bandwidth type and adjustment side in kernel density estimation over inhomogeneous backgrounds
Density estimation for statistics and data analysis
Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Acknowledgement The authors would like to acknowledge the National Transportation Planning and Research Centre (NATPAC) for sharing the required data for this study.
Conflict of interest The authors of this research declare that they have no conflicts of interest.