1 Introduction

Mining hidden and useful patterns from data has been actively practiced and researched for at least three decades [3]. Techniques for analyzing trajectory data are developing rapidly [8, 24, 27], driven by the popularization of GPS-equipped devices, which allows spatial data from moving objects to be collected in unprecedented volumes. Moreover, the advancement of the Internet of Things has made it possible to collect many kinds of data beyond the spatio-temporal ones, called aspects. Aspects enrich the semantic dimension of mobility data. An aspect can be associated with a moving object, an entire trajectory, or a single trajectory point, and it can hold any data format, from simple labels to sophisticated objects. For example, a trajectory point located in a restaurant may contain, in addition to the aspect that defines the type of place visited, aspects such as price range, user evaluation, and opening hours. This complex type of trajectory is called a Multiple Aspect Trajectory (MAT) [13].

In the mobility data domain, several data mining techniques have been proposed for trajectory clustering [2, 7, 16, 20, 30, 32]. Such techniques allow us to extract patterns [15] and to detect common and outlying behaviors of moving objects [14]. Furthermore, clustering can be used to find how similar moving objects are to each other with respect to their spatio-temporal trajectories, or how similar the trajectories of the same user are to each other [9]. Clustering applications include carpooling services based on common trajectories (e.g., Uber), user profiling for transportation and route planning, and collaborative filtering and other association rule techniques to find which locations are often associated by their trajectories [18]. Trajectory clustering methods for the spatial [7], spatio-temporal [31], or semantic dimensions [10] have achieved solid results so far. Meanwhile, few works have focused on clustering MATs [25], which is a challenging task since it requires dealing with heterogeneous data and an even larger data volume. This leaves room for new techniques capable of grouping and describing multiple semantic aspects together. Thus, it is important that clustering methods for MATs adopt multidimensional similarity metrics or other strategies that capture such data heterogeneity [29].

We propose a divisive hierarchical clustering algorithm that considers the three dimensions of trajectories (space, time, and semantics) by using a decision tree-based approach. Based on the frequency with which each aspect appears in the trajectories of a given dataset, the method iteratively selects the best aspect for dividing the set of trajectories in two, using a threshold (the mean or median of the aspect frequency). This leads to hierarchical clusters whose trajectories are naturally more similar to each other, i.e., clusters that present a higher frequency for certain aspects and a lower frequency for others. Since the proposed method uses a tree approach, there is a hierarchy among the formed clusters. At the bottom of the tree, at the leaf nodes, we have more specific and detailed clusters, while closer to the top of the tree the clusters carry more general information, with the root node being the original dataset. Thus, we seek to provide a new method that supports answering questions about trajectory similarities regarding different aspects and at different levels of abstraction.

For validation purposes, the proposed method is tested on the Foursquare dataset [29], the same data used by the state of the art for MAT clustering [25]. This dataset contains (spatio-temporal) user trajectories enriched with other characteristics such as the type of place visited, rating, price tier, and weather condition. We use internal and external validation metrics to evaluate the clustering results. The experimental results show that our proposal is quantitatively and qualitatively better than the previous state-of-the-art method [25]. The clusters generated by our method were 88% more cohesive and more separable than those of the baseline method [25]. Regarding the external validation metrics, the proposed method was five times more precise than the baseline. From the qualitative point of view, our proposal provides different options for clustering and visualization, making it a valuable tool for data analysts performing exploratory analysis on multiple aspect trajectory data.

The remainder of this paper is organized as follows. In Sect. 2, we present the basic concepts of multiple aspect trajectories and related works. In Sect. 3 we describe the proposed approach to cluster MAT by decision trees. In Sect. 4 we discuss the experimental results and evaluation of the proposed method. Finally, the conclusion and further research directions of our work are presented in Sect. 5.

2 Basic Concepts and Related Works

In this section we present the basic concepts to guide the reader throughout this paper and a brief review of MAT clustering approaches.

2.1 Multiple Aspect Trajectory

Multiple aspect trajectories are defined by their three-dimensional nature, i.e., sequences of points composed of space and time, in addition to the semantic dimension [13]. The semantic dimension represents any context information or relevant meaning that is of fundamental importance for understanding the data obtained in a trajectory [21]. The first approach that brought semantic enrichment to trajectory data was stops and moves [1]. Stops are groups of sample points close in space and time that reflect interesting places known as Points of Interest (POIs). Every stop has a beginning and ending time, a spatial position, and a minimum duration. Moves are made of the sample locations between stops, and they can also occur at the beginning or end of a trajectory.

More recently, the semantic dimension started to represent the vast set of characteristics that each point of a trajectory can present, each of which is called an aspect, thus bringing the idea of multiple aspect trajectories [13]. An aspect can be described as a real-world fact relevant to the analysis of moving object data. Figure 1 illustrates several points on a trajectory that can contain many aspects. It shows the trajectory of a given person, where the POIs are represented as circles and different data are collected, such as: weather information; heart rate and emotional status while working at the office; the ticket price, the genre of the film, and its rating in a cinema session; and a restaurant with aspects representing its reviews, opening hours, price range, and restaurant type. Thus, multiple aspect trajectories can present numerous aspects that enrich the semantic dimension.

Fig. 1. Example of a Multiple Aspect Trajectory.

2.2 Related Works

Many works focus on raw trajectory analysis and clustering, owing to the high availability of GPS tracking devices, which enable the tracking of moving objects such as vehicles, planes, and humans. Clustering applied to MATs may additionally extract useful patterns or detect interesting outliers. Nevertheless, the majority of trajectory clustering works in the literature [20, 25, 26, 28, 30, 32] take into account only the spatial or spatio-temporal dimensions, employing classical clustering approaches (e.g., k-means [2] and DBSCAN [22]) with measures adapted to these dimensions.

Hung et al. [7] proposed a framework that explores the spatial dimension by clustering and aggregating clues of trajectories to find routes. Nanni and Pedreschi [16] proposed a modified density-based clustering algorithm that explores the temporal dimension to improve the quality of trajectory clustering. Wang et al. [26] proposed a trajectory clustering method based on HDBSCAN, which adaptively clusters trajectories by their shape characteristics, using the Hausdorff distance on the spatial dimension to compute similarities. Chen et al. [2] used the DBSCAN method for clustering trajectories. Their method first divides the trajectories into a set of subtrajectories considering the spatial dimension, then computes the similarity between trajectories using the Hausdorff distance, and finally applies the DBSCAN algorithm for clustering. Sun and Wang [22] combined the DBSCAN method with Minimum Bounding Rectangle and buffer similarity over ship trajectories, using the spatial dimension to improve trajectory clustering and reduce computation time.

Yuan et al. [31] proposed a density-based clustering algorithm that employs an index tree for spatio-temporal trajectories. It works by partitioning the trajectories into trajectory segments, storing them in the index tree, and then clustering the segments with a density-based strategy. Yao et al. [30] proposed an RNN-based auto-encoder model that encodes the spatio-temporal movement pattern of a trajectory, improving similarity computation by learning a low-dimensional representation, and then applied the classic k-means algorithm. Liu et al. [11] proposed a new clustering method that extends the k-means algorithm with a time layer to take into account both spatial and temporal properties when clustering flight trajectories.

Liu and Guo [10] proposed a semantic trajectory clustering method that captures global relationships among trajectories based on community detection from a network perspective. Xuhao et al. [28] proposed a semantic-based trajectory clustering method for arrival aircraft based on k-means and DBSCAN. Santos et al. [20] proposed a co-clustering method for mining semantic trajectories that clusters trajectories without testing all their attributes, focusing instead on the most frequent ones.

Unlike the works presented above, Varlamis et al. [25] proposed a MAT similarity measure with hierarchical clustering, including a multi-vector representation of MATs that enables cluster analysis on them. The vector representation embeds all trajectory dimensions (space, time, and semantics) into a low-dimensional vector, which allows clustering multiple aspect trajectories. The authors compared their method with other MAT similarity measures using traditional clustering methods, and their results outperformed the baseline methods. Note that this kind of comparison was adopted because of the gaps in the recent multiple aspect trajectory clustering task. Thus, there is a lack of methods enabling trajectory cluster analysis that uses the semantic dimension, with its multiple aspects, and the spatio-temporal dimension simultaneously. We aim to help reduce this gap in the literature by proposing a novel MAT clustering method.

3 The Multiple Aspect Trajectory Tree Approach

In this section we present a new method named MAT-Tree (Multiple Aspect Trajectory Tree) for finding clusters in a multiple aspect trajectory dataset by using a hierarchical strategy. The main idea is that MAT-Tree groups similar trajectories in the same cluster based on aspects that occur frequently. MAT-Tree aims to identify the most relevant aspects while clustering the trajectories.

Figure 2 illustrates an example of a tree generated by the clustering algorithm, using a Sankey diagram representation. The vertical bars indicate the clusters generated by dividing on the chosen aspect; the leftmost bar represents the complete dataset with all trajectories, and the rightmost bars represent the leaf nodes. Figure 2 illustrates tree levels with the aspect RAIN as the most relevant one to start the splitting, where six other aspects contributed to identifying four clusters (leaf nodes). This result can help analysts identify movement patterns. For instance, on rainy days more people go to indoor events (e.g., movie theaters and comedies) using public transportation than to night parties with stops at restaurants.

Fig. 2. Example of multiple aspect trajectory clustering tree.

The detailed description of MAT-Tree is given in Algorithm 1. The main parts and characteristics of the algorithm are: (i) the construction of the frequency matrix; (ii) the criterion for splitting the dataset at each node iteration; (iii) the criterion for evaluating the quality of each tested split; and (iv) the stop criterion. MAT-Tree receives as input the multiple aspect trajectories MAT of dataset D, the statistical metric used for dividing the trajectories stat_met, the criterion for choosing the aspect asp_criterion, the maximum height desired for the tree max_tl, and the minimum number of trajectories for a leaf node min_t.

Algorithm 1. MAT-Tree

The first node of the tree is the root, and from it MAT-Tree executes the whole process recursively. MAT-Tree starts by creating the tree's node structure (line 2), which stores the trajectories received as a parameter of the function (line 3). Next, both child nodes, called left_node (line 4) and right_node (line 5), are initialized empty; they may later contain the trajectories resulting from the division process. MAT-Tree creates the clustering tree by exploring frequency matrices of the occurrences of the multiple aspects in each trajectory. That is, for each new node generated in the tree, the occurrence of each aspect in the node's trajectories is used to build a frequency matrix for this node (line 6). Table 1 depicts an example of a frequency matrix showing the number of occurrences of four aspects in three MATs.

Table 1. Frequency matrix of three trajectories with four aspects.
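To make the frequency matrix concrete, the following is a minimal illustrative sketch rather than the reference implementation of Algorithm 1. It assumes a hypothetical input layout in which each trajectory is a list of points and each point is a dictionary mapping an aspect name to its value; the function name build_frequency_matrix and the "name=value" column labels are likewise assumptions made for this example.

```python
from collections import Counter

def build_frequency_matrix(trajectories):
    """Count how often each aspect value occurs in each trajectory.

    `trajectories` maps a trajectory id to a list of points; each point is a
    dict of aspect name -> value (hypothetical layout for this sketch).
    Returns (aspects, matrix) with matrix[trajectory_id][aspect] = count.
    """
    counts = {
        tid: Counter(f"{name}={value}" for point in points for name, value in point.items())
        for tid, points in trajectories.items()
    }
    # One column per distinct aspect value seen anywhere in the node.
    aspects = sorted({aspect for c in counts.values() for aspect in c})
    matrix = {tid: {a: c.get(a, 0) for a in aspects} for tid, c in counts.items()}
    return aspects, matrix

# Toy data (invented for illustration only).
toy = {
    "t1": [{"weather": "Rain", "root_type": "Food"},
           {"weather": "Rain", "root_type": "Event"}],
    "t2": [{"weather": "Clear", "root_type": "Food"}],
}
aspects, matrix = build_frequency_matrix(toy)
for tid, row in matrix.items():
    print(tid, row)
```

Each row corresponds to one trajectory and each column to one aspect value, so the matrix can be rebuilt for any subset of trajectories held by a node.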

We start by selecting a split criterion that is used to identify which aspect is better for partitioning the set of trajectories and how such partitioning will occur. Regarding this partitioning task, MAT-Tree uses a statistical metric stat_met, e.g. average, to identify the split point. In this scenario, MAT-Tree computes the average occurrence of each aspect in the set of trajectories (lines 8–11). After an aspect has been chosen for the division (line 24), the trajectories that present a frequency higher than the stat_met in the aspect are grouped together, i.e. node.right_node (lines 31–35), while the trajectories that present a frequency lower than the stat_met are grouped in another cluster, node.left_node (lines 26–30). It is noteworthy that other statistical metrics, such as the median instead of the average, can be used in this step.
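The partitioning step described above can be sketched as follows. This is a minimal example under stated assumptions: the function name split_by_aspect and the dictionary-based matrix layout are hypothetical, and trajectories whose frequency equals the threshold are sent to the left node, a tie-breaking rule the text does not specify.

```python
from statistics import mean, median

def split_by_aspect(matrix, aspect, stat_met="mean"):
    """Split trajectory ids by comparing one aspect's frequency to a threshold.

    `matrix[tid][aspect]` is the frequency of `aspect` in trajectory `tid`.
    Ties (frequency == threshold) go left; this is an assumption of the sketch.
    """
    stat = {"mean": mean, "median": median}[stat_met]
    threshold = stat([row[aspect] for row in matrix.values()])
    left = [tid for tid, row in matrix.items() if row[aspect] <= threshold]
    right = [tid for tid, row in matrix.items() if row[aspect] > threshold]
    return threshold, left, right

# Toy frequency matrix (invented for illustration only).
matrix = {
    "t1": {"Rain": 4, "Food": 1},
    "t2": {"Rain": 0, "Food": 3},
    "t3": {"Rain": 2, "Food": 2},
}
threshold, left, right = split_by_aspect(matrix, "Rain", stat_met="mean")
print(threshold, left, right)
```

Switching stat_met to "median" changes only the threshold, which can matter when a few trajectories have very high frequencies for an aspect.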

We propose four evaluation criteria (\(asp\_crit\)) to select the best aspect, namely: (i) binary division (BD), (ii) minimum variance (MV), (iii) maximum variance reduction (MVR), and (iv) maximum variance reduction considering the largest average reduction over the whole set of aspects (LRA). The idea is to provide different approaches that better fit the data distribution and the application at hand. MAT-Tree therefore tests the trajectory partition for all aspects according to the chosen evaluation criterion to identify which aspect is the best choice (line 24). We detail each aspect evaluation criterion in the following.

BD analyzes the absolute difference in the number of trajectories between the two subsets formed after partitioning. The aspect that generates the smallest difference is selected as the best. MV evaluates the mean variance of an aspect over both trajectory partitions, selecting the aspect with the minimum mean variance. MVR also considers the mean variance of the aspect over both trajectory partitions. However, it evaluates the variance reduction with respect to the previous value of the aspect, that is, its variance in the parent node. The aspect selected by MVR is the one that generates the maximum reduction compared to the parent node. Finally, LRA is similar to MVR, but instead of considering the maximum reduction of each aspect individually, LRA considers all aspects to compute the average reduction. Thus, LRA selects the aspect whose split generates the largest average variance reduction over the whole set of aspects.
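The four criteria can be sketched in a few lines. This is one possible reading of the descriptions above, not the reference implementation: it assumes the population variance, a mean threshold with ties sent to the left partition, and that splits leaving one side empty are skipped. All function names and the toy matrix are hypothetical.

```python
from statistics import mean, pvariance

def split(matrix, aspect):
    """Split ids at the mean frequency of `aspect` (ties go left, an assumption)."""
    threshold = mean([row[aspect] for row in matrix.values()])
    left = [t for t, row in matrix.items() if row[aspect] <= threshold]
    right = [t for t, row in matrix.items() if row[aspect] > threshold]
    return left, right

def variance(matrix, ids, aspect):
    return pvariance([matrix[t][aspect] for t in ids])

def reduction(matrix, left, right, aspect):
    """Parent variance of `aspect` minus its mean variance in the two children."""
    parent = variance(matrix, list(matrix), aspect)
    children = mean([variance(matrix, left, aspect), variance(matrix, right, aspect)])
    return parent - children

def score(matrix, aspect, criterion):
    """Return a cost where lower is better; None for a degenerate split."""
    left, right = split(matrix, aspect)
    if not left or not right:
        return None
    if criterion == "BD":   # most balanced partition sizes
        return abs(len(left) - len(right))
    if criterion == "MV":   # minimum mean variance of the split aspect
        return mean([variance(matrix, left, aspect), variance(matrix, right, aspect)])
    if criterion == "MVR":  # maximum variance reduction of the split aspect
        return -reduction(matrix, left, right, aspect)
    if criterion == "LRA":  # largest reduction averaged over all aspects
        return -mean([reduction(matrix, left, right, a) for a in matrix[left[0]]])
    raise ValueError(f"unknown criterion: {criterion}")

def best_aspect(matrix, criterion):
    aspects = list(next(iter(matrix.values())))
    scored = [(score(matrix, a, criterion), a) for a in aspects]
    scored = [(s, a) for s, a in scored if s is not None]
    return min(scored)[1] if scored else None

# Toy frequency matrix (invented for illustration only).
matrix = {
    "t1": {"Rain": 4, "Food": 1, "Event": 0},
    "t2": {"Rain": 0, "Food": 3, "Event": 0},
    "t3": {"Rain": 3, "Food": 2, "Event": 0},
    "t4": {"Rain": 1, "Food": 2, "Event": 5},
}
for criterion in ("BD", "MV", "MVR", "LRA"):
    print(criterion, best_aspect(matrix, criterion))
```

On this toy matrix, BD prefers the aspect that yields two equally sized partitions, whereas the variance-based criteria prefer the rare aspect that isolates a single atypical trajectory.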

We highlight that the variance represents the dispersion of the distribution of an aspect over the trajectories, that is, how far the trajectories are from the mean. By favoring a low variance, we obtain trajectory partitions that are more homogeneous regarding the frequency of the aspects. The proposed hierarchical tree algorithm identifies more detailed clusters as the tree depth grows, i.e., the last levels of the tree characterize clusters with more aspects on their paths. It should be noted that, by design, neighboring nodes will have more similar trajectories, while distant nodes will have less similar trajectories.

Finally, MAT-Tree stops and returns the generated tree (line 36) when the minimum number of trajectories, the maximum tree level, or both conditions are satisfied. These hyperparameters may vary, and they depend on the application domain and the data distribution at hand.
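The recursion and its stop criteria can be sketched as follows. This is a simplified illustration under several assumptions that the text does not fix: a node becomes a leaf when it holds min_t trajectories or fewer, when it reaches level max_tl, or when no aspect yields a non-empty split; only the BD criterion with a mean threshold is shown; and the dictionary-based node layout and function names are hypothetical.

```python
from statistics import mean

def split(matrix, ids, aspect):
    """Split `ids` at the mean frequency of `aspect` (ties go left, an assumption)."""
    threshold = mean([matrix[t][aspect] for t in ids])
    left = [t for t in ids if matrix[t][aspect] <= threshold]
    right = [t for t in ids if matrix[t][aspect] > threshold]
    return left, right

def choose_bd(matrix, ids, aspects):
    """BD criterion: the aspect giving the most balanced non-degenerate split."""
    best = None
    for aspect in aspects:
        left, right = split(matrix, ids, aspect)
        if left and right:
            diff = abs(len(left) - len(right))
            if best is None or diff < best[0]:
                best = (diff, aspect)
    return best[1] if best else None

def build_tree(matrix, ids, aspects, max_tl, min_t, level=0):
    node = {"ids": ids, "aspect": None, "left": None, "right": None}
    # Stop criteria: maximum tree level or minimum number of trajectories.
    if level >= max_tl or len(ids) <= min_t:
        return node
    aspect = choose_bd(matrix, ids, aspects)
    if aspect is None:  # no aspect separates the trajectories any further
        return node
    left, right = split(matrix, ids, aspect)
    node["aspect"] = aspect
    node["left"] = build_tree(matrix, left, aspects, max_tl, min_t, level + 1)
    node["right"] = build_tree(matrix, right, aspects, max_tl, min_t, level + 1)
    return node

def leaves(node):
    """Collect the clusters (trajectory ids) stored at the leaf nodes."""
    if node["left"] is None:
        return [node["ids"]]
    return leaves(node["left"]) + leaves(node["right"])

# Toy frequency matrix (invented for illustration only).
matrix = {
    "t1": {"Rain": 4, "Food": 1, "Event": 0},
    "t2": {"Rain": 0, "Food": 3, "Event": 0},
    "t3": {"Rain": 3, "Food": 2, "Event": 0},
    "t4": {"Rain": 1, "Food": 2, "Event": 5},
}
tree = build_tree(matrix, list(matrix), ["Rain", "Food", "Event"], max_tl=1, min_t=1)
print(tree["aspect"], leaves(tree))
```

Raising max_tl or lowering min_t produces deeper trees with smaller, more specific leaf clusters, which mirrors the trade-off between general and detailed clusters described for the hierarchy.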

4 Experimental Evaluation

We carried out the experiments using the Foursquare NY dataset [17] to evaluate the performance of different approaches for clustering the multiple aspect trajectories of several users. Regarding the main hyperparameters, we use the average occurrence of the aspect frequency as the split-point strategy, while the stop criterion on the minimum number of trajectories was set to 25. This value is application-dependent and needs to be tuned empirically. For a fair comparison with the baseline method, we employed its default hyperparameters and set the number of clusters equal to the number of different users (i.e., 193 users). Moreover, we ran every clustering algorithm 10 times and report the average of the internal and external evaluation scores. We performed the experiments on a machine with an Intel i7-7700 3.6 GHz processor, 16 GB of memory, and 64-bit Windows 10.

Petry et al. [17] enriched the original Foursquare dataset [29] with semantic information to evaluate multiple aspect trajectory similarity measures. The dataset contains the trajectories of 193 different users, who account for a total of 66962 points in 3079 trajectories, with an average length of 22 points (check-ins) per trajectory and an average of 16 trajectories per user. The aspects contained in the dataset after the enrichment are: i) the geographic coordinates (latitude and longitude, a numerical attribute); ii) the time (numeric) at which the user checked in; iii) the day of the week (nominal); iv) the point of interest (POI), which can be understood as the name of the establishment, such as Starbucks or McDonald’s (nominal); v) the type of POI (nominal), such as coffee house or fast food; vi) the check-in category (nominal), called the root type (for example, Food, Outdoors & Recreation); vii) the rating assigned by the user in the application (ordinal); and viii) the weather condition (nominal) at the time of check-in.

Clustering evaluation is a well-known challenge in the literature because clustering is an unsupervised method and we do not usually have a ground truth to compare with. However, in the trajectory clustering application, we can assume that trajectories of the same user are likely to belong to the same cluster, as indicated by Gonzalez et al. [5] and already used in the state of the art [12]. Therefore, the external evaluation of the clustering method is based on this ground truth. For the internal clustering validity metrics, we assume that the best clusters are those that are well separated and compact, as described by Rendón et al. [19]. The external (supervised) and internal (unsupervised) clustering validity metrics [6] comprise: i) homogeneity score (external); ii) completeness score (external); iii) V-measure score (external); iv) adjusted mutual information (Mut. Info) score (external); v) adjusted Rand (Adj. Rand) score (external); vi) Fowlkes-Mallows (FM) score (external); vii) silhouette (S) score (internal); viii) Calinski-Harabasz (CH) score (internal); and ix) Davies-Bouldin (DB) index (separation) (internal).
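As an illustration of how the user-based ground truth feeds the external metrics, the sketch below computes homogeneity, completeness, and V-measure from their standard entropy-based definitions in plain Python. The function names and the toy labels are assumptions made for this example; it is not the evaluation code used in the experiments.

```python
from collections import Counter
from math import log

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log(c / n) for c in Counter(labels).values())

def conditional_entropy(target, given):
    """H(target | given), estimated from label co-occurrence counts."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum(c / n * log(c / given_counts[g]) for (g, _), c in joint.items())

def v_measure(users, clusters):
    """Return (homogeneity, completeness, V-measure) for two labelings."""
    h_users, h_clusters = entropy(users), entropy(clusters)
    homogeneity = 1.0 if h_users == 0 else 1 - conditional_entropy(users, clusters) / h_users
    completeness = 1.0 if h_clusters == 0 else 1 - conditional_entropy(clusters, users) / h_clusters
    total = homogeneity + completeness
    v = 0.0 if total == 0 else 2 * homogeneity * completeness / total
    return homogeneity, completeness, v

# Ground truth: the user who produced each trajectory (toy labels).
users = ["u1", "u1", "u2", "u2"]
print(v_measure(users, [0, 0, 1, 1]))  # one cluster per user
print(v_measure(users, [0, 1, 2, 3]))  # every trajectory in its own cluster
```

The second call shows why both scores are reported: placing every trajectory in its own cluster keeps each cluster pure with respect to users but scatters each user's trajectories, which lowers completeness and therefore the V-measure.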

For comparison purposes, we performed the experiments using the most strongly correlated aspects, following the experiments conducted by Varlamis et al. [25]. Accordingly, the aspects used are the weather, the day of the week, and the POI category. However, MAT-Tree is flexible and allows any other combination of aspects. The choice of aspects also depends on the application and on the kind of patterns the analyst wants to mine.

Considering the aspect evaluation criteria, MAT-Tree builds four trees, one for each criterion. The results show numerical similarities in terms of the number of nodes, the number of leaf nodes, and the height. On average (rounded down), the trees have 388 nodes, of which 194 are leaf nodes. In addition, the heights of the trees vary between 8 and 14, as shown in Table 2. We also noted that the structures of the trees differ from each other, which indicates that the method can identify different trajectory behaviors depending on the chosen criterion.

Table 2. The AVG number of generated groups (leaf-nodes) and the height of the modeled tree for each aspect selection criterion.

Figure 3 shows the clustering results for two different aspect selection criteria, using dendrograms to visualize them. Figure 3(a) shows the tree generated by the MV aspect selection criterion, while Fig. 3(b) shows the result using the MVR criterion. Comparing both dendrograms, we note that the clustering in Fig. 3(a) captured an atypical behavior of the trajectories, as can be seen in the extreme right branch. In this experiment, MAT-Tree selected the Event aspect of the root_type attribute as the root node and grouped all the trajectories of the users who checked in for this type of aspect. This group (C1) contains 20 users out of 193, with a total of 25 trajectories. The Event aspect occurs 26 times in the universe of 66962 check-ins, which can reveal a trajectory behavior practiced only in this group.

Fig. 3. Clustering results generated from different aspect selection criteria.

Figure 4 illustrates the exploratory frequency analysis of the aforementioned group C1. The bar graphs show the relative frequency of aspects for the root_type and day categories, respectively, that is, the number of occurrences of each aspect in this specific cluster relative to the whole set of trajectories. Fig. 4(a) shows that the relative frequency for Event was equivalent to its absolute frequency. That is, all trajectories with a check-in for the Event aspect belong to the same cluster. Regarding the days of the week, Fig. 4(b) shows that Saturday was the most frequent day in cluster C1. This behavior is expected because events usually take place on weekends. Regarding the other aspects in Fig. 4, we noted that there are few points referring to College & University, which may indicate that the majority of these users are not part of the university community. On the other hand, the users in this cluster are more related to activities involving Art & Entertainment and Nightlife, with Friday and Saturday more frequent than the other days.

Fig. 4. Exploratory frequency analysis of cluster C1.

We quantitatively evaluate the clustering results by employing internal and external validation metrics to verify the goodness of the clustering structure. Furthermore, we used MSM [4] and MUITAS [17] in MAT-Tree as similarity metrics to build the similarity matrix between trajectories. We compare MAT-Tree with TraFoS [25] because it is the state of the art for MAT clustering and it outperformed the baselines in its own experiments. Table 3 shows the clustering evaluation results, where MAT-Tree obtained promising results. For each evaluation metric, the best result is highlighted in bold and the second best is underscored. Table 3 shows that, in general, the internal and external validation scores for MAT-Tree are better than those of TraFoS, indicating that MAT-Tree identified better cluster structures. Additionally, the best approach for selecting an aspect to partition the set of trajectories was the maximum variance reduction (MVR) using MUITAS.

Table 3. Internal and External Clustering Validation results.

TraFoS tested hierarchical agglomerative clustering with single (SL) and average (AL) linkage using the binary partition. In addition, TraFoS evaluated three variations for each clustering strategy: i) the first considers the average similarity (TraFoS\(_{mean}\)), ii) the second takes the maximum similarity (TraFoS\(_{max}\)), and iii) the third sets a threshold on the average similarity (TraFoS\(_{thr}\)). Comparing the silhouette scores, TraFoS obtained −0.94 and −0.95 respectively, while MAT-Tree obtained −0.284 and −0.289 respectively. It is important to note that the negative value in all cases is due to the use of a similarity matrix instead of a distance matrix [25]. The higher scores of MAT-Tree mean that its trajectories are better clustered together, i.e., its clusters are more cohesive than those found by the previous method. Regarding external validation, our results are in general better than the baseline, denoting that the resulting clusters are more homogeneous and complete, as shown in Table 2 and Table 3. Thus, MAT-Tree obtained a result 88% better than TraFoS in terms of silhouette with MUITAS, while for V-measure MAT-Tree was five times more precise.

5 Conclusion and Future Works

The volume and variety of data in the big data era require high computational power, which evidences the need for new approaches to analyzing complex data. Multiple aspect trajectories bring many opportunities to the data mining domain, where the sequential nature, high dimensionality, heterogeneity, and volume of the data pose new challenges. Even though a number of methods have been proposed for MAT classification, only a few works focus on MAT clustering. To address this, we proposed MAT-Tree, a novel method for multiple aspect trajectory clustering based on the frequency of occurrence of the aspects in the trajectories and on a decision tree-based approach. The proposed method achieved a result 88% better than the baseline considering internal clustering evaluation metrics and was five times more precise.

The main contribution of MAT-Tree is a new approach that allows clustering trajectories considering all their dimensions and semantic aspects. It is noteworthy that studies on the semantic dimension of trajectories are recent and tailored. MAT-Tree outperformed the state of the art for multiple aspect trajectory clustering, which indicates that MAT-Tree can identify more cohesive, compact, and connected clusters. Furthermore, MAT-Tree offers different options for clustering and visualization, providing a flexible tool for exploratory data analysis and for applications that must adapt to the task at hand. Since clustering is a data mining task that is inherently application-dependent and exploratory [23, chapter 1], the flexibility of MAT-Tree is an important characteristic.

As future work, it would be interesting to examine other aspect selection strategies that allow MAT-Tree to adapt automatically to different applications. The investigation of different aspect splitting strategies and evaluation criteria is a promising direction because each criterion proves suitable for different applications and analyses. We noted that the BD evaluation criterion is well suited for generating clusters of almost the same size, whereas the MV criterion seems to easily find outlier trajectories (i.e., trajectories presenting aspects that rarely appear in other trajectories). Therefore, experiments designed specifically to validate these characteristics and uses of the criteria are desirable.