1 Introduction

Self-Supervised Learning (SSL) has attracted the community’s attention as a powerful tool to learn valuable representations from large amounts of unlabeled data [2, 18]. Essentially, SSL approaches conceive a pretext task in which an encoder, coupled with a projection head, is encouraged to learn efficient data representations. The pretext labels are derived directly from the (originally unlabeled) input samples themselves. After training, the encoder is detached from the projection head, and its knowledge is transferred to the downstream task, where it is coupled with a prediction head to solve the desired task. This is especially useful for learning data representations that can be exploited in multiple tasks, while also saving annotation effort, which is time-consuming and expensive [4].

Applying SSL to human activity recognition (HAR) data is a growing trend [13], due to the limited size of labeled datasets, the high data requirements of models, and the possibility of collecting data on a large scale [14]. This scenario motivated this study to evaluate Temporal Neighborhood Coding (TNC) [10], a technique focused on time series data, for the HAR task of predicting which activity is being performed based on inertial sensor data. SSL for time series can be divided into three main categories [6, 16]: generative-based, contrastive-based, and adversarial-based methods.

Contrastive-based methods learn representations by comparing positive and negative pairs of samples. According to Zhang et al. [16], contrastive methods can be classified into five subcategories: prediction contrast, augmentation contrast, prototype contrast, expert knowledge contrast, and sampling contrast. In prediction contrast, the pretext task is to predict the representations of future segments/windows from previous excerpts of the time series, whereas in augmentation contrast, the representations of augmented versions of a time window are forced to be similar to each other but different from those of distant time windows. Prototype contrast clusters the learned representations. Expert knowledge contrast incorporates prior expert knowledge to choose appropriate positive or negative samples during training. Sampling contrast compares time series windows and their generated augmentations to assess similarity.

TNC [10] is an SSL technique based on the contrastive learning paradigm. It aims to learn representations by comparing pairs of similar and dissimilar samples. Unlike traditional contrastive methods that use augmentations to generate positive and negative pairs, TNC defines these pairs based on the concept of temporal neighborhoods. The original version uses the Augmented Dickey-Fuller (ADF) statistical test [7] to identify the stationarity of a signal and thereby define positive and negative pairs. The technique was originally tested with a bidirectional single-layer Recurrent Neural Network (RNN) encoder.

After the original work of Tonekaboni et al. [10], modified versions of TNC have been proposed, such as TNC-sim [11], which replaced the ADF statistical test with the cosine similarity metric. Additionally, Retrieval-Based Reconstruction (REBAR) [12] uses TNC with a different encoder, exploring the dilated convolutions proposed in TS2Vec [15] instead of an RNN. Available research works implement some of these alternatives in specific contexts, but do not establish a detailed comparison between them on the same dataset.

In this study, we assess the performance of various TNC variations. The main contributions of this work are:

  1.

    We evaluated the different TNC variations under identical conditions, using the same code base and the UCI dataset with raw data, and showed that the TS2Vec encoder significantly outperforms the RNN encoder for this task, achieving accuracies that are 15 to 17 percentage points higher. We also show that replacing the ADF statistical test with cosine similarity has little impact on model accuracy, while reducing training time by a factor of \(7 \times \) to \(9 \times \).

  2.

    We also evaluated the impact of applying a TNC encoder to both raw data and data represented by handcrafted features (produced and provided by the authors of the UCI dataset [8]). Our findings indicate that learning from handcrafted features is easier. Still, the more advanced versions of TNC can effectively learn robust features from the raw data, achieving performance comparable to that of models trained on handcrafted features.

The rest of this paper is organized as follows: Sect. 2 presents the related works. Section 3 provides an overview of the TNC technique and its variants. Section 4 describes the materials and methods and discusses the experimental results. Finally, Sect. 5 presents the main conclusions and possible future work.

2 Related Works

This section describes the related works, first defining the datasets used in this paper, followed by a comparison of works that evaluated TNC performance under different circumstances, such as datasets, encoders, and similarity functions.

Before discussing the TNC-related works, it is important to distinguish two variants of the UCI dataset for HAR, a dataset that is frequently used to evaluate Machine Learning (ML) models on HAR tasks. The UCI dataset, provided by the University of California, Irvine (UCI) [8], contains data from smartphone accelerometers and gyroscopes, annotated with human activity labels (e.g., walking, laying). The first version is a preprocessed dataset in which the authors extracted 561 handcrafted features from the raw accelerometer and gyroscope data. We refer to this version of the UCI dataset as UCI-FE (Feature Engineered). Later, the authors released an updated version containing the raw signals from the smartphone’s accelerometer and gyroscope. We refer to this version as UCI-Raw.

The Temporal Neighborhood Coding (TNC) technique was first introduced by Tonekaboni et al. [10] in 2021. Their work employed a bidirectional single-layer RNN as the encoder and the ADF test to determine the interval around a time window in which the signal can be considered approximately stationary. They evaluated the technique on the UCI-FE dataset and reported an accuracy of 88.3%.

In 2023, Wang et al. [11] explored a variation of TNC that uses the cosine similarity between time windows, instead of the ADF test, to establish the range of a temporal neighborhood, selecting the most similar windows as neighbors. They used the same RNN encoder as Tonekaboni et al. [10] and compared the proposed method with the ADF-based one on a sleep stage classification task. Their results suggest that TNC-sim performs 2.81% better than TNC-adf in the time domain with three classes and 2.51% better with five classes.

Also in 2023, Xu et al. [12] evaluated a variation of the TNC technique employing a dilated convolution encoder on the UCI-Raw dataset and verified large performance gains compared to the default RNN, reaching an accuracy of 94.3% on UCI-Raw. This is significant because the dataset in this case does not contain handcrafted features, so the encoder is learning to extract knowledge better than feature engineering. The chosen encoder structure was the one employed by Yue et al. [15], who introduced the TS2Vec technique. Yue et al. [15] report significant performance improvements for TNC using their dilated convolution encoder, attributing this to an architecture that adapts to different dataset scales thanks to the receptive fields of dilated convolutions.

Other works have also compared the performance of RNNs against dilated convolutions. Franceschi et al. [3] use deep neural networks with exponentially dilated causal convolutions, which capture long-range dependencies better than full convolutions, together with a triplet loss employing time-based negative sampling on variable-length multivariate time series, presenting their encoder as more efficient and scalable than an RNN. Bai et al. [1] also support this assumption, presenting results in which dilated convolutions outperformed RNNs in terms of efficiency and predictive performance.

The aforementioned works evaluated TNC with different encoders (RNN and TS2Vec) and test functions (ADF and cosine similarity). However, they evaluate TNC under different, specific circumstances (ML task, dataset, etc.). Moreover, although some comparisons between TS2Vec and RNN are provided, these works do not carry out experiments focused on analyzing the impact of these alternatives in a clear and standardized manner. Hence, in this work, we evaluate all these combinations with the same code base and in the same setup, i.e., human activity recognition using the UCI-Raw dataset.

Table 1 summarizes the TNC-based works presented in this section according to the dataset, encoder and test functions evaluated in each work.

Table 1. TNC related works

3 TNC Technique and Its Variants

The TNC technique is based on the principle that neighboring windows in a time series are likely to belong to the same class and, therefore, should have similar representations. Conversely, distant (non-neighboring) windows are more likely to belong to different classes and should have distinct representations. To achieve this, the TNC technique trains an encoder (Enc) that encodes the time series windows and a discriminator (D) that determines whether a pair of windows are neighbors (i.e., close to each other).

Figure 1 illustrates the TNC technique. First, given a randomly selected query window (\(W_q\)), a window selector \(W_s\) selects two extra windows: a close one (\(W_c\)) and a distant one (\(W_d\)). Then, \(W_q\), \(W_c\), and \(W_d\) are encoded by Enc into \(Z_q\), \(Z_c\), and \(Z_d\), respectively. Finally, the discriminator D determines whether the pairs (\(Z_q\), \(Z_c\)) and (\(Z_q\), \(Z_d\)) are neighbors (close) or not.

Fig. 1.

Overview of the TNC technique. The window selector \(W_s\) selects a neighbor window \(W_c\) and a non-neighbor window \(W_d\) for each query window \(W_q\). The encoder Enc learns the data representation and feeds the samples \(Z_c\) and \(Z_d\) into a discriminator D that predicts the probability of each window being a neighbor of \(W_q\).

The window selector \(W_s\) tests whether the selected windows (i.e., \(W_c\) and \(W_d\)) are indeed similar to or different from \(W_q\). For \(W_c\), the test can be either the ADF statistical test, which determines the Gaussian distribution parameters and enables the selection of windows that follow that distribution, or the selection of the window with the highest cosine similarity to \(W_q\). For \(W_d\), windows are sampled randomly, ensuring a sufficient distance from \(W_q\) in the time series to be considered non-neighbors. The work of Wang et al. [11] presented cosine similarity as a possibility for constructing non-neighborhoods as well, but in our case it was only considered when forming \(W_c\).
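The cosine-similarity variant of the window selector can be sketched in a few lines of numpy. The candidate-window construction, search span, and function names below are our own simplifying assumptions (a univariate series and uniformly sampled candidates around the query), not the implementation of Wang et al. [11].

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two flattened windows."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def select_neighbor(series, q_start, delta, span=5, n_candidates=10, seed=0):
    """Pick W_c: among candidate windows near W_q, return the most similar one."""
    w_q = series[q_start:q_start + delta]
    rng = np.random.default_rng(seed)
    lo = max(0, q_start - span * delta)
    hi = min(len(series) - delta, q_start + span * delta)
    starts = rng.integers(lo, hi, size=n_candidates)
    sims = [cosine_similarity(w_q, series[s:s + delta]) for s in starts]
    best = starts[int(np.argmax(sims))]
    return series[best:best + delta]

t = np.linspace(0, 20, 2000)
series = np.sin(2 * np.pi * t)          # toy periodic signal
w_c = select_neighbor(series, q_start=500, delta=128)
print(w_c.shape)  # → (128,)
```

Compared to running an ADF test per candidate, this selection only requires dot products, which is consistent with the training-time savings reported later in this paper.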

The encoder (Enc) can be either a bidirectional RNN with one layer and 100 hidden units, as presented by Tonekaboni et al. [10], or the TS2Vec dilated convolution encoder with ten residual blocks, each containing two one-dimensional convolutional layers with a dilation parameter, as in the works of Xu et al. [12] and Yue et al. [15]. The discriminator D is a binary classifier that receives the encoded versions of a pair of windows (e.g., \(Z_q\) and \(Z_c\)) and predicts whether they are close or not. It is composed of a fully connected layer followed by a ReLU activation, a dropout layer, and another fully connected layer; the TS2Vec discriminator includes an additional max pooling operation. Losses are calculated for close (pair \(W_q\) and \(W_c\)) and distant pairs (pair \(W_q\) and \(W_d\)), including a weighting term W inspired by Positive-Unlabeled (PU) learning: each neighbor sample is treated as a positive example, while each non-neighbor sample is treated as a mixture of a positive example with weight W and a negative example with the complementary weight. In short, W represents the probability of an unlabeled sample being a positive (neighbor) sample and can be determined by prior knowledge or learned via a hyperparameter search. The combined loss is minimized to train both the discriminator and the encoder, ensuring the model differentiates between temporally close and distant samples. After training the encoder in a self-supervised manner, its weights are stored and can be used for the downstream task.
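The PU-weighted objective described above can be sketched as follows, given the discriminator outputs for close and distant pairs. This is a numpy illustration of the loss structure under our reading of the technique (function and variable names are ours), not the authors' training code.

```python
import numpy as np

def tnc_loss(p_close: np.ndarray, p_distant: np.ndarray, w: float = 0.05) -> float:
    """PU-weighted TNC objective.

    p_close:   D(Z_q, Z_c) for neighboring pairs, treated as positives.
    p_distant: D(Z_q, Z_d) for distant pairs, treated as a mixture:
               positive with weight w, negative with weight (1 - w).
    """
    eps = 1e-12
    loss_close = -np.mean(np.log(p_close + eps))
    loss_distant = -np.mean(w * np.log(p_distant + eps)
                            + (1 - w) * np.log(1 - p_distant + eps))
    return float(loss_close + loss_distant)

# A discriminator that is confident neighbors are close and distant pairs are
# not yields a lower loss than an uninformative one:
good = tnc_loss(np.array([0.9, 0.95]), np.array([0.1, 0.05]))
bad = tnc_loss(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
print(good < bad)  # → True
```

Minimizing this quantity jointly updates the encoder and the discriminator, which is the mechanism that pushes temporally close windows toward similar representations.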

In this work, we evaluate different TNC variants, i.e., using the RNN and TS2Vec encoders and the ADF and cosine similarity test functions in the window selector. To distinguish between the four variants, we employ the following acronyms: TNC-RNN-adf, TNC-RNN-sim, TNC-TS2Vec-adf, and TNC-TS2Vec-sim. This is the first work that evaluates all these variations of TNC using the same dataset and code base, allowing a fair comparison between the different configurations. We also experimented with two values of W, which provides an exploration of different TNC configurations with respect to the encoder, the window selector method, and the weighting term.

4 Experimental Results and Discussion

This section presents our methodology and the experimental results. Initially, Sect. 4.1 describes the materials and methods used to evaluate the TNC technique. Then, Sect. 4.2 presents quantitative analysis, discussing the performance of each variant on the UCI-Raw dataset. Finally, Sect. 4.3 provides a qualitative analysis of the learned representations.

4.1 Materials and Methods

The analysis is divided into two parts. First, we replicate two of the previous works: the work of Tonekaboni et al. [10], following the code made available on GitHub (Footnote 1), in which we evaluate the TNC-RNN-adf variant using the UCI-FE dataset; and the work of Xu et al. [12], in which we evaluate the TNC-TS2Vec-sim variant using the UCI-Raw dataset, following the exact architecture proposed by the authors. This allows us to validate our code base, ensuring we execute the variants with models and parameters compatible with the related works. We also want to assess the performance impact the two variants of the UCI dataset (i.e., UCI-FE and UCI-Raw) have on TNC-RNN-adf, which has not yet been reported in the literature.

After replicating the original results, we proceeded to the performance analysis of the four TNC variants on the UCI-Raw dataset. We decided to focus on UCI-Raw because most publicly available HAR datasets are distributed in this format and the majority of previous work on representation learning focuses on learning representations from raw data. The code used in this analysis was based on the code made available by Xu et al. [12] on GitHub (Footnote 2). Notice that both versions come from the same UCI data collection, but Xu et al. [12] (UCI-Raw) and Tonekaboni et al. [10] (UCI-FE) use different pre-processing steps and dataset partitions, which might introduce bias into the results.

The original UCI dataset contains six classes: Walking, Walking Upstairs, Walking Downstairs, Sitting, Standing, and Laying. It was later extended to include postural transitions (HAPT [9]); in this work we employ only the original six classes. The UCI-Raw dataset consists of records with 6 channels (three accelerometer axes and three gyroscope axes) sampled at 50 Hz, concatenated per user and segmented into non-overlapping windows of 2.56 s, corresponding to 128 time points, following the data processing steps of [12].
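The segmentation step can be sketched as follows; the function below is a minimal numpy illustration of the non-overlapping windowing described above, not the exact pre-processing script of [12].

```python
import numpy as np

def segment(signal: np.ndarray, window: int = 128) -> np.ndarray:
    """Split a (T, C) recording into non-overlapping (window, C) segments.

    128 samples at 50 Hz correspond to the 2.56 s windows used in UCI-Raw;
    any incomplete tail shorter than one window is dropped.
    """
    n = signal.shape[0] // window
    return signal[:n * window].reshape(n, window, signal.shape[1])

# 6 channels: 3-axis accelerometer + 3-axis gyroscope (toy recording of zeros)
recording = np.zeros((1000, 6))
windows = segment(recording)
print(windows.shape)  # → (7, 128, 6)
```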

The selection of the window size \(\delta \) depends on prior knowledge of the signals; it should be chosen so that a window contains just enough information about the current state of the signal. The 4,600 windows of the dataset were split into 70% for training, 15% for validation, and 15% for testing, using the same partitions in all experiments. We ensured that all windows from the same user always fall into the same subset. The weight parameter W was chosen based on previous work, being set to 0.05 [10] and 0.2 [12]. The encoding size was kept constant at 320, with a batch size of 16, a learning rate of 0.00001, and 100 training epochs.
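Keeping all of a user's windows in the same subset amounts to splitting at the user level rather than the window level. A minimal sketch of such a group-aware split is shown below; the helper name and the strategy of partitioning whole users (so the 70/15/15 window proportions hold only approximately) are our own assumptions about how this can be done, not our exact partitioning code.

```python
import numpy as np

def split_by_user(user_ids: np.ndarray, fracs=(0.7, 0.15, 0.15), seed=0):
    """Assign whole users to train/val/test so no user's windows are split."""
    users = np.unique(user_ids)
    rng = np.random.default_rng(seed)
    rng.shuffle(users)
    c1 = int(fracs[0] * len(users))
    c2 = int((fracs[0] + fracs[1]) * len(users))
    parts = {"train": users[:c1], "val": users[c1:c2], "test": users[c2:]}
    # Map each subset's users back to window indices
    return {k: np.flatnonzero(np.isin(user_ids, v)) for k, v in parts.items()}

user_ids = np.repeat(np.arange(10), 20)   # 10 hypothetical users, 20 windows each
splits = split_by_user(user_ids)
# No user leaks between train and test:
overlap = set(user_ids[splits["train"]]) & set(user_ids[splits["test"]])
print(len(overlap))  # → 0
```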

To evaluate the discriminative performance of each method on the downstream task, we used the linear readout protocol, applying logistic regression on top of the frozen representations learned by the encoder. This assesses the capability of the features learned by the encoder to distinguish between the classes, measured by accuracy, area under the precision-recall curve (AUPRC), balanced accuracy, and \(F_1\)-score. Training time was measured on a machine with an Intel(R) Core(TM) i5-8500 CPU @ 3.00 GHz, 16 GB of RAM, and an NVIDIA GTX 1080 GPU with 8 GB of memory.
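The linear readout protocol can be sketched with scikit-learn as below. The representations here are random placeholders standing in for frozen encoder outputs (the 320-dimensional encoding size and six classes match our setup, but the data is synthetic), so the reported scores sit at chance level.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical frozen encoder outputs: in the real pipeline these come from
# the trained TNC encoder; here they are random stand-ins.
rng = np.random.default_rng(0)
z_train, y_train = rng.standard_normal((200, 320)), rng.integers(0, 6, 200)
z_test, y_test = rng.standard_normal((50, 320)), rng.integers(0, 6, 50)

# Linear readout: the encoder stays frozen, only this classifier is fitted.
clf = LogisticRegression(max_iter=1000).fit(z_train, y_train)
y_pred = clf.predict(z_test)
acc = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred, average="macro")
print(f"accuracy={acc:.3f}  macro-F1={f1:.3f}")
```

Because the classifier is linear, high readout scores can only come from the encoder already having made the classes linearly separable.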

We also compared the TNC variations qualitatively by visualizing the generated representations in the latent space on a 2D chart with the aid of t-SNE [5]. The goal was to verify whether the dataset samples are clustered according to their classes.
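Projecting the latent space to 2D for this kind of inspection can be done with scikit-learn's t-SNE, as sketched below on placeholder representations (random stand-ins for encoder outputs; the perplexity value is an illustrative choice).

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
representations = rng.standard_normal((100, 320))  # hypothetical encoder outputs

# Reduce the 320-dimensional latent vectors to 2D for plotting
embedding = TSNE(n_components=2, perplexity=30,
                 random_state=0).fit_transform(representations)
print(embedding.shape)  # → (100, 2)
```

The resulting 2D points can then be scatter-plotted and colored by activity label to check whether class-wise clusters emerge.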

4.2 Quantitative Analysis

Table 2 presents the results reported by Tonekaboni et al. [10] and Xu et al. [12], along with the results produced with our code (labeled “repro”) using the same parameters they used. Notice that the accuracy and AUPRC results are very similar to those reported by Tonekaboni et al. and Xu et al.

Table 2. Results in percentage for replication of original works

Table 3 shows the results for the four TNC variants with the parameter W equal to 0.05 and 0.20. The first four rows contain the variants with the RNN encoder, while the last four contain the ones with the TS2Vec encoder. The first observation is that the TS2Vec encoder significantly outperforms the RNN encoder for this task, achieving accuracies that are 15 to 17 percentage points higher.

Table 3. Results in percentage with mean and standard deviation for different implementations of TNC on UCI Dataset Raw.

The TNC-TS2Vec-sim variant worked best with W = 0.20 while the other alternatives worked best with W = 0.05. However, the impact on performance is very small (\(\le \)1% on accuracy).

Regarding the use of a statistical test versus cosine similarity, the latter showed little impact on the metrics, indicating that the neighborhood selection method did not produce major differences in the results for HAR data. However, one positive impact of using cosine similarity is on training time (last column), which was \(7 \times \) to \(9 \times \) shorter for both encoders and weight parameters, with an even larger speedup when using the RNN encoder. Training time is relevant, especially for works that need fast implementations to evaluate different SSL strategies, as TNC was presented as a very time-consuming and less competitive method in works such as [12, 17], which reported TNC with the RNN encoder to be 250 times slower than the TS2Vec framework for time series representation learning.

By analyzing the second row of Table 2 (TNC-RNN-adf/\(W=0.05\) on UCI-FE) and the first row of Table 3 (TNC-RNN-adf/\(W=0.05\) on UCI-Raw), it is possible to see that it was easier for TNC-RNN-adf to learn from the UCI-FE dataset (accuracy of 88.0%) than from UCI-Raw (accuracy of 78.7%). This might be attributed to the handcrafted features. Nonetheless, TNC-TS2Vec-sim and TNC-TS2Vec-adf were capable of learning good features from the UCI-Raw dataset, achieving 95% accuracy. This result suggests that the TNC technique combined with the TS2Vec encoder can automatically learn highly discriminative features, reaching the best performance for UCI-Raw in our experiments.

To validate our findings, we performed a statistical analysis after running each row of Table 3 eight times, forming a sample of 64 runs, 32 for each encoder (RNN and TS2Vec). We first assessed the normality of the data for each encoder using the Shapiro-Wilk test, which presented a p-value below 0.05 for the RNN, indicating a significant deviation from a normal distribution. Given that, we proceeded with the non-parametric paired Wilcoxon signed-rank test. The test resulted in a p-value below 0.05, which indicates that the difference between the two encoders is statistically significant.
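The testing procedure above can be sketched with `scipy.stats`. The accuracy samples below are synthetic stand-ins (drawn with means loosely matching the reported accuracies), not our experimental measurements; the logic of checking normality before choosing the paired test is what the sketch illustrates.

```python
import numpy as np
from scipy.stats import shapiro, wilcoxon, ttest_rel

# Hypothetical paired accuracies, 32 runs per encoder as in our protocol
rng = np.random.default_rng(0)
acc_rnn = rng.normal(0.78, 0.02, 32)
acc_ts2vec = rng.normal(0.95, 0.01, 32)

# Shapiro-Wilk normality check decides between parametric and non-parametric
if shapiro(acc_rnn)[1] < 0.05 or shapiro(acc_ts2vec)[1] < 0.05:
    stat, p = wilcoxon(acc_rnn, acc_ts2vec)   # non-parametric paired test
else:
    stat, p = ttest_rel(acc_rnn, acc_ts2vec)  # paired t-test

print(p < 0.05)  # → True: the difference is significant for these samples
```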

4.3 Qualitative Analysis

This section provides a qualitative analysis of the learned representations.

First, we reproduce the t-SNE plots presented by Tonekaboni et al. [10]. Figure 2 shows the t-SNE plot for the test set of the UCI-FE dataset (left) and for its samples encoded by the TNC-RNN-adf variant (right). Notice that the handcrafted features already split the data into three major clusters: (Laying), (Walking Downstairs; Walking Upstairs), and (Sitting; Standing; Walking). However, there is still some confusion among classes inside these clusters. The TNC-RNN-adf encoder, on the other hand, was capable of improving the representation by separating the classes into six more separable clusters, in line with the results reported by Tonekaboni et al. [10] in Appendix A.7.2.

Fig. 2.

t-SNE of TNC-RNN-adf Encoder on Data with Features

Now, we turn our attention to the representations learned by the four variants of TNC on the UCI-Raw dataset. We start by analyzing the t-SNE chart for the raw test set, shown in Fig. 3. Notice that, despite the existence of two clusters, there is no clear separation between most classes. This helps explain why it is easier to learn from the UCI-FE dataset than from the UCI-Raw one.

Fig. 3.

t-SNE chart for the UCI-Raw dataset

Figure 4 shows the t-SNE charts for the representations learned by the four variants of TNC. Since the W parameter had little impact on the result, we decided to report only the best result for each one of the variants in terms of accuracy. Notice that the variants with the TS2Vec encoder (bottom) perform better than the ones with the RNN encoder (top), managing to provide a clear separation between most classes. Also, the choice of using ADF or Cosine Similarity to select the neighborhood had little impact on the clustering.

Fig. 4.

Comparison between t-SNE on different TNC variations

5 Conclusions and Future Work

In this work, we evaluated different variants of TNC using the feature-engineered (UCI-FE) and raw (UCI-Raw) versions of the UCI HAR dataset. UCI-FE was the version used by the original TNC work [10], while UCI-Raw is more often used to evaluate machine learning techniques. Regarding TNC itself, we evaluated the impact that two different encoders (RNN and TS2Vec) and two different neighborhood selection functions (ADF and cosine similarity) have on the performance of the model when trained on the UCI-Raw dataset.

Our evaluation of different TNC variants using the UCI-Raw dataset demonstrated that the TS2Vec encoder significantly outperforms the RNN encoder for this task, achieving accuracies that are 15 to 17 percentage points higher. Additionally, we found that replacing the ADF statistical test with cosine similarity has minimal impact on model accuracy while reducing training time by a factor of \(7 \times \) to \(9 \times \).

Furthermore, our assessment of the impact of applying a TNC encoder to both raw data (UCI-Raw) and data represented by handcrafted features (UCI-FE) indicates that, while learning from handcrafted features is easier, the more advanced versions of TNC can effectively learn robust features from raw data, achieving performance comparable to that of models trained on handcrafted features, i.e., 95% accuracy.

As future work, an ablation study of different hyperparameters, especially the weight parameter and the window size, can be added to the discussion, as well as an evaluation of fully supervised training using the encoders with a linear classification layer. An analysis of the model’s performance with different training sets on the downstream task can also be included. Other HAR datasets or encoders can be used in the same analysis to assess their impact on the TNC variations, and other methods can be tested with different encoders on the same data to enrich the comparison. Finally, other tasks can be used to evaluate TNC beyond a specific downstream task such as HAR.

Reproducibility Statement. Section 4 describes the materials and methods used in our evaluations. The code is publicly released on GitHub (Footnote 3). The datasets used are publicly available, and a script containing the data processing steps is also provided.