Don't freeze: Finetune encoders for better Self-Supervised HAR
Vitor Fortes Rey, Dominique Nshimyimana, Paul Lukowicz

TL;DR
This paper demonstrates that fine-tuning encoders instead of freezing them in self-supervised human activity recognition significantly improves classification performance across multiple datasets and tasks, especially with less labeled data.
Contribution
It introduces a simple modification—not freezing encoders—that consistently enhances self-supervised HAR performance across various datasets and pretext tasks.
Findings
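Not freezing the encoder improved performance on all four investigated datasets and across all four pretext tasks.
The improvement is inversely related to the amount of labeled data: the less labeled data is available, the larger the gain.
The improvement holds whether the pretext task is run on the Capture24 dataset or directly on unlabeled data from the target dataset.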
Abstract
Recently, self-supervised learning has been proposed in the field of human activity recognition as a solution to the labelled data availability problem. The idea is that, by using pretext tasks such as reconstruction or contrastive predictive coding, useful representations can be learned and then used for classification. These approaches follow the pretrain, freeze and fine-tune procedure. In this paper we show how a simple change - not freezing the representation - leads to substantial performance gains across pretext tasks. The improvement was found in all four investigated datasets and across all four pretext tasks, and is inversely proportional to the amount of labelled data. Moreover, the effect is present whether the pretext task is carried out on the Capture24 dataset or directly on unlabelled data of the target dataset.
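The difference between the two procedures described in the abstract can be sketched in a few lines of PyTorch. This is a minimal illustration, not the authors' implementation: the tiny convolutional encoder, the function names (build_model, make_optimizer, train_step), the layer sizes, and the learning rate are all assumptions chosen for brevity, and the pretext-task pretraining step is omitted. The only point is the freeze_encoder flag, which decides whether the encoder's weights receive gradient updates during supervised training.

```python
import torch
from torch import nn


def build_model(n_channels=3, n_classes=4):
    # Hypothetical stand-in for a pretrained encoder over windows shaped
    # (batch, channels, time); a real encoder would be loaded from a
    # pretext-task checkpoint instead of being randomly initialised.
    encoder = nn.Sequential(
        nn.Conv1d(n_channels, 16, kernel_size=5, padding=2),
        nn.ReLU(),
        nn.AdaptiveAvgPool1d(1),
        nn.Flatten(),
    )
    classifier = nn.Linear(16, n_classes)
    return encoder, classifier


def make_optimizer(encoder, classifier, freeze_encoder, lr=1e-3):
    # freeze_encoder=True: only the classifier head is trained.
    # freeze_encoder=False: encoder and classifier are trained together.
    for p in encoder.parameters():
        p.requires_grad_(not freeze_encoder)
    params = list(classifier.parameters())
    if not freeze_encoder:
        params += list(encoder.parameters())
    return torch.optim.Adam(params, lr=lr)


def train_step(encoder, classifier, optimizer, x, y):
    # One supervised update on a labelled batch; returns the loss value.
    optimizer.zero_grad()
    logits = classifier(encoder(x))
    loss = nn.functional.cross_entropy(logits, y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Calling make_optimizer(encoder, classifier, freeze_encoder=False) corresponds to the "not freezing" variant: the same labelled batches now also adapt the representation, rather than only the classification head.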
Taxonomy
Topics: Human Pose and Action Recognition · Context-Aware Activity Recognition Systems · Machine Learning and Data Classification
