Combining Public Human Activity Recognition Datasets to Mitigate Labeled Data Scarcity
Riccardo Presotto, Sannara Ek, Gabriele Civitarese, François Portet, Philippe Lalanda, Claudio Bettini

TL;DR
This paper proposes combining multiple public human activity recognition datasets to create a generalized model that reduces the need for extensive labeled data and improves adaptation to new, unseen domains.
Contribution
It introduces a novel strategy for merging public datasets to develop pre-trained HAR models that require less labeled data for customization.
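The summary above does not say how the datasets are merged. As a rough illustration only, the sketch below assumes one plausible preparation step: each dataset is resampled to a common rate, its local activity names are mapped to a shared label vocabulary, and the signals are cut into fixed-size windows before concatenation. The names `SHARED_LABELS`, `resample`, `windows`, and `harmonize`, the 50 Hz target rate, and the 100-sample window are all hypothetical and are not taken from the paper. It is written in dependency-free Python for readability; a real pipeline would likely use an array library.

```python
# Hypothetical shared label vocabulary (an assumption, not from the paper).
SHARED_LABELS = {"walk": 0, "sit": 1}


def resample(signal, src_hz, dst_hz):
    """Linearly interpolate a list of (x, y, z) samples from src_hz to dst_hz."""
    n_dst = round(len(signal) * dst_hz / src_hz)
    last = len(signal) - 1
    out = []
    for i in range(n_dst):
        pos = min(i * src_hz / dst_hz, last)  # position in source-sample units
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(tuple(a + frac * (b - a) for a, b in zip(signal[lo], signal[hi])))
    return out


def windows(signal, size):
    """Cut a signal into non-overlapping windows of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


def harmonize(recordings, src_hz, label_map, dst_hz=50, win=100):
    """Convert (signal, local_label) pairs into windows with shared label ids.

    Recordings whose local label has no entry in `label_map` are skipped.
    """
    X, y = [], []
    for signal, local_label in recordings:
        if local_label not in label_map:
            continue
        for w in windows(resample(signal, src_hz, dst_hz), win):
            X.append(w)
            y.append(SHARED_LABELS[label_map[local_label]])
    return X, y


# Two toy "datasets" with different sampling rates and label names.
ds_a = [([(0.0, 0.0, 9.8)] * 400, "walking"),   # 100 Hz
        ([(0.0, 0.0, 9.8)] * 400, "jumping")]   # no shared equivalent, dropped
ds_b = [([(0.0, 9.8, 0.0)] * 100, "SIT")]       # 25 Hz

Xa, ya = harmonize(ds_a, 100, {"walking": "walk"})
Xb, yb = harmonize(ds_b, 25, {"SIT": "sit"})
X, y = Xa + Xb, ya + yb  # combined pool for pre-training
```

Under these assumptions the combined pool holds windows of identical shape regardless of the source dataset, which is what lets a single network be pre-trained on all of them before being fine-tuned on a small amount of target-domain data.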
Findings
Combining datasets improves model generalization.
Significant reduction in labeled data needed for target domain.
Effective across various neural network architectures.
Abstract
The use of supervised learning for Human Activity Recognition (HAR) on mobile devices leads to strong classification performances. Such an approach, however, requires large amounts of labeled data, both for the initial training of the models and for their customization on specific clients (whose data often differ greatly from the training data). This is actually impractical to obtain due to the costs, intrusiveness, and time-consuming nature of data annotation. Moreover, even with the help of a significant amount of labeled data, model deployment on heterogeneous clients faces difficulties in generalizing well on unseen data. Other domains, like Computer Vision or Natural Language Processing, have proposed the notion of pre-trained models, leveraging large corpora, to reduce the need for annotated data and better manage heterogeneity. This promising approach has not been implemented in…
Taxonomy
Topics: Context-Aware Activity Recognition Systems
