HARGPT: Are LLMs Zero-Shot Human Activity Recognizers?
Sijie Ji, Xinzhe Zheng, Chenshu Wu

TL;DR
This paper demonstrates that Large Language Models can perform zero-shot human activity recognition directly from raw IMU sensor data using role-play and step-by-step prompting, outperforming traditional machine learning and deep classification baselines on two public datasets.
Contribution
The paper introduces HARGPT, an approach showing that LLMs can interpret raw IMU sensor data for activity recognition without task-specific training, which points to their potential in cyber-physical systems.
Findings
LLMs successfully recognize activities from raw IMU data.
HARGPT outperforms baselines built on traditional machine learning and on deep classification models.
Effective prompting enables raw data interpretation by LLMs.
Abstract
There is an ongoing debate regarding the potential of Large Language Models (LLMs) as foundational models seamlessly integrated with Cyber-Physical Systems (CPS) for interpreting the physical world. In this paper, we carry out a case study to answer the following question: Are LLMs capable of zero-shot human activity recognition (HAR)? Our study, HARGPT, presents an affirmative answer by demonstrating that LLMs can comprehend raw IMU data and perform HAR tasks in a zero-shot manner, with only appropriate prompts. HARGPT inputs raw IMU data into LLMs and utilizes the role-play and think step-by-step strategies for prompting. We benchmark HARGPT on GPT4 using two public datasets of different inter-class similarities and compare various baselines both based on traditional machine learning and state-of-the-art deep classification models. Remarkably, LLMs successfully recognize human…
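The prompting approach described in the abstract (raw IMU readings placed in the prompt, a role-play instruction, and a request to think step by step) might be sketched as follows. This is only an illustration under stated assumptions: the prompt wording, the function names `format_axis` and `build_har_prompt`, the sampling rate, and the candidate labels are hypothetical and are not taken from the paper. The sketch builds the prompt text only and does not call any LLM API.

```python
from typing import Sequence, Tuple

Sample = Tuple[float, float, float]  # one (x, y, z) accelerometer reading


def format_axis(values: Sequence[float], precision: int = 2) -> str:
    """Render one axis of raw readings as a comma-separated string."""
    return ", ".join(f"{v:.{precision}f}" for v in values)


def build_har_prompt(
    accel: Sequence[Sample],
    labels: Sequence[str],
    sample_rate_hz: int,
    precision: int = 2,
) -> str:
    """Assemble a zero-shot HAR prompt from raw IMU samples.

    The wording below is a hypothetical stand-in, not the paper's prompt.
    """
    if not accel:
        raise ValueError("accel must contain at least one sample")
    xs, ys, zs = zip(*accel)  # split samples into per-axis series
    lines = [
        # Role-play: ask the model to act as a domain expert.
        "You are an expert in analyzing IMU-based human activity.",
        f"The tri-axial accelerometer readings below were sampled at {sample_rate_hz} Hz.",
        f"x-axis: {format_axis(xs, precision)}",
        f"y-axis: {format_axis(ys, precision)}",
        f"z-axis: {format_axis(zs, precision)}",
        f"The person is doing one of: {', '.join(labels)}.",
        # Step-by-step: request explicit reasoning before the answer.
        "Analyze the signal step by step, then state the single most likely activity.",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [(0.01, 9.79, 0.12), (0.03, 9.81, 0.10), (0.02, 9.80, 0.11)]
    print(build_har_prompt(demo, ["walking", "standing still"], sample_rate_hz=10))
```

The returned string would then be sent to an LLM as a single user message. Rounding the readings keeps the prompt short, and the precision to use is a design choice the sketch leaves open.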
Taxonomy
Topics: Context-Aware Activity Recognition Systems
