Evaluating Large Language Models as Virtual Annotators for Time-series Physical Sensing Data
Aritra Hota, Soumyajit Chatterjee, Sandip Chakraborty

TL;DR
This paper explores using large language models as virtual annotators for time-series physical sensing data, aiming to improve annotation efficiency and privacy without relying on additional modalities.
Contribution
The study demonstrates that encoding raw sensor data with self-supervised learning (SSL) techniques enables LLMs to accurately annotate time-series data, reducing the need for fine-tuning and complex prompts.
Findings
SSL-based encoding improves LLM annotation accuracy
Metric-guided prompting enhances decision quality
Approach outperforms traditional human-in-the-loop methods
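The first two findings can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not give the encoder, the distance metric, or the prompt wording, so everything below is an assumption. The embeddings are hypothetical two-dimensional vectors standing in for the output of an SSL encoder, Euclidean distance is an assumed metric, and `build_metric_guided_prompt` is a hypothetical helper name. The idea shown is only that distances between an unlabeled window and a few labeled reference windows are written into the prompt, so the LLM reasons over a handful of numbers instead of raw sensor readings.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_metric_guided_prompt(query, examples):
    """Build a prompt listing distances from `query` to labeled examples.

    query:    embedding of the unlabeled window (hypothetical encoder output)
    examples: list of (label, embedding) pairs for labeled reference windows
    Returns the prompt text and the (distance, label) pairs, nearest first.
    """
    scored = sorted((euclidean(query, emb), label) for label, emb in examples)
    lines = [
        "You are annotating a window of inertial sensor data.",
        "Distances from the unlabeled window to labeled reference windows",
        "(smaller means more similar):",
    ]
    for dist, label in scored:
        lines.append(f"- {label}: {dist:.3f}")
    candidates = sorted({label for label, _ in examples})
    lines.append("Answer with exactly one label from: " + ", ".join(candidates) + ".")
    return "\n".join(lines), scored


# Hypothetical embeddings, chosen only for illustration.
examples = [
    ("walking", [0.9, 0.1]),
    ("sitting", [0.1, 0.8]),
    ("running", [1.5, 0.2]),
]
query = [1.0, 0.15]

prompt, scored = build_metric_guided_prompt(query, examples)
print(prompt)
```

The resulting string would then be sent to whichever LLM is being evaluated; that call is left out because the summary does not specify a model or API.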
Abstract
Traditional human-in-the-loop-based annotation for time-series data like inertial data often requires access to alternate modalities like video or audio from the environment. These alternate sources provide the necessary information to the human annotator, as the raw numeric data is often too obfuscated even for an expert. However, this traditional approach has many concerns surrounding overall cost, efficiency, storage of additional modalities, time, scalability, and privacy. Interestingly, recent large language models (LLMs) are also trained with vast amounts of publicly available alphanumeric data, which allows them to comprehend and perform well on tasks beyond natural language processing. Naturally, this opens up a potential avenue to explore LLMs as virtual annotators where the LLMs will be directly provided the raw sensor data for annotation instead of relying on any alternate…
Taxonomy
Topics: Time Series Analysis and Forecasting · Advanced Text Analysis Techniques · Data Visualization and Analytics
Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Multi-Head Attention · Layer Normalization · Dropout · Softmax · Dense Connections · Label Smoothing · Adam
