Evaluating Large Language Models as Virtual Annotators for Time-series Physical Sensing Data

Aritra Hota; Soumyajit Chatterjee; Sandip Chakraborty

arXiv:2403.01133 · cs.LG · September 24, 2024 · 2 cites

Open Access

TL;DR

This paper explores using large language models as virtual annotators for time-series physical sensing data, aiming to improve annotation efficiency and privacy without relying on additional modalities.

Contribution

The study demonstrates that encoding raw sensor data with self-supervised learning (SSL) techniques enables LLMs to annotate time-series data accurately, reducing the need for fine-tuning and complex prompts.

Findings

01 SSL-based encoding improves LLM annotation accuracy
02 Metric-guided prompting enhances decision quality
03 Approach outperforms traditional human-in-the-loop methods
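
The first two findings can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the sensor windows have already been turned into embedding vectors by some SSL encoder (not shown), and it assumes that "metric-guided" means giving the LLM the distance between the query embedding and a few labeled example embeddings. The function name `build_prompt`, the Euclidean metric, and the prompt wording are all hypothetical choices made for illustration.

```python
import math

def build_prompt(query, examples, k=2):
    """Build a text prompt listing the k labeled examples nearest to the query.

    query    -- embedding of the unlabeled sensor window (assumed precomputed)
    examples -- list of (label, embedding) pairs with known labels
    k        -- number of nearest examples to include (hypothetical default)
    """
    # Rank labeled examples by Euclidean distance to the query embedding.
    ranked = sorted(examples, key=lambda ex: math.dist(query, ex[1]))[:k]

    lines = [
        "Each labeled example is an embedding of an inertial sensor window.",
        "Distances are Euclidean distances from the query embedding.",
    ]
    for label, emb in ranked:
        lines.append(f"label={label} distance={math.dist(query, emb):.3f}")
    lines.append("Which label best fits the query? Answer with one label.")
    return "\n".join(lines)

# Toy 2-D embeddings; real SSL embeddings would have many more dimensions.
examples = [("walk", [0.0, 0.0]), ("run", [3.0, 4.0]), ("sit", [1.0, 0.0])]
prompt = build_prompt([0.0, 0.0], examples, k=2)
print(prompt)
```

The resulting string would then be sent to an LLM of choice. Passing distances rather than raw numeric windows keeps the prompt short, which is one plausible reading of why such prompting could reduce the need for complex prompts.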

Abstract

Traditional human-in-the-loop-based annotation for time-series data like inertial data often requires access to alternate modalities like video or audio from the environment. These alternate sources provide the necessary information to the human annotator, as the raw numeric data is often too obfuscated even for an expert. However, this traditional approach has many concerns surrounding overall cost, efficiency, storage of additional modalities, time, scalability, and privacy. Interestingly, recent large language models (LLMs) are also trained with vast amounts of publicly available alphanumeric data, which allows them to comprehend and perform well on tasks beyond natural language processing. Naturally, this opens up a potential avenue to explore LLMs as virtual annotators where the LLMs will be directly provided the raw sensor data for annotation instead of relying on any alternate…

Taxonomy

Topics: Time Series Analysis and Forecasting · Advanced Text Analysis Techniques · Data Visualization and Analytics

Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Multi-Head Attention · Layer Normalization · Dropout · Softmax · Dense Connections · Label Smoothing · Adam