SensorBench: Benchmarking LLMs in Coding-Based Sensor Processing
Pengrui Quan, Xiaomin Ouyang, Jeya Vikranth Jeyakumar, Ziqi Wang, Yang Xing, Mani Srivastava

TL;DR
SensorBench evaluates the capabilities of Large Language Models in processing sensor data, revealing their strengths and limitations across diverse real-world tasks, and explores prompting strategies to enhance performance.
Contribution
This paper introduces SensorBench, a comprehensive benchmark for assessing LLMs in sensor data processing, and analyzes prompting strategies to improve their effectiveness.
Findings
LLMs perform well on simple sensor tasks
Challenges remain in complex, compositional tasks
Self-verification prompting outperforms all other prompting baselines in 48% of tasks
Abstract
Effective processing, interpretation, and management of sensor data have emerged as a critical component of cyber-physical systems. Traditionally, processing sensor data requires profound theoretical knowledge and proficiency in signal-processing tools. However, recent works show that Large Language Models (LLMs) have promising capabilities in processing sensory data, suggesting their potential as copilots for developing sensing systems. To explore this potential, we construct a comprehensive benchmark, SensorBench, to establish a quantifiable objective. The benchmark incorporates diverse real-world sensor datasets for various tasks. The results show that while LLMs exhibit considerable proficiency in simpler tasks, they face inherent challenges in processing compositional tasks with parameter selections compared to engineering experts. Additionally, we investigate four prompting…
Taxonomy
Topics: Neural Networks and Applications
