Flash-Fusion: Enabling Expressive, Low-Latency Queries on IoT Sensor Streams with LLMs

Kausar Patherya; Ashutosh Dhekne; Francisco Romero

arXiv:2511.11885 · cs.DC · November 18, 2025

Open Access

TL;DR

Flash-Fusion is a system that enables low-latency, cost-effective natural language querying of IoT sensor streams by combining edge summarization and cloud-based query planning, making IoT data analysis more accessible and efficient.

Contribution

It introduces an end-to-end edge-cloud system that significantly reduces data volume, latency, and token costs for LLM-based IoT data analysis, with novel data summarization and query planning techniques.
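As a rough illustration of what edge summarization can look like, the sketch below collapses raw readings into fixed-window statistics before anything is sent to the cloud. It is not the paper's algorithm: the `WindowSummary` type, the `summarize` and `record_reduction` helpers, the 60-second window, and the synthetic readings are all assumptions made for this example, and the reduction it computes counts records, not bytes or tokens.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical reading format: (sensor_id, unix_timestamp_seconds, value).
Reading = Tuple[str, int, float]


@dataclass
class WindowSummary:
    """Compact stand-in for all readings of one sensor in one time window."""
    sensor_id: str
    window_start: int
    count: int
    minimum: float
    maximum: float
    average: float


def summarize(readings: List[Reading], window_s: int = 60) -> List[WindowSummary]:
    """Group readings by sensor and fixed-size window, keeping only basic statistics."""
    buckets: Dict[Tuple[str, int], List[float]] = {}
    for sensor_id, ts, value in readings:
        window_start = ts - ts % window_s  # align timestamp to the window boundary
        buckets.setdefault((sensor_id, window_start), []).append(value)
    return [
        WindowSummary(sid, start, len(vals), min(vals), max(vals), mean(vals))
        for (sid, start), vals in sorted(buckets.items())
    ]


def record_reduction(raw_count: int, summary_count: int) -> float:
    """Fraction of records eliminated by summarization (assumed metric for this sketch)."""
    return 1.0 - summary_count / raw_count


# Synthetic data: one sensor sampled every 5 seconds for 10 minutes.
readings = [("s1", t, float(t % 10)) for t in range(0, 600, 5)]
summaries = summarize(readings, window_s=60)
ratio = record_reduction(len(readings), len(summaries))
```

A cloud-side planner could then select only the summaries relevant to a query instead of the raw stream; the window size trades detail against volume and would need tuning per sensor type.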

Findings

01. Achieves 73.5% data reduction through edge summarization.

02. Reduces query latency by 95% compared to raw data approaches.

03. Decreases token usage and cost by 98%, maintaining high-quality responses.

Abstract

Smart cities and pervasive IoT deployments have generated interest in IoT data analysis across transportation and urban planning. At the same time, Large Language Models (LLMs) offer a new interface for exploring IoT data, particularly through natural language. Users today face two key challenges when working with IoT data using LLMs: (1) data collection infrastructure is expensive, producing terabytes of low-level sensor readings that are too granular for direct use, and (2) data analysis is slow, requiring iterative effort and technical expertise. Directly feeding all IoT telemetry to LLMs is impractical due to finite context windows, prohibitive token costs at scale, and non-interactive latencies. What is missing is a system that first parses a user's query to identify the analytical task, then selects the relevant data slices, and finally chooses the right representation before invoking…

Taxonomy

Topics: Human Mobility and Location-Based Analysis · Mobile Crowdsensing and Crowdsourcing · Traffic Prediction and Management Techniques