A Feedback-Control Framework for Efficient Dataset Collection from In-Vehicle Data Streams
Philipp Reis, Philipp Rigoll, Christian Steinhauser, Jacob Langner, and Eric Sax

TL;DR
This paper presents FCDC, a feedback control framework that actively manages dataset collection from in-vehicle data streams, improving diversity and reducing redundancy through a closed-loop system.
Contribution
It introduces a novel feedback control paradigm for dataset collection, enabling dynamic balancing of exploration and exploitation to enhance data quality.
Findings
Produces 25.9% more balanced datasets.
Reduces data storage by 39.8%.
Demonstrates controllability on synthetic and real data streams.
Abstract
Modern AI systems are increasingly constrained not by model capacity but by the quality and diversity of their data. Despite growing emphasis on data-centric AI, most datasets are still gathered in an open-loop manner, which accumulates redundant samples without feedback from the current coverage. This results in inefficient storage, costly labeling, and limited generalization. To address this, this paper introduces Feedback Control Data Collection (FCDC), a paradigm that formulates data collection as a closed-loop control problem. FCDC continuously approximates the state of the collected data distribution using an online probabilistic model and adaptively regulates sample retention based on feedback signals such as likelihood and Mahalanobis distance. Through this feedback mechanism, the system dynamically balances exploration and exploitation, maintains dataset diversity, and…
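The closed-loop idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single Gaussian with diagonal covariance as the online model, a fixed Mahalanobis threshold as the retention rule, and the hypothetical names `DiagonalGaussianState` and `collect`. The feedback comes from updating the model only with retained samples, so each decision changes how later samples are scored.

```python
import math
import random


class DiagonalGaussianState:
    """Running per-feature mean and variance of the RETAINED samples (Welford update).

    Assumption: a diagonal Gaussian stands in for the paper's online
    probabilistic model, which the abstract does not specify.
    """

    def __init__(self, dim, eps=1e-6):
        self.n = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim   # running sum of squared deviations per feature
        self.eps = eps          # variance floor to avoid division by zero

    def update(self, x):
        self.n += 1
        for i, xi in enumerate(x):
            delta = xi - self.mean[i]
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (xi - self.mean[i])

    def mahalanobis(self, x):
        # With fewer than two retained samples there is no variance estimate,
        # so every sample counts as novel.
        if self.n < 2:
            return math.inf
        d2 = 0.0
        for i, xi in enumerate(x):
            var = self.m2[i] / (self.n - 1) + self.eps
            d2 += (xi - self.mean[i]) ** 2 / var
        return math.sqrt(d2)


def collect(stream, dim, threshold=1.5):
    """Keep a sample only if it lies far from the data retained so far.

    Assumption: a fixed threshold replaces whatever adaptive regulation
    the paper uses.
    """
    state = DiagonalGaussianState(dim)
    kept = []
    for x in stream:
        if state.mahalanobis(x) >= threshold:
            kept.append(x)
            state.update(x)  # feedback: retained data reshapes the model
    return kept


if __name__ == "__main__":
    random.seed(0)
    stream = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(500)]
    kept = collect(stream, dim=2)
    print(f"kept {len(kept)} of {len(stream)} samples")
```

Raising `threshold` makes the collector more selective and lowering it retains more of the stream. An adaptive variant would adjust this value from a feedback signal instead of fixing it in advance.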
Taxonomy
Topics: Data Stream Mining Techniques · Autonomous Vehicle Technology and Safety · Vehicular Ad Hoc Networks (VANETs)
