Quality over Quantity: Demonstration Curation via Influence Functions for Data-Centric Robot Learning
Haeone Lee, Taywon Min, Junsu Kim, Sinjae Kang, Fangchen Liu, Lerrel Pinto, Kimin Lee

TL;DR
This paper introduces Quality over Quantity (QoQ), a systematic data curation method that uses influence functions to identify high-quality demonstrations, significantly improving robot policy performance by training on the most impactful samples.
Contribution
The paper develops a novel influence-function-based approach to data curation for robot learning. The approach is tailored to demonstration data and improves how data quality is assessed and how samples are selected.
Findings
QoQ improves policy performance over prior methods.
Influence functions effectively identify impactful training samples.
Adaptations of influence functions reduce noise and enhance data coverage.
Abstract
Learning from demonstrations has emerged as a promising paradigm for end-to-end robot control, particularly when scaled to diverse and large datasets. However, the quality of demonstration data, often collected through human teleoperation, remains a critical bottleneck for effective data-driven robot learning. Human errors, operational constraints, and teleoperator variability introduce noise and suboptimal behaviors, making data curation essential yet largely manual and heuristic-driven. In this work, we propose Quality over Quantity (QoQ), a grounded and systematic approach to identifying high-quality data by defining data quality as the contribution of each training sample to reducing loss on validation demonstrations. To efficiently estimate this contribution, we leverage influence functions, which quantify the impact of individual training samples on model performance. We further…
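To make the abstract's definition of data quality concrete, here is a minimal sketch of a classic influence-function score on a toy ridge-regression model. It is not the QoQ implementation: the adaptations the abstract alludes to are cut off in this excerpt and are not reproduced. The function names (`fit_ridge`, `influence_scores`, `curate`), the damping value, and the synthetic data are all assumptions made for illustration. Each training sample is scored by a first-order estimate of how much upweighting it would reduce the loss on held-out validation data, so higher scores suggest more helpful samples.

```python
import numpy as np


def fit_ridge(X, y, damping):
    """Minimize mean 0.5*(x @ theta - y)^2 + 0.5*damping*||theta||^2 in closed form."""
    n, d = X.shape
    hessian = X.T @ X / n + damping * np.eye(d)
    theta = np.linalg.solve(hessian, X.T @ y / n)
    return theta, hessian


def influence_scores(X_train, y_train, X_val, y_val, damping=1e-3):
    """Score each training sample by its estimated contribution to lowering validation loss."""
    theta, hessian = fit_ridge(X_train, y_train, damping)
    # Per-sample gradients of the squared-error loss at the fitted parameters.
    grad_train = (X_train @ theta - y_train)[:, None] * X_train          # (n_train, d)
    grad_val = ((X_val @ theta - y_val)[:, None] * X_val).mean(axis=0)   # (d,)
    # Upweighting sample i by eps changes validation loss by roughly
    # -eps * grad_val^T H^{-1} grad_i; the score is the negated coefficient,
    # so a positive score means the sample is estimated to help.
    return grad_train @ np.linalg.solve(hessian, grad_val)


def curate(scores, keep_fraction):
    """Return sorted indices of the highest-scoring keep_fraction of samples."""
    k = max(1, int(round(keep_fraction * len(scores))))
    return np.sort(np.argsort(scores)[-k:])


# Hypothetical toy data: 200 samples, the first 40 with sign-flipped labels
# standing in for low-quality demonstrations.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
X_train = rng.normal(size=(200, 3))
y_train = X_train @ w_true + 0.05 * rng.normal(size=200)
y_train[:40] *= -1.0
X_val = rng.normal(size=(100, 3))
y_val = X_val @ w_true

scores = influence_scores(X_train, y_train, X_val, y_val)
kept = curate(scores, keep_fraction=0.8)
```

In this toy setup the corrupted samples tend to receive lower scores than the clean ones, so dropping the lowest-scoring fraction removes mostly corrupted data. For a neural policy the Hessian cannot be formed and solved exactly like this, and an approximate inverse-Hessian-vector product would be needed instead.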
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning
