Guiding Data Collection via Factored Scaling Curves
Lihan Zha, Apurva Badithela, Michael Zhang, Justin Lidard, Jeremy Bao, Emily Zhou, David Snyder, Allen Z. Ren, Dhruv Shah, Anirudha Majumdar

TL;DR
This paper introduces factored scaling curves (FSC), a method for guiding data collection for imitation learning policies by measuring how policy performance scales with the amount of data collected along each environmental factor, with the goal of improving generalization in manipulation tasks.
Contribution
The paper presents a principled approach that uses factored scaling curves to decide what data to collect, and how much, for each environmental factor, reducing collection cost and improving policy generalization.
Findings
Boosts success rates in real-world tasks by up to 26%.
Effectively guides data collection using offline metrics.
Enhances generalization across diverse environmental conditions.
Abstract
Generalist imitation learning policies trained on large datasets show great promise for solving diverse manipulation tasks. However, to ensure generalization to different conditions, policies need to be trained with data collected across a large set of environmental factor variations (e.g., camera pose, table height, distractors), a prohibitively expensive undertaking if done exhaustively. We introduce a principled method for deciding what data to collect and how much to collect for each factor by constructing factored scaling curves (FSC), which quantify how policy performance varies as data scales along individual or paired factors. These curves enable targeted data acquisition for the most influential factor combinations within a given budget. We evaluate the proposed method through extensive simulated and real-world experiments, across both training-from-scratch and fine-tuning…
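To make the idea of per-factor scaling curves concrete, here is a minimal sketch, not the authors' implementation. It assumes that the error rate for each factor follows a power law in the number of demonstrations, and that a collection budget is split greedily by predicted error reduction. The power-law form, the greedy rule, the factor names, the function names (`fit_power_law`, `allocate_budget`), and all numbers are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def fit_power_law(n, err):
    """Fit err ~ a * n**(-b) by least squares in log-log space (assumed form)."""
    slope, intercept = np.polyfit(np.log(n), np.log(err), 1)
    return float(np.exp(intercept)), float(-slope)

def predicted_error(a, b, n):
    """Error predicted by the fitted curve at n demonstrations."""
    return a * n ** (-b)

def allocate_budget(curves, current, budget, step=10):
    """Greedily assign demos in chunks of `step` to the factor whose
    fitted curve predicts the largest error reduction (hypothetical rule)."""
    counts = dict(current)
    remaining = budget
    while remaining >= step:
        gains = {
            f: predicted_error(a, b, counts[f]) - predicted_error(a, b, counts[f] + step)
            for f, (a, b) in curves.items()
        }
        best = max(gains, key=gains.get)
        counts[best] += step
        remaining -= step
    return {f: counts[f] - current[f] for f in counts}

# Synthetic, made-up measurements: error rate vs. demos varied along one factor.
n_demos = np.array([10.0, 20.0, 40.0, 80.0])
observed = {
    "camera_pose": 0.9 * n_demos ** -0.5,   # steep curve: data helps a lot
    "table_height": 0.3 * n_demos ** -0.2,  # shallow curve: data helps less
}

curves = {f: fit_power_law(n_demos, e) for f, e in observed.items()}
plan = allocate_budget(curves, {f: 80 for f in curves}, budget=100)
print(curves)
print(plan)
```

In this toy setup most of the budget goes to the factor with the steeper curve, which is the intuition behind collecting data for the most influential factors first. A real pipeline would replace the synthetic arrays with measured policy performance and may use a different curve family or allocation strategy.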
Taxonomy
Topics: Robot Manipulation and Learning · Reinforcement Learning in Robotics · 3D Shape Modeling and Analysis
Methods: Sparse Evolutionary Training
