Robometer: Scaling General-Purpose Robotic Reward Models via Trajectory Comparisons
Anthony Liang, Yigit Korkmaz, Jiahui Zhang, Minyoung Hwang, Abrar Anwar, Sidhant Kaushik, Aditya Shah, Alex S. Huang, Luke Zettlemoyer, Dieter Fox, Yu Xiang, Anqi Li, Andreea Bobu, Abhishek Gupta, Stephen Tu, Erdem Biyik, Jesse Zhang

TL;DR
Robometer is a scalable framework for robotic reward modeling that combines local progress supervision with global trajectory preferences, enabling better learning from diverse and suboptimal data.
Contribution
The paper introduces Robometer, a reward modeling approach that learns from large-scale, diverse trajectory data, including failed trajectories, and curates the RBM-1M dataset to train it.
Findings
Robometer outperforms prior methods in generalization across benchmarks.
Robometer improves robot learning performance on various downstream tasks.
The RBM-1M dataset contains over one million diverse trajectories, including failures.
Abstract
General-purpose robot reward models are typically trained to predict absolute task progress from expert demonstrations, providing only local, frame-level supervision. While effective for expert demonstrations, this paradigm scales poorly to large-scale robotics datasets where failed and suboptimal trajectories are abundant and assigning dense progress labels is ambiguous. We introduce Robometer, a scalable reward modeling framework that combines intra-trajectory progress supervision with inter-trajectory preference supervision. Robometer is trained with a dual objective: a frame-level progress loss that anchors reward magnitude on expert data, and a trajectory-comparison preference loss that imposes global ordering constraints across trajectories of the same task, enabling effective learning from both real and augmented failed trajectories. To support this formulation at scale, we…
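The dual objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact loss forms, so everything below is an assumption made for illustration: the progress term is taken to be a mean squared error against linearly increasing progress labels on an expert demonstration, the preference term is taken to be a Bradley-Terry style logistic loss on scalar trajectory scores, and the names `progress_loss`, `linear_progress_targets`, `preference_loss`, `dual_objective`, and `pref_weight` are hypothetical.

```python
import math


def linear_progress_targets(num_frames):
    """Assumed labels for an expert demo: progress rises linearly from 0 to 1."""
    if num_frames == 1:
        return [1.0]
    return [i / (num_frames - 1) for i in range(num_frames)]


def progress_loss(pred, target):
    """Frame-level term (assumed form): mean squared error over frames."""
    if not pred or len(pred) != len(target):
        raise ValueError("pred and target must be non-empty and equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def preference_loss(score_preferred, score_other):
    """Trajectory-level term (assumed form): -log sigmoid(s_pref - s_other)."""
    diff = score_preferred - score_other
    # Numerically stable softplus(-diff).
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def dual_objective(expert_pred, score_preferred, score_other, pref_weight=1.0):
    """Sum of the two terms; pref_weight is a hypothetical balancing factor."""
    targets = linear_progress_targets(len(expert_pred))
    return (progress_loss(expert_pred, targets)
            + pref_weight * preference_loss(score_preferred, score_other))


if __name__ == "__main__":
    # Hypothetical per-frame progress predictions on one expert demo.
    expert_pred = [0.0, 0.2, 0.55, 0.7, 1.0]
    # Hypothetical scalar scores for a successful and a failed trajectory
    # of the same task.
    print(dual_objective(expert_pred, score_preferred=1.5, score_other=-0.5))
```

In this sketch the progress term only touches expert data, while the preference term needs only a relative ordering between two trajectories of the same task, which is why failed trajectories can contribute without dense progress labels.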
