AMPLIFY: Actionless Motion Priors for Robot Learning from Videos
Jeremy A. Collins, Loránd Cheng, Kunal Aneja, Albert Wilcox, Benjamin Joffe, Animesh Garg

TL;DR
AMPLIFY introduces a modular framework that learns accurate, generalizable robot dynamics from large-scale action-free videos and limited action-labeled data, significantly improving policy learning and video prediction.
Contribution
It decouples visual motion prediction from action inference, enabling scalable learning of dynamics from heterogeneous data sources for robotics.
Findings
Achieves up to 3.7x better MSE in dynamics prediction.
Enables 1.2-2.2x improvement in low-data policy learning.
Generalizes to LIBERO tasks using zero in-distribution action data.
Abstract
Action-labeled data for robotics is scarce and expensive, limiting the generalization of learned policies. In contrast, vast amounts of action-free video data are readily available, but translating these observations into effective policies remains a challenge. We introduce AMPLIFY, a novel framework that leverages large-scale video data by encoding visual dynamics into compact, discrete motion tokens derived from keypoint trajectories. Our modular approach separates visual motion prediction from action inference, decoupling the challenges of learning what motion defines a task from how robots can perform it. We train a forward dynamics model on abundant action-free videos and an inverse dynamics model on a limited set of action-labeled examples, allowing for independent scaling. Extensive evaluations demonstrate that the learned dynamics are both accurate, achieving up to 3.7x better…
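The modular split described in the abstract can be illustrated with a toy data flow. The following is a minimal sketch under stated assumptions, not the authors' implementation: the codebook, the nearest-neighbor quantization, the count-based forward model, the table-based inverse model, and every name (`CODEBOOK`, `tokenize`, `ForwardDynamics`, `InverseDynamics`, `act`) are hypothetical stand-ins chosen for illustration. It shows only that the forward model is fit on token sequences without actions, while the inverse model is fit separately on a small set of (token, action) pairs.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence, Tuple

Vec = Tuple[float, float]

# Hypothetical codebook of 2-D keypoint displacements; the list index is the motion token.
CODEBOOK: List[Vec] = [(0.0, 0.0), (1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]


def tokenize(track: Sequence[Vec]) -> List[int]:
    """Map one keypoint trajectory to discrete tokens (nearest codebook entry per step)."""
    tokens = []
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        dx, dy = x1 - x0, y1 - y0
        tokens.append(min(
            range(len(CODEBOOK)),
            key=lambda i: (CODEBOOK[i][0] - dx) ** 2 + (CODEBOOK[i][1] - dy) ** 2,
        ))
    return tokens


class ForwardDynamics:
    """Toy next-token predictor, fit on action-free token sequences only."""

    def __init__(self) -> None:
        self.counts: Dict[int, Counter] = defaultdict(Counter)

    def fit(self, sequences: Sequence[Sequence[int]]) -> None:
        for seq in sequences:
            for cur, nxt in zip(seq, seq[1:]):
                self.counts[cur][nxt] += 1

    def predict(self, cur: int) -> int:
        if cur not in self.counts:
            return cur  # assumption: repeat the current motion when unseen
        return self.counts[cur].most_common(1)[0][0]


class InverseDynamics:
    """Toy token-to-action map, fit on a small action-labeled set."""

    def __init__(self) -> None:
        self.table: Dict[int, Vec] = {}

    def fit(self, pairs: Sequence[Tuple[int, Vec]]) -> None:
        grouped: Dict[int, List[Vec]] = defaultdict(list)
        for token, action in pairs:
            grouped[token].append(action)
        # Average the labeled actions observed for each token.
        self.table = {
            t: (sum(a[0] for a in acts) / len(acts), sum(a[1] for a in acts) / len(acts))
            for t, acts in grouped.items()
        }

    def predict(self, token: int) -> Vec:
        return self.table.get(token, (0.0, 0.0))  # assumption: no-op for unseen tokens


def act(fd: ForwardDynamics, idm: InverseDynamics, cur_token: int) -> Vec:
    """Policy = inverse dynamics applied to the forward model's predicted motion."""
    return idm.predict(fd.predict(cur_token))


if __name__ == "__main__":
    # Action-free "videos": keypoint tracks only.
    videos = [[(0, 0), (1, 0), (2, 0), (2, 1)], [(0, 0), (1, 0), (1, 1), (1, 2)]]
    fd = ForwardDynamics()
    fd.fit([tokenize(v) for v in videos])

    # A few action-labeled examples: (motion token, robot action).
    idm = InverseDynamics()
    idm.fit([(1, (0.5, 0.0)), (1, (1.0, 0.0)), (3, (0.0, 0.6))])

    print(act(fd, idm, 1))
```

Because the two models share only the token vocabulary, either one can be refit on more data without touching the other, which is the independent scaling the abstract refers to.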
Taxonomy
Topics: Robot Manipulation and Learning · Multimodal Machine Learning Applications · Robotic Path Planning Algorithms
Methods: Sparse Evolutionary Training
