AMPLIFY: Actionless Motion Priors for Robot Learning from Videos

Jeremy A. Collins; Loránd Cheng; Kunal Aneja; Albert Wilcox; Benjamin Joffe; Animesh Garg

arXiv:2506.14198·cs.RO·June 18, 2025

Open Access

TL;DR

AMPLIFY introduces a modular framework that learns accurate, generalizable robot dynamics from large-scale action-free videos and limited action-labeled data, significantly improving policy learning and video prediction.

Contribution

AMPLIFY decouples visual motion prediction from action inference, so that dynamics can be learned at scale from heterogeneous data sources, including video that carries no action labels.

Findings

01 Achieves up to 3.7x better MSE in dynamics prediction.

02 Enables a 1.2-2.2x improvement in low-data policy learning.

03 Generalizes to LIBERO tasks with zero in-distribution action data.

Abstract

Action-labeled data for robotics is scarce and expensive, limiting the generalization of learned policies. In contrast, vast amounts of action-free video data are readily available, but translating these observations into effective policies remains a challenge. We introduce AMPLIFY, a novel framework that leverages large-scale video data by encoding visual dynamics into compact, discrete motion tokens derived from keypoint trajectories. Our modular approach separates visual motion prediction from action inference, decoupling the challenges of learning what motion defines a task from how robots can perform it. We train a forward dynamics model on abundant action-free videos and an inverse dynamics model on a limited set of action-labeled examples, allowing for independent scaling. Extensive evaluations demonstrate that the learned dynamics are both accurate, achieving up to 3.7x better…
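The modular split described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the fixed three-entry `CODEBOOK`, the nearest-neighbor quantization in `tokenize_motion`, the helper `compose_policy`, and the stand-in models are all hypothetical. The sketch only shows the idea of turning keypoint trajectories into discrete motion tokens and then chaining a forward model (observation to motion tokens) with an inverse model (motion tokens to action) so that each part could be trained on different data.

```python
import numpy as np

# Hypothetical codebook of 2D displacement prototypes (illustrative only,
# not the learned codebook used by AMPLIFY).
CODEBOOK = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])


def tokenize_motion(tracks, codebook=CODEBOOK):
    """Quantize keypoint motion into discrete tokens.

    tracks: array of shape (T, K, 2), positions of K keypoints over T frames.
    Returns an integer array of shape (T-1, K): the index of the nearest
    codebook entry for each per-frame keypoint displacement.
    """
    disp = np.diff(tracks, axis=0)  # (T-1, K, 2) frame-to-frame displacement
    dists = np.linalg.norm(
        disp[:, :, None, :] - codebook[None, None, :, :], axis=-1
    )  # (T-1, K, C) distance to every codebook entry
    return dists.argmin(axis=-1)


def compose_policy(forward_model, inverse_model):
    """Chain two independently trained parts into one policy.

    forward_model: observation -> motion tokens (trainable on action-free video)
    inverse_model: motion tokens -> action (trainable on action-labeled data)
    """
    def policy(observation):
        motion_tokens = forward_model(observation)
        return inverse_model(motion_tokens)
    return policy


# Two keypoints over three frames: one moves along +x, the other along +y.
tracks = np.array([
    [[0.0, 0.0], [5.0, 5.0]],
    [[1.0, 0.0], [5.0, 6.0]],
    [[2.0, 0.0], [5.0, 7.0]],
])
tokens = tokenize_motion(tracks)

# Toy stand-ins: a real forward model would predict future tokens from images,
# and a real inverse model would be a learned decoder.
toy_forward = lambda observation: tokenize_motion(observation)
toy_inverse = lambda motion_tokens: CODEBOOK[motion_tokens].mean(axis=(0, 1))

policy = compose_policy(toy_forward, toy_inverse)
action = policy(tracks)
```

Because the two stand-ins only meet at the token interface, either one can be swapped or retrained without touching the other, which is the property the abstract refers to as independent scaling.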

Taxonomy

Topics: Robot Manipulation and Learning · Multimodal Machine Learning Applications · Robotic Path Planning Algorithms

Methods: Sparse Evolutionary Training