Coupled Local and Global World Models for Efficient First Order RL
Joseph Amigo, Rooholla Khorrambakht, Nicolas Mansard, Ludovic Righetti

TL;DR
This paper presents a novel RL training method using coupled local and global world models learned from real robot interactions, enabling efficient policy learning in complex, high-dimensional environments without simulators.
Contribution
It introduces a decoupled first-order gradient method that combines a full-scale world model with a lightweight surrogate for efficient RL training in visual environments.
Findings
Outperforms PPO in sample efficiency on manipulation tasks
Successfully applies to ego-centric object manipulation with a quadruped
Demonstrates viability of data-driven world models for complex RL tasks
Abstract
World models offer a promising avenue for more faithfully capturing complex dynamics, including contacts and non-rigidity, as well as complex sensory information, such as visual perception, in situations where standard simulators struggle. However, these models are computationally expensive to evaluate, posing a challenge for popular RL approaches that have been used successfully with simulators to solve complex locomotion tasks yet still struggle with manipulation. This paper introduces a method that bypasses simulators entirely, training RL policies inside world models learned from robots' interactions with real environments. At its core, our approach enables policy training with large-scale diffusion models via a novel decoupled first-order gradient (FoG) method: a full-scale world model generates accurate forward trajectories, while a lightweight latent-space surrogate approximates its…
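The decoupling described in the abstract can be illustrated with a toy one-dimensional example. This is only a sketch under stated assumptions, not the authors' implementation: the abstract is truncated before it says what the surrogate approximates, so the code assumes the surrogate supplies the derivatives used in the backward pass. The functions full_model, surrogate_jacobians, and policy, together with the quadratic cost, are hypothetical stand-ins chosen for illustration. The real method works with a diffusion world model and a latent-space surrogate.

```python
import math

def full_model(s, a):
    # Hypothetical stand-in for the expensive world model.
    # It is only ever called in the forward direction.
    return s + 0.5 * a + 0.1 * math.sin(s)

def surrogate_jacobians(s, a):
    # Assumed derivatives (d next_state / d s, d next_state / d a) of a cheap
    # surrogate g(s, a) = 1.1 * s + 0.5 * a that roughly mimics full_model.
    return 1.1, 0.5

def policy(theta, s):
    # Hypothetical one-parameter linear feedback policy.
    return -theta * s

def rollout(theta, s0, horizon):
    # Forward pass: states come from the full model, not the surrogate.
    states, actions = [s0], []
    for _ in range(horizon):
        a = policy(theta, states[-1])
        actions.append(a)
        states.append(full_model(states[-1], a))
    return states, actions

def cost(states):
    # Illustrative objective: drive the state to zero.
    return sum(s * s for s in states[1:])

def decoupled_grad(theta, s0, horizon):
    # Backward pass: chain rule along the stored forward trajectory,
    # using surrogate Jacobians in place of the full model's derivatives.
    states, actions = rollout(theta, s0, horizon)
    grad, carry = 0.0, 0.0
    for t in reversed(range(horizon)):
        A, B = surrogate_jacobians(states[t], actions[t])
        adj = 2.0 * states[t + 1] + carry   # total d cost / d s_{t+1}
        grad += adj * B * (-states[t])      # d a_t / d theta = -s_t
        carry = adj * (A + B * (-theta))    # d a_t / d s_t = -theta
    return grad
```

A single gradient step such as theta = theta - 0.01 * decoupled_grad(theta, 1.0, 5) can then be checked by rolling out the full model again and comparing cost values. The resulting gradient is only approximate, since the surrogate's derivatives differ from those of the full model.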
Taxonomy
Topics: Robot Manipulation and Learning · Human Motion and Animation · Reinforcement Learning in Robotics
