World Models as an Intermediary between Agents and the Real World
Sherry Yang

TL;DR
This paper advocates for using world models as intermediaries to improve reinforcement learning in high-cost, complex domains by addressing sample inefficiency and providing rich learning signals.
Contribution
It introduces the concept of world models as a solution to high-cost action environments and discusses strategies for their development and integration across various domains.
Findings
World models can mitigate sample inefficiency in long-horizon tasks.
They provide rich learning signals for diverse applications.
Challenges include dataset curation, architecture design, and evaluation.
Abstract
Large language model (LLM) agents trained using reinforcement learning have achieved superhuman performance in low-cost environments like games, mathematics, and coding. However, these successes have not translated to complex domains where the cost of interaction is high, such as the physical cost of running robots, the time cost of ML engineering, and the resource cost of scientific experiments. The true bottleneck for achieving the next level of agent performance for these complex and high-cost domains lies in the expense of executing actions to acquire reward signals. To address this gap, this paper argues that we should use world models as an intermediary between agents and the real world. We discuss how world models, viewed as models of dynamics, rewards, and task distributions, can overcome fundamental barriers of high-cost actions such as extreme off-policy learning and sample…
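The abstract's view of a world model as three components (dynamics, rewards, and a task distribution) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and does not come from the paper: the 1-D line task, the names `ToyWorldModel` and `plan_in_model`, and the greedy one-step planner. The functions are hand-written stand-ins for what would be learned models; the point is only that the agent selects actions by querying the model, without executing any action in a real environment.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyWorldModel:
    """Hypothetical stand-in for a learned world model on a 1-D line."""
    seed: int = 0

    def __post_init__(self):
        self.rng = random.Random(self.seed)

    def sample_task(self):
        # Task distribution: a start position and a goal position.
        return 0, self.rng.randint(3, 6)

    def dynamics(self, state, action):
        # Dynamics model: predicted next state for an action in {-1, +1}.
        return state + action

    def reward(self, state, goal):
        # Reward model: negative distance to the goal.
        return -abs(goal - state)


def plan_in_model(model, horizon=10):
    """Greedy one-step lookahead carried out entirely inside the model."""
    state, goal = model.sample_task()
    total = 0.0
    for _ in range(horizon):
        if state == goal:
            break
        # Score each candidate action by the model's predicted reward.
        action = max(
            (-1, 1),
            key=lambda a: model.reward(model.dynamics(state, a), goal),
        )
        state = model.dynamics(state, action)
        total += model.reward(state, goal)
    return state, goal, total


final_state, goal, ret = plan_in_model(ToyWorldModel(seed=0))
print(final_state == goal, ret)
```

In a realistic setting each of the three methods would be a learned predictor, and only a small number of real interactions would be spent on checking or correcting it.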
Taxonomy
Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)
