In-Context Reinforcement Learning via Communicative World Models
Fernando Martinez-Lopez, Tao Li, Yingdong Lu, Juntao Chen

TL;DR
This paper introduces CORAL, a framework that uses emergent communication between a pre-trained world model and a control agent to improve in-context reinforcement learning, enabling better generalization and zero-shot adaptation in new environments.
Contribution
The work presents a communicative framework that decouples representation learning from control, so that knowledge captured by the pre-trained world model transfers to the control agent through emergent communication.
Findings
Enables zero-shot adaptation in unseen environments.
Improves sample efficiency of RL agents.
Demonstrates an effective emergent communication protocol.
Abstract
Reinforcement learning (RL) agents often struggle to generalize to new tasks and contexts without updating their parameters, mainly because their learned representations and policies are overfit to the specifics of their training environments. To boost agents' in-context RL (ICRL) ability, this work formulates ICRL as a two-agent emergent communication problem and introduces CORAL (Communicative Representation for Adaptive RL), a framework that learns a transferable communicative context by decoupling latent representation learning from control. In CORAL, an Information Agent (IA) is pre-trained as a world model on a diverse distribution of tasks. Its objective is not to maximize task reward, but to build a world model and distill its understanding into concise messages. The emergent communication protocol is shaped by a novel Causal Influence Loss, which measures the effect that the…
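The decoupling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the linear maps, the message dimension, and the choice to feed the control agent both the raw observation and the message are all assumptions made for illustration. The Causal Influence Loss is not reproduced here, since its definition is cut off in the text above.

```python
import math
import random


class InformationAgent:
    """Hypothetical stand-in for the pre-trained world model (IA)."""

    def __init__(self, obs_dim, msg_dim, seed=0):
        rng = random.Random(seed)
        # Assumption: weights are treated as fixed after pre-training.
        self.weights = [[rng.uniform(-1, 1) for _ in range(obs_dim)]
                        for _ in range(msg_dim)]

    def message(self, obs):
        # Compress the observation into a short, bounded message vector.
        return [math.tanh(sum(w * o for w, o in zip(row, obs)))
                for row in self.weights]


class ControlAgent:
    """Hypothetical policy conditioned on the observation and the message."""

    def __init__(self, obs_dim, msg_dim, n_actions, seed=1):
        rng = random.Random(seed)
        in_dim = obs_dim + msg_dim
        self.weights = [[rng.uniform(-1, 1) for _ in range(in_dim)]
                        for _ in range(n_actions)]

    def action_probs(self, obs, msg):
        # Concatenate observation and message, then apply a softmax policy.
        x = list(obs) + list(msg)
        logits = [sum(w * v for w, v in zip(row, x)) for row in self.weights]
        top = max(logits)
        exps = [math.exp(l - top) for l in logits]
        total = sum(exps)
        return [e / total for e in exps]


def act(ia, ca, obs):
    """One step: the IA emits a message, the control agent acts on it."""
    msg = ia.message(obs)
    return msg, ca.action_probs(obs, msg)


if __name__ == "__main__":
    ia = InformationAgent(obs_dim=4, msg_dim=2)
    ca = ControlAgent(obs_dim=4, msg_dim=2, n_actions=3)
    msg, probs = act(ia, ca, [0.1, -0.2, 0.3, 0.0])
    print("message:", msg)
    print("action probabilities:", probs)
```

In this sketch the only channel from the world model to the policy is the message vector, which is the sense in which representation learning is kept separate from control.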
Taxonomy
Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
