GIRL: Generative Imagination Reinforcement Learning via Information-Theoretic Hallucination Control
Prakul Sunil Hiremath

TL;DR
GIRL introduces a novel latent world-model framework that enhances model-based reinforcement learning by anchoring imagined trajectories to a semantically consistent space and restricting imagination drift, leading to improved performance on long-horizon tasks.
Contribution
The paper proposes GIRL, an approach that combines a grounding signal from a frozen foundation model with an uncertainty-adaptive trust region to improve long-horizon planning in model-based RL.
Findings
GIRL reduces latent rollout drift by 38-61% across tasks.
GIRL improves asymptotic return while requiring fewer environment interactions.
GIRL outperforms existing methods such as DreamerV3 and TD-MPC2 on benchmark tasks.
Abstract
Model-based reinforcement learning (MBRL) improves sample efficiency by optimizing policies inside imagined rollouts, but long-horizon planning degrades when model errors compound and imagined trajectories drift off the training manifold. We introduce GIRL (Generative Imagination Reinforcement Learning), a latent world-model framework that addresses this failure mode with two key components. First, a cross-modal grounding signal derived from a frozen foundation model (DINOv2) anchors the latent transition prior to a semantically consistent embedding space, penalizing inconsistent or implausible predictions. Second, an uncertainty-adaptive trust-region bottleneck interprets the KL regularizer as the Lagrange multiplier of a constrained optimization problem, restricting imagination drift within a learned region calibrated by Expected Information Gain and a Relative Performance Loss…
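The abstract's reading of the KL regularizer as a Lagrange multiplier can be illustrated with a generic dual-ascent update. The sketch below is not the paper's implementation: the function names, the fixed radius `epsilon`, and the learning rate are assumptions made for illustration, and the calibration of the region by Expected Information Gain and Relative Performance Loss is not reproduced. It only shows the standard mechanism in which the KL weight grows when the constraint KL <= epsilon is violated and shrinks otherwise.

```python
import numpy as np

def gaussian_kl(mu_q, std_q, mu_p, std_p):
    """KL(q || p) between diagonal Gaussians, summed over latent dimensions."""
    var_q, var_p = std_q ** 2, std_p ** 2
    kl = np.log(std_p / std_q) + (var_q + (mu_q - mu_p) ** 2) / (2.0 * var_p) - 0.5
    return float(np.sum(kl))

def dual_ascent_step(log_beta, kl_value, epsilon, lr=0.1):
    """One ascent step on the multiplier of the constraint KL <= epsilon.

    The multiplier is beta = exp(log_beta), which keeps it positive.
    Updating log_beta directly is a common simplification of the exact
    gradient beta * (kl_value - epsilon); the sign of the step is the same.
    """
    return log_beta + lr * (kl_value - epsilon)

# Hypothetical posterior/prior statistics for a 4-dimensional latent.
mu_q, std_q = np.array([0.5, -0.2, 0.1, 0.0]), np.full(4, 0.8)
mu_p, std_p = np.zeros(4), np.ones(4)

epsilon = 0.1   # assumed fixed trust-region radius
log_beta = 0.0  # beta starts at 1.0

kl = gaussian_kl(mu_q, std_q, mu_p, std_p)
log_beta = dual_ascent_step(log_beta, kl, epsilon)
beta = np.exp(log_beta)

# The world-model loss would then include beta * kl as its regularizer.
print(f"KL = {kl:.4f}, updated beta = {beta:.4f}")
```

In a full training loop the radius would be adapted rather than held constant; here it is fixed so that the multiplier update is easy to follow in isolation.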
