TL;DR
This paper introduces a curiosity-driven exploration method for 3D environments that pairs a persistent world model with episodic context, improving exploration efficiency and generalization in complex, photorealistic settings.
Contribution
The authors combine online 3D reconstruction, which serves as a persistent world model, with sequence modeling of RGB observations, so that a curiosity-driven agent can keep track of where it has already been and steer toward novel regions of a 3D environment.
Findings
Outperforms RL-based active mapping baselines in HM3D.
Generalizes zero-shot to Gibson and AI-generated worlds.
Enables effective downstream task adaptation, such as apple picking.
Abstract
Exploration is a prerequisite for learning useful behaviors in sparse-reward, long-horizon tasks, particularly within 3D environments. Curiosity-driven reinforcement learning addresses this via intrinsic rewards derived from the mismatch between the agent's predictive model of the world and reality. However, translating this intrinsic motivation to complex, photorealistic environments remains difficult, as agents can become trapped in local loops and receive fresh rewards for revisiting forgotten states. In this work, we demonstrate that this failure stems from a lack of spatial persistence and episodic context. We show that effective curiosity requires a model of the world that is persistent and continuously updated, paired with an agent that maintains an episodic trajectory history to navigate toward novel regions. We achieve this using an online 3D reconstruction as a persistent…
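The failure mode described in the abstract, where an agent earns fresh reward for revisiting forgotten states, can be illustrated with a toy sketch. This is not the authors' implementation: the voxel-set "map", the class names PersistentMap and ForgetfulMap, the rollout helper, and the novelty-count reward are all assumptions chosen for illustration, standing in for the paper's online 3D reconstruction and learned predictive model.

```python
from collections import deque
from typing import Deque, Iterable, List, Set, Tuple

Voxel = Tuple[int, int, int]  # hypothetical discretized 3D cell


class PersistentMap:
    """Toy persistent world model: every voxel ever observed is remembered."""

    def __init__(self) -> None:
        self.seen: Set[Voxel] = set()

    def intrinsic_reward(self, observed: Iterable[Voxel]) -> int:
        # Reward is the number of voxels never observed before.
        new = set(observed) - self.seen
        self.seen |= new
        return len(new)


class ForgetfulMap:
    """Toy model that only remembers the last `capacity` observations."""

    def __init__(self, capacity: int) -> None:
        self.recent: Deque[Set[Voxel]] = deque(maxlen=capacity)

    def intrinsic_reward(self, observed: Iterable[Voxel]) -> int:
        obs = set(observed)
        # Union of whatever is still in the short memory window.
        remembered: Set[Voxel] = set().union(*self.recent)
        self.recent.append(obs)
        return len(obs - remembered)


def rollout(model, trajectory: List[Set[Voxel]]) -> List[int]:
    """Return the intrinsic reward collected at each step of a trajectory."""
    return [model.intrinsic_reward(obs) for obs in trajectory]


# A local loop: room A, room B, then back to room A.
ROOM_A: Set[Voxel] = {(0, 0, 0), (1, 0, 0)}
ROOM_B: Set[Voxel] = {(5, 0, 0), (6, 0, 0)}
LOOP: List[Set[Voxel]] = [ROOM_A, ROOM_B, ROOM_A]

if __name__ == "__main__":
    print("persistent:", rollout(PersistentMap(), LOOP))
    print("forgetful: ", rollout(ForgetfulMap(capacity=1), LOOP))
```

In this sketch the persistent map pays nothing for the return to room A (rewards 2, 2, 0), while the forgetful model pays full reward again (2, 2, 2), which is the kind of incentive that can keep an agent circling a local loop.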
