TL;DR
This paper introduces DRAG, a novel auto-encoding method that uses distributionally robust optimization to enhance state space coverage in goal-conditioned reinforcement learning (GCRL), especially in complex visual environments.
Contribution
The paper proposes DRAG, combining $\beta$-VAE with adversarial weighting and distributionally robust optimization to improve exploration and state coverage in online GCRL.
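To make the idea of a reweighted auto-encoder objective concrete, here is a minimal sketch, not the paper's implementation. It assumes per-sample reconstruction errors and KL terms are already computed, and it replaces the learned adversarial weighter with a closed-form softmax over losses (the worst-case reweighting under a KL penalty). The function names, the `temperature` parameter, and the numeric values are all hypothetical and chosen only for illustration.

```python
import math


def beta_vae_loss(recon_error, kl, beta=1.0):
    """Per-sample beta-VAE loss: reconstruction error plus beta-weighted KL term."""
    return recon_error + beta * kl


def dro_weights(losses, temperature=1.0):
    """Softmax weights over per-sample losses, normalized to have mean 1.

    ASSUMPTION: this closed form stands in for a learned adversarial weighter.
    It solves max_q E_q[loss] - temperature * KL(q || uniform) over the batch.
    """
    m = max(losses)  # subtract the max for numerical stability
    exps = [math.exp((l - m) / temperature) for l in losses]
    total = sum(exps)
    n = len(losses)
    return [n * e / total for e in exps]


def weighted_objective(losses, weights):
    """Batch mean of the per-sample losses after reweighting."""
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Hypothetical batch: the third state is poorly reconstructed (rarely visited).
recon = [0.2, 0.3, 2.5]
kl = [0.1, 0.1, 0.4]
losses = [beta_vae_loss(r, k, beta=2.0) for r, k in zip(recon, kl)]

weights = dro_weights(losses, temperature=1.0)
uniform = weighted_objective(losses, [1.0] * len(losses))
robust = weighted_objective(losses, weights)
```

In this sketch the poorly reconstructed state receives the largest weight, so the reweighted objective puts more training pressure on under-represented states than the plain batch average does. Lower values of `temperature` make the reweighting more aggressive.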
Findings
Enhanced state space coverage in maze and robotic tasks
Improved downstream control performance
No pre-training or prior environment knowledge needed
Abstract
Goal-Conditioned Reinforcement Learning (GCRL) enables agents to autonomously acquire diverse behaviors, but faces major challenges in visual environments due to high-dimensional, semantically sparse observations. In the online setting, where agents learn representations while exploring, the latent space evolves with the agent's policy, to capture newly discovered areas of the environment. However, without incentivization to maximize state coverage in the representation, classical approaches based on auto-encoders may converge to latent spaces that over-represent a restricted set of states frequently visited by the agent. This is exacerbated in an intrinsic motivation setting, where the agent uses the distribution encoded in the latent space to sample the goals it learns to master. To address this issue, we propose to progressively enforce distributional shifts towards a uniform…