TL;DR
This paper introduces RC-aux, a lightweight auxiliary objective that better aligns latent world models with goal-directed planning by adding budget-conditioned reachability supervision and multi-horizon prediction.
Contribution
RC-aux is a training-time correction that aligns latent representations with planning needs while leaving the world-model backbone unchanged.
Findings
RC-aux improves planning performance on goal-conditioned pixel-control tasks.
It shapes the latent space to better reflect what is reachable within a finite action budget.
The approach adds only modest computational overhead.
Abstract
A latent world model may achieve accurate short-horizon prediction while still inducing a latent space that is poorly aligned with planning. A key issue is spatiotemporal mismatch: these models are often trained with local predictive supervision, but deployed for long-horizon goal-directed search in latent spaces where Euclidean distance may not reflect what is reachable within a finite action budget. We present the Reachability-Correction auxiliary objective (RC-aux), a lightweight correction for this mismatch in reconstruction-free latent world models. RC-aux keeps the world-model backbone unchanged and adds planning-aligned supervision along two axes. Along the time axis, multi-horizon open-loop prediction trains the model beyond one-step consistency. Along the space axis, budget-conditioned reachability supervision, together with temporal hard negatives, encourages the latent space…
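The two axes described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (multi_horizon_loss, reachability_pairs), the horizon set, the margin parameter, and the toy additive dynamics are all assumptions made for illustration. In particular, treating a state just beyond the budget on the same trajectory as a negative is a heuristic reading of "temporal hard negatives", since such a state could in principle be reachable by a shorter path.

```python
import numpy as np

def multi_horizon_loss(z, actions, step_fn, horizons=(1, 2, 4)):
    """Mean squared error of open-loop rollouts against encoded latents.

    z:       (T+1, D) latents of one observed trajectory (from some encoder).
    actions: (T, A) actions taken between consecutive latents.
    step_fn: hypothetical one-step latent dynamics, (z_t, a_t) -> z_{t+1}.
    """
    T = actions.shape[0]
    max_h = max(horizons)
    errors = []
    for t in range(T):
        pred = z[t]
        for h in range(1, max_h + 1):
            if t + h > T:
                break
            # Open loop: feed the model its own prediction, never z[t + h - 1].
            pred = step_fn(pred, actions[t + h - 1])
            if h in horizons:
                errors.append(np.mean((pred - z[t + h]) ** 2))
    return float(np.mean(errors))

def reachability_pairs(T, budget, margin=1):
    """Label index pairs (i, j) of one trajectory for a given step budget.

    Positive (label 1): 0 < j - i <= budget, so j was reached within budget.
    Assumed hard negative (label 0): budget < j - i <= budget + margin,
    i.e. a state on the same trajectory that lies just outside the budget.
    """
    pairs = []
    for i in range(T + 1):
        for j in range(i + 1, min(i + budget + margin, T) + 1):
            label = 1 if j - i <= budget else 0
            pairs.append((i, j, budget, label))
    return pairs

# Toy check with additive dynamics z_{t+1} = z_t + a_t (an assumption).
rng = np.random.default_rng(0)
actions = rng.normal(size=(6, 3))
z = np.concatenate([np.zeros((1, 3)), np.cumsum(actions, axis=0)])

exact_loss = multi_horizon_loss(z, actions, lambda s, a: s + a)
biased_loss = multi_horizon_loss(z, actions, lambda s, a: s + 0.9 * a)
pairs = reachability_pairs(T=4, budget=2, margin=1)
```

In this toy setup the exact dynamics give a loss of essentially zero, while the biased model accumulates error over longer horizons, which is the signal a one-step loss alone would underweight. The labelled pairs could then feed a budget-conditioned classifier or a distance-based loss on the latents; which of these the paper uses is not stated in the abstract.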
