Beyond Euclidean Proximity: Repairing Latent World Models with Horizon-Matched Trajectory Reachability Metrics
Liangyu Li, Shengzhi Wang, Qingwen Liu

TL;DR
This paper introduces trajectory reachability metrics (TRM), a post-hoc terminal-ranking method for fixed latent world models. TRM estimates long-horizon reachability more faithfully than raw latent distance, which raises planning success rates on a hard TwoRoom benchmark.
Contribution
TRM provides a horizon-aware, post-hoc terminal-ranking method that replaces or hybridizes raw latent distance metrics, improving planning success in latent world models.
Findings
TRM reaches 97.0% success on the TwoRoom benchmark, compared with 7.0% for raw latent planning.
TRM raises baseline success from 32.7% to 84.0% across three seeds.
Mechanistic analysis shows TRM aligns better with true reachability than raw latent MSE.
Abstract
Latent world models can contain the state needed for control, yet their terminal-cost interface can expose the planner to the wrong decision-relevant information. In common latent MPC, candidate sequences are ranked by Euclidean distance between predicted terminal and goal latent states; this assumes that raw latent distance weights reachability-relevant variables correctly. We propose trajectory reachability metrics (TRM), a post-hoc terminal-ranking method for fixed latent world models. TRM trains a small pairwise head from logged trajectory structure and uses it as a replacement or hybrid cost; the encoder, dynamics, sampler, optimizer, and evaluation manifests remain fixed. The key design choice is horizon-aware supervision: the metric is trained on broad, balanced temporal separations to match the long-horizon terminal candidate ranking problem. On a hard TwoRoom benchmark, raw…
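The terminal-ranking idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `alpha` mixing weight, the uniform-gap sampling rule, and the hand-set `toy_head` (standing in for a trained pairwise head) are all assumptions made for illustration. It shows how raw Euclidean ranking can prefer the wrong terminal candidate when a nuisance latent dimension dominates the distance, and one way to draw training pairs with balanced temporal separations.

```python
import numpy as np


def raw_latent_cost(z_terminal, z_goal):
    """Squared Euclidean distance from each terminal latent to the goal latent."""
    return np.sum((z_terminal - z_goal) ** 2, axis=-1)


def toy_head(z_terminal, z_goal):
    """Hand-set stand-in for a learned pairwise head (assumption, not trained).

    It weights only latent dimension 0, which this toy example treats as the
    reachability-relevant variable.
    """
    weights = np.array([1.0, 0.0])
    return np.sum(weights * (z_terminal - z_goal) ** 2, axis=-1)


def hybrid_cost(z_terminal, z_goal, metric_head, alpha=1.0):
    """Hypothetical mix of learned and raw cost.

    alpha=1.0 uses the head as a full replacement; alpha=0.0 recovers raw distance.
    """
    learned = metric_head(z_terminal, z_goal)
    return alpha * learned + (1.0 - alpha) * raw_latent_cost(z_terminal, z_goal)


def sample_balanced_pairs(traj_len, max_gap, n_pairs, rng):
    """Sample index pairs (i, j) with gap j - i drawn uniformly from 1..max_gap.

    Uniform gaps are one assumed reading of 'broad, balanced temporal separations'.
    """
    assert 1 <= max_gap < traj_len
    gaps = rng.integers(1, max_gap + 1, size=n_pairs)
    starts = rng.integers(0, traj_len - gaps)  # per-pair upper bound keeps j in range
    return starts, starts + gaps, gaps


# Two candidate terminal latents and a goal at the origin.
z_goal = np.zeros(2)
candidates = np.array([
    [0.1, 5.0],  # close in the relevant dimension, far in the nuisance dimension
    [2.0, 0.0],  # far in the relevant dimension, close in the nuisance dimension
])

raw_choice = int(np.argmin(raw_latent_cost(candidates, z_goal)))
trm_choice = int(np.argmin(hybrid_cost(candidates, z_goal, toy_head, alpha=1.0)))
print("raw picks candidate", raw_choice, "| head picks candidate", trm_choice)
```

In this toy setup the raw distance ranks the second candidate first because the nuisance dimension dominates, while the stand-in head ranks the first candidate first. The encoder, dynamics, and sampler are untouched; only the scoring function applied to terminal candidates changes.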
