ELVIS: Ensemble-Calibrated Latent Imagination for Long-Horizon Visual MPC

Yurui Du; Pinhao Song; Yutong Hu; Renaud Detry

arXiv:2605.04709 · cs.LG · May 7, 2026

TL;DR

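ELVIS is a latent model predictive controller for long-horizon visual planning. It keeps multiple action hypotheses alive with a Gaussian-mixture MPPI and stabilizes deep imagination with an ensemble-based, uncertainty-aware lambda-return. It outperforms TD-MPC2 and DreamerV3 on DeepMind Control Suite tasks and transfers zero-shot to a real-world sand-spraying task.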

Contribution

The paper proposes ELVIS, a new approach combining Gaussian-mixture MPPI and ensemble-based uncertainty estimation to improve long-horizon visual control in model-based RL.
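To make "ensemble-based uncertainty estimation" concrete, here is a minimal sketch of one common way to turn an ensemble of value estimates into an upper-confidence-bound score: the ensemble mean plus a multiple of the ensemble spread. The function name `ucb_score`, the coefficient `beta`, and the choice of population standard deviation are assumptions made for illustration; the exact score ELVIS uses is not given in this summary.

```python
import statistics

def ucb_score(values, beta=1.0):
    """Ensemble mean plus beta times the ensemble (population) std.

    `values` holds one value estimate per ensemble member for the same
    latent state. `beta` is a hypothetical exploration coefficient.
    """
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    return mean + beta * spread

# Members that agree give a score equal to their common estimate;
# members that disagree push the score above the mean.
agree = ucb_score([2.0, 2.0, 2.0], beta=0.5)
disagree = ucb_score([1.0, 2.0, 3.0], beta=0.5)
```

Under this assumed form, `beta` controls how strongly disagreement between ensemble members raises the score, and `beta=0` recovers the plain ensemble mean.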

Findings

01. ELVIS outperforms TD-MPC2 and DreamerV3 on DeepMind Control Suite tasks.

02. ELVIS achieves zero-shot transfer to a real-world sand-spraying task.

03. ELVIS improves surface-quality metrics and robustness in occluded environments.

Abstract

A central challenge of visual control with model-based reinforcement learning (RL) is reliable long-horizon planning: long rollouts with learned latent dynamics exhibit branching futures and multi-modal action-value distributions. In addition, compounding model errors amplified by visual occlusions make deep imagination brittle. We present ELVIS, a latent model predictive controller (MPC) designed to make long-horizon planning practical. ELVIS plans in a Dreamer-style recurrent state space model (RSSM) and replaces standard unimodal model predictive path integral (MPPI) with a Gaussian-mixture MPPI that maintains multiple coherent hypotheses over long horizons, avoiding mode averaging under branching rollouts. In parallel, ELVIS stabilizes deep imagination with a shared uncertainty-aware lambda-return: an ensemble of latent critics defines an upper-confidence-bound (UCB) score that…
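The mode-averaging problem mentioned in the abstract can be illustrated with a toy example. The sketch below is not the ELVIS planner: it runs an MPPI-style softmax-weighted update on a one-dimensional action with a hand-written bimodal reward, and every name in it (`reward`, `gm_mppi`, `temperature`, `min_std`) is hypothetical. Each mixture component is updated from its own samples, so two components can settle on two different modes. A single Gaussian started between the modes would have to either straddle both or collapse onto one.

```python
import math
import random

def reward(a):
    # Toy bimodal objective with equally good modes at a = -2 and a = +2.
    return max(math.exp(-(a - 2.0) ** 2), math.exp(-(a + 2.0) ** 2))

def gm_mppi(means, stds, n_samples=256, n_iters=20, temperature=0.1,
            min_std=0.05, seed=0):
    """MPPI-style update applied independently to each mixture component."""
    rng = random.Random(seed)
    means, stds = list(means), list(stds)
    weights = [1.0 / len(means)] * len(means)
    for _ in range(n_iters):
        # Draw the same number of candidate actions from every component.
        samples = [[rng.gauss(m, s) for _ in range(n_samples)]
                   for m, s in zip(means, stds)]
        returns = [[reward(a) for a in comp] for comp in samples]
        # Shared baseline keeps the exponentials comparable across components.
        best = max(max(r) for r in returns)
        mass = []
        for k, (comp, rets) in enumerate(zip(samples, returns)):
            w = [math.exp((r - best) / temperature) for r in rets]
            total = sum(w)
            mean = sum(wi * a for wi, a in zip(w, comp)) / total
            var = sum(wi * (a - mean) ** 2 for wi, a in zip(w, comp)) / total
            means[k] = mean
            stds[k] = max(math.sqrt(var), min_std)  # floor avoids collapse
            mass.append(total)
        # Mixture weights are reported only; they do not reallocate samples.
        z = sum(mass)
        weights = [m / z for m in mass]
    return means, stds, weights

means, stds, weights = gm_mppi(means=[-1.0, 1.0], stds=[0.5, 0.5])
```

In a real planner the scalar action would be a full action sequence and `reward` would be a return estimated by rolling out the learned latent dynamics; those parts are omitted here to keep the sketch self-contained.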
