InDRiVE: Reward-Free World-Model Pretraining for Autonomous Driving via Latent Disagreement
Feeza Khan Khanzada, Jaerock Kwon

TL;DR
This paper introduces InDRiVE, a reward-free pretraining method for autonomous driving that uses latent ensemble disagreement as intrinsic motivation, improving zero-shot transfer and few-shot adaptation in unseen environments.
Contribution
InDRiVE is the first to leverage latent ensemble disagreement as an intrinsic motivation for reward-free pretraining in autonomous driving, enhancing transfer robustness.
Findings
Improved zero-shot robustness in unseen towns and routes.
Enhanced few-shot collision avoidance under town shift.
Disagreement-based pretraining outperforms reward-based methods.
Abstract
Model-based reinforcement learning (MBRL) can reduce interaction cost for autonomous driving by learning a predictive world model, but it typically still depends on task-specific rewards that are difficult to design and often brittle under distribution shift. This paper presents InDRiVE, a DreamerV3-style MBRL agent that performs reward-free pretraining in CARLA using only intrinsic motivation derived from latent ensemble disagreement. Disagreement acts as a proxy for epistemic uncertainty and drives the agent toward under-explored driving situations, while an imagination-based actor-critic learns a planner-free exploration policy directly from the learned world model. After intrinsic pretraining, we evaluate zero-shot transfer by freezing all parameters and deploying the pretrained exploration policy in unseen towns and routes. We then study few-shot adaptation by training a task…
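The disagreement signal described in the abstract can be illustrated with a minimal sketch. It assumes a small ensemble of linear heads that each predict the next latent from the current latent and action, and it scores disagreement as the variance across their predictions. This is one common formulation, not necessarily the one used in the paper. The function names, dimensions, and linear heads are hypothetical and chosen only for illustration.

```python
import numpy as np

def init_ensemble(rng, n_models, in_dim, out_dim, scale=0.1):
    """Create K independent linear heads (hypothetical stand-ins for the
    latent predictors; the paper's actual architecture is not shown here)."""
    return [
        (scale * rng.standard_normal((in_dim, out_dim)), np.zeros(out_dim))
        for _ in range(n_models)
    ]

def disagreement_reward(ensemble, latent, action):
    """Intrinsic reward: variance across ensemble predictions,
    averaged over the predicted latent dimensions."""
    x = np.concatenate([latent, action], axis=-1)        # (batch, in_dim)
    preds = np.stack([x @ w + b for w, b in ensemble])   # (K, batch, out_dim)
    return preds.var(axis=0).mean(axis=-1)               # (batch,)

# Assumed toy sizes, for illustration only.
rng = np.random.default_rng(0)
latent_dim, action_dim, n_models = 8, 2, 5
ensemble = init_ensemble(rng, n_models, latent_dim + action_dim, latent_dim)

latent = rng.standard_normal((4, latent_dim))
action = rng.standard_normal((4, action_dim))
reward = disagreement_reward(ensemble, latent, action)   # one value per batch item
```

In this sketch the reward is high where the heads disagree, which is how the agent is steered toward under-explored situations, and it falls to zero when all heads make the same prediction.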
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning
