InstaDrive: Instance-Aware Driving World Models for Realistic and Consistent Video Generation
Zhuoran Yang, Xi Guo, Chenjing Ding, Chiyu Wang, Wei Wu, Yanyong Zhang

TL;DR
InstaDrive introduces instance-aware mechanisms to improve the realism, temporal consistency, and spatial fidelity of driving video generation, benefiting autonomous driving applications.
Contribution
The paper presents InstaDrive, a framework that combines an Instance Flow Guider, for temporal consistency, with a Spatial Geometric Aligner, for spatial fidelity, to make generated driving videos more realistic and consistent.
Findings
Achieves state-of-the-art video generation quality.
Improves downstream autonomous driving task performance.
Enables rigorous safety evaluation with simulated scenarios.
Abstract
Autonomous driving relies on robust models trained on high-quality, large-scale multi-view driving videos. While world models offer a cost-effective solution for generating realistic driving videos, they struggle to maintain instance-level temporal consistency and spatial geometric fidelity. To address these challenges, we propose InstaDrive, a novel framework that enhances driving video realism through two key advancements: (1) the Instance Flow Guider, which extracts and propagates instance features across frames to enforce temporal consistency, preserving instance identity over time; and (2) the Spatial Geometric Aligner, which improves spatial reasoning, ensures precise instance positioning, and explicitly models occlusion hierarchies. By incorporating these instance-aware mechanisms, InstaDrive achieves state-of-the-art video generation quality and enhances downstream autonomous driving tasks…
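The two ideas in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `Box` type, the `render_instance_map` and `propagate` helpers, the grid representation, and the depth-sorted painting rule are all assumptions made for illustration. The sketch reuses one feature vector per instance ID in every frame (a stand-in for propagating instance features over time) and paints boxes from farthest to nearest so that nearer instances overwrite farther ones (a stand-in for an occlusion hierarchy).

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical 2D box for one instance in one frame."""
    instance_id: int
    x0: int
    y0: int
    x1: int  # exclusive
    y1: int  # exclusive
    depth: float  # distance from the camera; larger means farther away


def render_instance_map(boxes, features, height, width):
    """Paint each instance's feature into its box, farthest instance first.

    Nearer instances are painted later, so they overwrite farther ones
    wherever boxes overlap. Cells covered by no box stay None.
    """
    grid = [[None] * width for _ in range(height)]
    for box in sorted(boxes, key=lambda b: b.depth, reverse=True):
        for y in range(max(box.y0, 0), min(box.y1, height)):
            for x in range(max(box.x0, 0), min(box.x1, width)):
                grid[y][x] = features[box.instance_id]
    return grid


def propagate(frames, first_frame_features, height, width):
    """Reuse the same per-instance features for every frame's boxes."""
    return [
        render_instance_map(boxes, first_frame_features, height, width)
        for boxes in frames
    ]


# Toy data: instance 1 (far) moves right by one cell; instance 2 (near) is static.
features = {1: (0.9, 0.1), 2: (0.2, 0.8)}
frames = [
    [Box(1, 0, 0, 2, 2, depth=10.0), Box(2, 1, 0, 3, 2, depth=5.0)],
    [Box(1, 1, 0, 3, 2, depth=10.0), Box(2, 1, 0, 3, 2, depth=5.0)],
]
maps = propagate(frames, features, height=2, width=4)
```

In this toy setup, the far instance keeps the same feature vector as it moves, and wherever the two boxes overlap the nearer instance's feature is the one that remains in the map.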
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Autonomous Vehicle Technology and Safety · Multimodal Machine Learning Applications
