TL;DR
HMPDM is a diffusion-based video prediction model for autonomous driving that uses historical motion priors to improve temporal coherence and visual quality; it reports a 28.2% improvement in FVD on the Cityscapes benchmark.
Contribution
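The paper introduces HMPDM, a diffusion model built around three key designs that leverage historical motion priors for driving scene prediction, including a Temporal-aware Latent Conditioning (TaLC) module and a Motion-aware Pyramid Encoder (MaPE).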
Findings
Achieves a 28.2% improvement in FVD on the Cityscapes benchmark.
Outperforms state-of-the-art methods in video prediction accuracy.
Demonstrates efficiency and stability in iterative denoising.
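For readers unfamiliar with how a percentage improvement in FVD is usually read: FVD (Fréchet Video Distance) is a lower-is-better metric, so an improvement is commonly reported as the relative reduction against a baseline score. The sketch below assumes that convention; the summary does not state which baseline or formula the paper uses. The function name `relative_improvement` and both scores are hypothetical placeholders, not values from the paper.

```python
def relative_improvement(baseline: float, candidate: float) -> float:
    """Percent reduction of a lower-is-better metric such as FVD.

    Assumes improvement is defined as (baseline - candidate) / baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - candidate) / baseline * 100.0


if __name__ == "__main__":
    # Hypothetical placeholder scores, NOT taken from the paper.
    baseline_fvd = 100.0
    candidate_fvd = 71.8
    print(f"{relative_improvement(baseline_fvd, candidate_fvd):.1f}%")
```

Under this reading, a 28.2% improvement means the model's FVD is about 71.8% of the baseline's FVD, whatever the absolute scores are.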
Abstract
Video prediction is a useful function for autonomous driving, enabling intelligent vehicles to reliably anticipate how driving scenes will evolve and thereby supporting reasoning and safer planning. However, existing models are constrained by multi-stage training pipelines and remain insufficient in modeling the diverse motion patterns in real driving scenes, leading to degraded temporal consistency and visual quality. To address these challenges, this paper introduces the historical motion priors-informed diffusion model (HMPDM), a video prediction model that leverages historical motion priors to enhance motion understanding and temporal coherence. The proposed deep learning system introduces three key designs: (i) a Temporal-aware Latent Conditioning (TaLC) module for implicit historical motion injection; (ii) a Motion-aware Pyramid Encoder (MaPE) for multi-scale motion…
