Physics-informed Diffusion Mamba Transformer for Real-world Driving
Hang Zhou, Qiang Zhang, Peiran Liu, Yihao Qin, Zhaoxu Yan, Yiding Ji

TL;DR
This paper presents a novel diffusion transformer model for autonomous driving that effectively captures temporal dependencies and physical laws, improving trajectory prediction accuracy and safety.
Contribution
It introduces a diffusion Mamba Transformer architecture with integrated physical constraints, advancing multi-modal motion prediction in autonomous driving.
Findings
Outperforms state-of-the-art models in accuracy
Enhances physical plausibility of predictions
Improves robustness in diverse driving scenarios
Abstract
Autonomous driving systems demand trajectory planners that not only model the inherent uncertainty of future motions but also respect complex temporal dependencies and underlying physical laws. While diffusion-based generative models excel at capturing multi-modal distributions, they often fail to incorporate long-term sequential contexts and domain-specific physical priors. In this work, we bridge these gaps with two key innovations. First, we introduce a Diffusion Mamba Transformer architecture that embeds Mamba and attention into the diffusion process, enabling more effective aggregation of sequential input contexts from sensor streams and past motion histories. Second, we design a Port-Hamiltonian Neural Network module that seamlessly integrates energy-based physical constraints into the diffusion model, thereby enhancing trajectory predictions with both consistency and…
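For readers unfamiliar with the port-Hamiltonian formalism the abstract mentions, the following is a minimal sketch of the generic textbook form, x_dot = (J - R) grad H(x) + G u, applied to a damped mass-spring oscillator. It is not the authors' Port-Hamiltonian Neural Network: the toy system, the constants M, K, C, and all function names are assumptions chosen for illustration. The point it shows is that a skew-symmetric J and a positive semi-definite R guarantee the stored energy H cannot grow when the input u is zero, which is the kind of energy-based constraint such a module can encode.

```python
# Toy port-Hamiltonian system: a damped mass-spring oscillator.
# State x = (q, p); H(q, p) = p^2 / (2M) + K q^2 / 2.
# All constants and names are illustrative assumptions, not from the paper.

M, K, C = 1.0, 1.0, 0.5  # mass, spring stiffness, damping (hypothetical values)

J = [[0.0, 1.0], [-1.0, 0.0]]  # skew-symmetric interconnection matrix
R = [[0.0, 0.0], [0.0, C]]     # positive semi-definite dissipation matrix


def hamiltonian(q, p):
    """Total stored energy: kinetic plus potential."""
    return p * p / (2.0 * M) + 0.5 * K * q * q


def grad_hamiltonian(q, p):
    """Gradient of H with respect to (q, p)."""
    return [K * q, p / M]


def dynamics(q, p, u=0.0):
    """x_dot = (J - R) grad H(x) + G u, with G = [0, 1]^T (force input)."""
    g = grad_hamiltonian(q, p)
    q_dot = (J[0][0] - R[0][0]) * g[0] + (J[0][1] - R[0][1]) * g[1]
    p_dot = (J[1][0] - R[1][0]) * g[0] + (J[1][1] - R[1][1]) * g[1] + u
    return q_dot, p_dot


def energy_rate(q, p, u=0.0):
    """dH/dt along the flow, computed as grad H . x_dot."""
    g = grad_hamiltonian(q, p)
    q_dot, p_dot = dynamics(q, p, u)
    return g[0] * q_dot + g[1] * p_dot


def simulate(q, p, dt=1e-3, steps=5000):
    """Explicit Euler rollout with zero input; returns the final state."""
    for _ in range(steps):
        q_dot, p_dot = dynamics(q, p)
        q, p = q + dt * q_dot, p + dt * p_dot
    return q, p
```

With u = 0, `energy_rate(q, p)` reduces to -C * (p / M)**2, which is never positive. Calling `simulate(1.0, 0.0)` and comparing `hamiltonian` before and after shows the energy decaying over the rollout.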
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Human Motion and Animation · Generative Adversarial Networks and Image Synthesis
