Physics-informed Diffusion Mamba Transformer for Real-world Driving

Hang Zhou; Qiang Zhang; Peiran Liu; Yihao Qin; Zhaoxu Yan; Yiding Ji

arXiv:2602.00808 · cs.RO · February 3, 2026

TL;DR

This paper presents a diffusion Mamba Transformer for autonomous-driving trajectory planning that captures temporal dependencies and incorporates physical constraints through a Port-Hamiltonian Neural Network module, improving trajectory prediction accuracy and safety.

Contribution

It introduces a diffusion Mamba Transformer architecture with integrated physical constraints, advancing multi-modal motion prediction in autonomous driving.

Findings

01. Outperforms state-of-the-art models in accuracy
02. Enhances physical plausibility of predictions
03. Improves robustness in diverse driving scenarios

Abstract

Autonomous driving systems demand trajectory planners that not only model the inherent uncertainty of future motions but also respect complex temporal dependencies and underlying physical laws. While diffusion-based generative models excel at capturing multi-modal distributions, they often fail to incorporate long-term sequential contexts and domain-specific physical priors. In this work, we bridge these gaps with two key innovations. First, we introduce a Diffusion Mamba Transformer architecture that embeds Mamba and attention into the diffusion process, enabling more effective aggregation of sequential input contexts from sensor streams and past motion histories. Second, we design a Port-Hamiltonian Neural Network module that integrates energy-based physical constraints into the diffusion model, thereby enhancing trajectory predictions with both consistency and…
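The abstract names a Port-Hamiltonian Neural Network module but does not give its formulation in the text available here. As a rough illustration of what "energy-based physical constraints" can mean in general, the sketch below simulates the textbook port-Hamiltonian form dx/dt = (J - R) ∇H(x) + G u on a damped mass-spring system. This is not the paper's model: the system, the constants `m`, `k`, `c`, and all function names (`grad_H`, `ph_dynamics`, `rk4_step`, `rollout`) are assumptions chosen for illustration, and the Hamiltonian is hand-written rather than learned by a network.

```python
# Illustrative port-Hamiltonian system (damped mass-spring), NOT the paper's model.
# State x = (q, p): position and momentum. All constants are assumed values.

def hamiltonian(x, m=1.0, k=1.0):
    """Total energy H(q, p) = p^2 / (2m) + k q^2 / 2."""
    q, p = x
    return p * p / (2.0 * m) + 0.5 * k * q * q

def grad_H(x, m=1.0, k=1.0):
    """Gradient of H with respect to (q, p)."""
    q, p = x
    return (k * q, p / m)

def ph_dynamics(x, u=0.0, c=0.5):
    """dx/dt = (J - R) grad_H(x) + G u, with
    J = [[0, 1], [-1, 0]]  skew-symmetric interconnection,
    R = [[0, 0], [0, c]]   positive semidefinite dissipation,
    G = [0, 1]^T           external force enters the momentum equation."""
    dH_dq, dH_dp = grad_H(x)
    dq = dH_dp
    dp = -dH_dq - c * dH_dp + u
    return (dq, dp)

def rk4_step(x, dt, u=0.0, c=0.5):
    """One classical fourth-order Runge-Kutta step of the dynamics above."""
    def shifted(a, b, s):
        return (a[0] + s * b[0], a[1] + s * b[1])
    k1 = ph_dynamics(x, u, c)
    k2 = ph_dynamics(shifted(x, k1, dt / 2.0), u, c)
    k3 = ph_dynamics(shifted(x, k2, dt / 2.0), u, c)
    k4 = ph_dynamics(shifted(x, k3, dt), u, c)
    return (
        x[0] + dt / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]),
        x[1] + dt / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]),
    )

def rollout(x0, steps=1000, dt=0.01, u=0.0, c=0.5):
    """Integrate from x0 and record the energy after every step."""
    x = x0
    energies = [hamiltonian(x)]
    for _ in range(steps):
        x = rk4_step(x, dt, u, c)
        energies.append(hamiltonian(x))
    return x, energies

if __name__ == "__main__":
    _, damped = rollout((1.0, 0.0))
    _, undamped = rollout((1.0, 0.0), c=0.0)
    print("damped:   H(0) = %.4f, H(T) = %.4f" % (damped[0], damped[-1]))
    print("undamped: H(0) = %.4f, H(T) = %.4f" % (undamped[0], undamped[-1]))
```

With zero input, the structure gives dH/dt = -∇Hᵀ R ∇H ≤ 0, so the stored energy can only be dissipated, never created. A learned variant would typically replace the hand-written `hamiltonian` with a neural network while keeping J skew-symmetric and R positive semidefinite; whether the paper does exactly this is not stated in the truncated abstract.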

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Human Motion and Animation · Generative Adversarial Networks and Image Synthesis