DiffSemanticFusion: Semantic Raster BEV Fusion for Autonomous Driving via Online HD Map Diffusion

Zhigang Sun; Yiru Wang; Anqing Jiang; Shuo Wang; Yu Gao; Yuwen Heng; Shouyi Zhang; An He; Hao Jiang; Jinhao Chai; Zichong Gu; Wang Jijun; Shichen Tang; Lavdim Halilaj; Juergen Luettin; Hao Sun

arXiv:2508.01778 · cs.CV · August 5, 2025

Open Access

TL;DR

DiffSemanticFusion fuses raster and graph-based map representations and adds a map diffusion module, improving nuScenes trajectory prediction by 5.1% and autonomous driving performance in NavHard scenarios by 15%.

Contribution

The paper proposes DiffSemanticFusion, a new multimodal fusion framework with a map diffusion module that enhances online HD map stability and expressiveness for autonomous driving.

Findings

01. Achieves a 5.1% performance improvement on nuScenes trajectory prediction.

02. Attains a 15% performance gain in NavHard scenarios for autonomous driving.

03. The map diffusion module can be integrated into other vector-based approaches to boost performance.

Abstract

Autonomous driving requires accurate scene understanding, including road geometry, traffic agents, and their semantic relationships. In online HD map generation scenarios, raster-based representations are well-suited to vision models but lack geometric precision, while graph-based representations retain structural detail but become unstable without precise maps. To harness the complementary strengths of both, we propose DiffSemanticFusion -- a fusion framework for multimodal trajectory prediction and planning. Our approach reasons over a semantic raster-fused BEV space, enhanced by a map diffusion module that improves both the stability and expressiveness of online HD map representations. We validate our framework on two downstream tasks: trajectory prediction and planning-oriented end-to-end autonomous driving. Experiments on real-world autonomous driving benchmarks, nuScenes and…
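The raster versus graph trade-off described in the abstract can be illustrated with a minimal sketch. This is not the paper's pipeline; the function names, the 0.5 m resolution, and the 10 m x 10 m ego-centric window are all assumptions chosen for illustration. It rasterizes a vector lane polyline into a BEV occupancy grid and measures how far each vertex can drift once it is snapped to a cell centre, which is the geometric precision a raster gives up.

```python
import math

# Hypothetical helper names and grid parameters, chosen only for this sketch.

def rasterize_polyline(points, grid_size=20, resolution=0.5, extent=5.0):
    """Mark the BEV cells touched by a polyline given in ego-centric metres.

    The grid covers [-extent, extent) in x and y; resolution is metres per cell.
    """
    grid = [[0] * grid_size for _ in range(grid_size)]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        # Sample every half cell so consecutive samples land in the same
        # or a neighbouring cell and the rasterized line stays connected.
        steps = max(1, math.ceil(2 * length / resolution))
        for i in range(steps + 1):
            t = i / steps
            x = x0 + t * (x1 - x0)
            y = y0 + t * (y1 - y0)
            col = math.floor((x + extent) / resolution)
            row = math.floor((y + extent) / resolution)
            if 0 <= row < grid_size and 0 <= col < grid_size:
                grid[row][col] = 1
    return grid


def max_vertex_error(points, resolution=0.5, extent=5.0):
    """Largest distance between a vertex and the centre of the cell holding it."""
    worst = 0.0
    for x, y in points:
        col = math.floor((x + extent) / resolution)
        row = math.floor((y + extent) / resolution)
        cx = -extent + (col + 0.5) * resolution
        cy = -extent + (row + 0.5) * resolution
        worst = max(worst, math.hypot(x - cx, y - cy))
    return worst


# A made-up lane centreline in metres, ego vehicle at the origin.
lane = [(-4.0, -1.3), (0.0, -1.1), (4.0, 0.2)]
grid = rasterize_polyline(lane)
occupied = sum(sum(row) for row in grid)
error = max_vertex_error(lane)
print(f"occupied cells: {occupied}, max vertex error: {error:.3f} m")
```

The vertex error is bounded by half the cell diagonal (resolution times sqrt(2) / 2), so a finer grid recovers precision at the cost of a larger image, whereas the original polyline keeps exact coordinates but has no fixed-size, image-like layout.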

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Automated Road and Building Extraction · Advanced Neural Network Applications