RESAR-BEV: An Explainable Progressive Residual Autoregressive Approach for Camera-Radar Fusion in BEV Segmentation
Zhiwen Zeng, Yunfei Yin, Zheng Yuan, Argho Dey, and Xianjian Bao

TL;DR
RESAR-BEV introduces an explainable, progressive residual autoregressive framework for BEV segmentation that improves accuracy, robustness, and interpretability in autonomous driving scenarios.
Contribution
It presents a novel progressive refinement approach with residual autoregressive learning and dual-path encoding, advancing beyond single-step end-to-end methods.
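The coarse-to-fine idea behind residual learning can be illustrated with a small sketch. This summary does not say how RESAR-BEV builds its stage-wise targets, so the code below assumes a Laplacian-pyramid-style scheme: a coarse base plus one residual per refinement stage, which sum back to the full-resolution target. The function names (`downsample`, `upsample`, `decompose`, `reconstruct`) and the pooling factors are hypothetical, chosen only for illustration.

```python
def downsample(grid, factor):
    """Average-pool a square 2D grid by `factor` (size must be divisible by it)."""
    n = len(grid) // factor
    return [
        [
            sum(grid[i * factor + a][j * factor + b]
                for a in range(factor) for b in range(factor)) / factor ** 2
            for j in range(n)
        ]
        for i in range(n)
    ]


def upsample(grid, factor):
    """Nearest-neighbour upsampling of a square 2D grid by `factor`."""
    size = len(grid) * factor
    return [[grid[i // factor][j // factor] for j in range(size)]
            for i in range(size)]


def decompose(target, factors):
    """Split `target` into a coarse base plus one residual per later stage.

    `factors` lists pooling factors from coarsest to finest and ends in 1
    (an assumed schedule, not taken from the paper).
    """
    size = len(target)
    approximations = [upsample(downsample(target, f), f) for f in factors]
    base = approximations[0]
    residuals = [
        [[cur[i][j] - prev[i][j] for j in range(size)] for i in range(size)]
        for prev, cur in zip(approximations, approximations[1:])
    ]
    return base, residuals


def reconstruct(base, residuals):
    """Add every residual back onto the coarse base."""
    out = [row[:] for row in base]
    for residual in residuals:
        for i, row in enumerate(residual):
            for j, value in enumerate(row):
                out[i][j] += value
    return out
```

Because the residuals telescope, `reconstruct(*decompose(mask, [4, 2, 1]))` returns the original 4x4 mask. In a cascaded model, each stage would be trained to predict one of these residuals rather than the whole map at once.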
Findings
Achieves 54.0% mIoU on the nuScenes benchmark.
Maintains real-time performance at 14.6 FPS.
Demonstrates robustness in adverse weather and long-range perception.
Abstract
Bird's-Eye-View (BEV) semantic segmentation provides comprehensive environmental perception for autonomous driving but suffers from multi-modal misalignment and sensor noise. We propose RESAR-BEV, a progressive refinement framework that advances beyond single-step end-to-end approaches: (1) progressive refinement through residual autoregressive learning that decomposes BEV segmentation into interpretable coarse-to-fine stages via our cascaded Drive-Transformer and Modifier-Transformer residual prediction architecture, (2) robust BEV representation combining ground-proximity voxels with adaptive height offsets and dual-path voxel feature encoding (max+attention pooling) for efficient feature extraction, and (3) decoupled supervision with offline ground-truth decomposition and online joint optimization to prevent overfitting while ensuring structural coherence. Experiments on nuScenes…
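To make the dual-path voxel feature encoding (max+attention pooling) more concrete, here is a minimal sketch of the two pooling paths applied to the point features inside one voxel. It is an illustration under assumptions, not the authors' implementation: the abstract does not say how the two paths are merged, so concatenation is assumed, and the `query` vector stands in for attention parameters that would be learned in practice. All function names are hypothetical.

```python
import math


def max_pool(points):
    """Element-wise maximum over the point features in one voxel."""
    return [max(column) for column in zip(*points)]


def attention_pool(points, query):
    """Softmax-weighted average of point features.

    Scores are dot products between each point and `query`
    (a hypothetical stand-in for learned attention weights).
    """
    scores = [sum(p * q for p, q in zip(point, query)) for point in points]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dims = len(points[0])
    return [sum(w * point[d] for w, point in zip(weights, points))
            for d in range(dims)]


def dual_path_encode(points, query):
    """Concatenate the max-pooled and attention-pooled descriptors (assumed merge)."""
    return max_pool(points) + attention_pool(points, query)
```

Max pooling keeps the strongest response per channel, while attention pooling keeps a weighted summary of all points, so the combined descriptor is less sensitive to a single noisy return than max pooling alone.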
Taxonomy
Topics: Advanced Neural Network Applications · Autonomous Vehicle Technology and Safety · Visual Attention and Saliency Detection
