Fusion4CA: Boosting 3D Object Detection via Comprehensive Image Exploitation

Kang Luo; Xin Chen; Yangyi Xiao; Hesheng Wang

arXiv:2603.05305 · cs.CV · March 6, 2026

Open Access

TL;DR

Fusion4CA improves LiDAR-camera 3D object detection by exploiting RGB data more fully. It combines a contrastive alignment module and a training-time camera auxiliary branch with an off-the-shelf cognitive adapter and a standard coordinate attention module, raising accuracy while adding only 3.48% more inference parameters.

Contribution

The paper introduces Fusion4CA, a set of plug-and-play components built on the BEVFusion framework that counters over-reliance on the LiDAR branch by exploiting RGB data more fully than existing BEV fusion methods.

Findings

01 Achieves 69.7% mAP on nuScenes with only 6 training epochs

02 Improves on the baseline by 1.2% mAP

03 Adds only 3.48% more inference parameters

Abstract

Nowadays, an increasing number of works fuse LiDAR and RGB data in the bird's-eye view (BEV) space for 3D object detection in autonomous driving systems. However, existing methods suffer from over-reliance on the LiDAR branch, with insufficient exploration of RGB information. To tackle this issue, we propose Fusion4CA, which is built upon the classic BEVFusion framework and dedicated to fully exploiting visual input with plug-and-play components. Specifically, a contrastive alignment module is designed to calibrate image features with 3D geometry, and a camera auxiliary branch is introduced to mine RGB information sufficiently during training. For further performance enhancement, we leverage an off-the-shelf cognitive adapter to make the most of pretrained image weights, and integrate a standard coordinate attention module into the fusion stage as a supplementary boost. Experiments on…
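The abstract does not give the form of the contrastive alignment objective, so the following is only a minimal sketch of the general idea, assuming a generic InfoNCE-style loss over paired image and LiDAR feature vectors. The function names, the pairing of features by index, and the temperature value are all illustrative assumptions, not the authors' implementation.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def info_nce(img_feats, lidar_feats, temperature=0.07):
    """Mean InfoNCE loss; img_feats[i] and lidar_feats[i] form the positive pair.

    Hypothetical stand-in for a contrastive alignment loss: each image feature
    is pulled toward its paired LiDAR feature and pushed away from the others.
    """
    img = [l2_normalize(v) for v in img_feats]
    pts = [l2_normalize(v) for v in lidar_feats]
    total = 0.0
    for i, query in enumerate(img):
        # Cosine similarity to every LiDAR feature, scaled by the temperature.
        logits = [sum(a * b for a, b in zip(query, key)) / temperature for key in pts]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the paired index as the target class.
        total += log_denom - logits[i]
    return total / len(img)

# Toy check: correctly paired features score a much lower loss than swapped ones.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

In a real BEV fusion pipeline the features would be tensors sampled at corresponding spatial locations, and the loss would typically be computed with a deep learning framework rather than plain Python lists.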

Taxonomy

Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques