LiteFusion: Taming 3D Object Detectors from Vision-Based to Multi-Modal with Minimal Adaptation

Xiangxuan Ren; Zhongdao Wang; Pin Tang; Guoqing Wang; Jilai Zheng; Chao Ma

arXiv:2512.20217·cs.CV·December 24, 2025

TL;DR

LiteFusion is a novel multi-modal 3D object detector that enhances camera-based detection with minimal adaptation by integrating LiDAR data as geometric information, improving accuracy and robustness without complex architectures.

Contribution

The paper introduces a simple, deployment-friendly fusion approach that needs no 3D backbone: LiDAR is treated as a complementary geometric source and combined with the camera detector within a quaternion space.
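For readers unfamiliar with the term, the basic operation of a quaternion space is the Hamilton product. The sketch below is general background only, not the paper's fusion module: the page does not describe how LiteFusion uses quaternions, and the names `Quaternion` and `hamilton_product` are illustrative choices.

```python
from typing import Tuple

# A quaternion w + x*i + y*j + z*k stored as (w, x, y, z).
# Illustrative type alias; not taken from the paper.
Quaternion = Tuple[float, float, float, float]


def hamilton_product(p: Quaternion, q: Quaternion) -> Quaternion:
    """Multiply two quaternions using the standard Hamilton product.

    The product mixes all four components of both inputs and is not
    commutative: p * q generally differs from q * p.
    """
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,  # real part
        pw * qx + px * qw + py * qz - pz * qy,  # i component
        pw * qy - px * qz + py * qw + pz * qx,  # j component
        pw * qz + px * qy - py * qx + pz * qw,  # k component
    )


# Basis elements, used to check the defining identities.
I: Quaternion = (0, 1, 0, 0)
J: Quaternion = (0, 0, 1, 0)
K: Quaternion = (0, 0, 0, 1)

ij = hamilton_product(I, J)  # equals K
ji = hamilton_product(J, I)  # equals -K, showing non-commutativity
ii = hamilton_product(I, I)  # equals -1 (real part only)
```

One real and three imaginary components give a four-channel representation whose parts interact through a fixed algebraic rule. That is the general reason quaternion layers are used to tie related feature channels together; how LiteFusion does this is not stated here.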

Findings

01 Improves the baseline vision-based detector by +20.4% mAP and +19.7% NDS on nuScenes.

02 Maintains strong performance even when LiDAR input is absent, demonstrating robustness.

03 Adds only 1.1% more parameters, keeping the model efficient.

Abstract

3D object detection is fundamental for safe and robust intelligent transportation systems. Current multi-modal 3D object detectors often rely on complex architectures and training strategies to achieve higher detection accuracy. However, these methods depend heavily on the LiDAR sensor and therefore suffer large performance drops when LiDAR is absent, which compromises the robustness and safety of autonomous systems in practical scenarios. Moreover, existing multi-modal detectors are difficult to deploy on diverse hardware platforms, such as NPUs and FPGAs, because they rely on 3D sparse convolution operators, which are primarily optimized for NVIDIA GPUs. To address these challenges, we reconsider the role of LiDAR in the camera-LiDAR fusion paradigm and introduce a novel multi-modal 3D detector, LiteFusion. Instead of treating LiDAR point clouds as an independent…

Taxonomy

Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Autonomous Vehicle Technology and Safety