Unleashing the Potential of Mamba: Boosting a LiDAR 3D Sparse Detector by Using Cross-Model Knowledge Distillation

Rui Yu; Runkai Zhao; Jiagen Li; Qingsong Zhao; HuaiCheng Yan; Meng Wang

arXiv:2409.11018 · cs.CV · March 31, 2026

TL;DR

This paper introduces FASD, a framework for efficient LiDAR 3D object detection that uses cross-model knowledge distillation to enhance Mamba models with Transformer capabilities, improving accuracy and reducing resource use.

Contribution

The work presents a novel architecture for cross-model knowledge distillation that effectively transfers Transformer features to Mamba models for real-time LiDAR detection.

Findings

01. Achieved 4x reduction in resource consumption on Waymo and nuScenes datasets.

02. Improved detection accuracy by 1-2% over baseline models.

03. Demonstrated significant gains in efficiency and accuracy in real-world deployment.

Abstract

A LiDAR 3D object detector that balances accuracy and speed is crucial for real-time perception in autonomous driving. However, many existing LiDAR detection models depend on complex feature transformations, which lead to poor real-time performance and high resource consumption and limit their practical effectiveness. In this work, we propose FASD, a faster LiDAR 3D object detector: a framework that adaptively aligns sparse voxels to enable efficient heterogeneous knowledge distillation. We aim to distill the Transformer's sequence modeling capability into Mamba models, significantly boosting accuracy through knowledge transfer. Specifically, we first design an architecture for cross-model knowledge distillation that imparts the Transformer's global contextual understanding to Mamba. The Transformer-based teacher model employs a…
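The idea of distilling teacher features into a student over aligned sparse voxels can be illustrated with a minimal sketch. This is not the authors' implementation: the dictionary representation of sparse voxel features, the intersection-based alignment, the mean-squared-error objective, and the names `align_sparse_voxels` and `feature_distillation_loss` are all assumptions made for illustration. In practice the teacher would typically be kept fixed while a term like this is added to the student's detection loss.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: active voxel coordinate -> feature vector.
Voxel = Tuple[int, int, int]
Features = Dict[Voxel, List[float]]


def align_sparse_voxels(teacher: Features, student: Features) -> List[Voxel]:
    """Return the voxel coordinates active in both models, in sorted order.

    Assumption: alignment is a plain set intersection of coordinates.
    """
    return sorted(teacher.keys() & student.keys())


def feature_distillation_loss(teacher: Features, student: Features) -> float:
    """Mean squared error between teacher and student features on shared voxels."""
    shared = align_sparse_voxels(teacher, student)
    if not shared:
        return 0.0  # nothing to compare, so no distillation signal

    total, count = 0.0, 0
    for voxel in shared:
        t_feat, s_feat = teacher[voxel], student[voxel]
        if len(t_feat) != len(s_feat):
            raise ValueError(f"feature size mismatch at voxel {voxel}")
        total += sum((t - s) ** 2 for t, s in zip(t_feat, s_feat))
        count += len(t_feat)
    return total / count


# Toy inputs: two voxels are shared, one is unique to each model.
teacher_feats: Features = {
    (0, 0, 0): [1.0, 2.0],
    (1, 0, 0): [0.5, 0.5],
    (2, 1, 0): [3.0, 1.0],
}
student_feats: Features = {
    (0, 0, 0): [1.0, 1.0],
    (1, 0, 0): [0.5, 0.5],
    (5, 5, 0): [9.0, 9.0],
}

shared_voxels = align_sparse_voxels(teacher_feats, student_feats)
loss = feature_distillation_loss(teacher_feats, student_feats)
print(shared_voxels, loss)
```

Voxels that are active in only one model are ignored here, which keeps the sketch simple. A real sparse detector would need a rule for unmatched voxels and for feature maps of different widths.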
