QD-BEV: Quantization-aware View-guided Distillation for Multi-view 3D Object Detection
Yifan Zhang, Zhen Dong, Huanrui Yang, Ming Lu, Cheng-Ching Tseng, Yuan Du, Kurt Keutzer, Li Du, Shanghang Zhang

TL;DR
This paper introduces QD-BEV, a quantization-aware view-guided distillation method for multi-view 3D object detection that stabilizes training and maintains high accuracy with significantly compressed models.
Contribution
The paper proposes a novel view-guided distillation objective to improve quantization-aware training for BEV-based 3D detection models, enabling efficient deployment.
Findings
Achieves comparable or better accuracy than prior methods with high model compression.
On nuScenes, QD-BEV-Tiny with 4-bit weights and 6-bit activations reaches 37.2% NDS, outperforming BEVFormer-Tiny.
QD-BEV models of different sizes retain accuracy while gaining substantially in efficiency.
Abstract
Multi-view 3D detection based on BEV (bird's-eye-view) has recently achieved significant improvements. However, the huge memory consumption of state-of-the-art models makes them hard to deploy on vehicles, and their non-trivial latency affects the real-time perception of streaming applications. Although quantization is widely used to lighten models, we show in our paper that directly applying quantization to BEV tasks will 1) make the training unstable, and 2) lead to intolerable performance degradation. To solve these issues, our method QD-BEV introduces a novel view-guided distillation (VGD) objective, which stabilizes quantization-aware training (QAT) while enhancing model performance by leveraging both image features and BEV features. Our experiments show that QD-BEV achieves similar or even better accuracy than previous methods with significant efficiency gains.…
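The two ingredients named in the abstract, low-bit quantization during training and a distillation objective over both image and BEV features, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The symmetric uniform quantizer, the mean-squared-error feature loss, and every function and parameter name here (`fake_quantize`, `view_guided_distill_loss`, `img_weight`, `bev_weight`) are assumptions chosen for illustration; the loss formulation and the way the paper combines the two feature views may differ.

```python
import numpy as np


def fake_quantize(x, num_bits):
    """Symmetric uniform fake quantization (illustrative assumption).

    Values are rounded onto a signed num_bits integer grid and then
    mapped back to floats, so the tensor keeps its dtype but only
    takes values representable at the target precision.
    """
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 7 for 4 bits
    max_abs = np.max(np.abs(x))
    if max_abs == 0:
        return x.copy()                     # nothing to quantize
    scale = max_abs / qmax                  # step size of the grid
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale


def feature_distill_loss(student_feat, teacher_feat):
    """Mean squared error between two feature maps.

    MSE is a stand-in here, not the loss defined in the paper.
    """
    return float(np.mean((student_feat - teacher_feat) ** 2))


def view_guided_distill_loss(s_img, t_img, s_bev, t_bev,
                             img_weight=1.0, bev_weight=1.0):
    """Weighted sum of an image-feature term and a BEV-feature term.

    The weights are hypothetical hyperparameters for this sketch.
    """
    img_term = feature_distill_loss(s_img, t_img)
    bev_term = feature_distill_loss(s_bev, t_bev)
    return img_weight * img_term + bev_weight * bev_term


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Hypothetical full-precision teacher features.
    t_img = rng.normal(size=(6, 16, 16))    # 6 camera views
    t_bev = rng.normal(size=(32, 32))       # one BEV grid

    # Student features: the same tensors at 6-bit activation precision.
    s_img = fake_quantize(t_img, num_bits=6)
    s_bev = fake_quantize(t_bev, num_bits=6)

    loss = view_guided_distill_loss(s_img, t_img, s_bev, t_bev)
    print("distillation loss:", loss)
```

In an actual QAT setup the quantizer would sit inside the student's forward pass and gradients would flow through the rounding step via a straight-through estimator; the sketch omits training entirely and only shows how the quantization error surfaces as a distillation signal on both feature views.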
Taxonomy
Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Visual Attention and Saliency Detection
Methods: Balanced Selection
