RayD3D: Distilling Depth Knowledge Along the Ray for Robust Multi-View 3D Object Detection
Rui Ding, Zhaonian Kuang, Zongwei Zhou, Meng Yang, Xinhu Zheng, Gang Hua

TL;DR
RayD3D introduces a novel ray-based distillation approach that enhances depth knowledge transfer for multi-view 3D detection, significantly improving robustness across various models and data conditions without extra inference costs.
Contribution
The paper proposes a ray-based distillation method that transfers depth knowledge from LiDAR to camera models while reducing the transfer of depth-irrelevant information (e.g. LiDAR density), improving robustness in multi-view 3D detection.
Findings
Significant robustness improvements across three BEV-based models.
Effective depth knowledge transfer along the projection ray.
No additional inference costs introduced.
Abstract
Multi-view 3D detection with bird's eye view (BEV) is crucial for autonomous driving and robotics, but its robustness in real-world settings is limited because it struggles to predict accurate depth values. A mainstream solution, cross-modal distillation, transfers depth information from LiDAR to camera models but also unintentionally transfers depth-irrelevant information (e.g. LiDAR density). To mitigate this issue, we propose RayD3D, which transfers crucial depth knowledge along the ray: the line projecting from the camera to the true location of an object. It is based on the fundamental imaging principle that the predicted location of an object can only vary along this ray, with its final position determined by the predicted depth value. Therefore, distilling along the ray enables more effective depth information transfer. More specifically, we design two ray-based distillation modules. Ray-based Contrastive…
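The imaging principle in the abstract can be illustrated with a small geometric sketch. This is not the paper's implementation: it assumes an ideal pinhole camera, and the intrinsics (FX, FY, CX, CY), the pixel, and the helper names backproject and project are hypothetical values chosen only for illustration. It shows that changing the depth guess for a fixed pixel moves the 3D point along a single ray through the camera center.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) to a 3D point in the camera frame at the given depth."""
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


def project(point, fx, fy, cx, cy):
    """Project a 3D camera-frame point back onto the image plane."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


# Hypothetical pinhole intrinsics (assumption, not taken from the paper).
FX, FY, CX, CY = 1000.0, 1000.0, 800.0, 450.0

# A fixed pixel where an object is observed, and three candidate depths in meters.
pixel = (1000.0, 500.0)
depths = (10.0, 20.0, 40.0)

# Each depth guess yields a different 3D location for the same pixel.
candidates = [backproject(pixel[0], pixel[1], d, FX, FY, CX, CY) for d in depths]

for d, p in zip(depths, candidates):
    # Every candidate reprojects to the same pixel: only depth separates them.
    print(d, p, project(p, FX, FY, CX, CY))
```

Because every candidate lies on the same ray, an error in predicted depth shifts the predicted object location only along that ray. This is the motivation the abstract gives for restricting distillation to the ray.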
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Robotics and Sensor-Based Localization
