Real-Time 3D Object Detection with Inference-Aligned Learning
Chenyu Zhao, Xianwei Zheng, Zimin Xia, Linwei Yue, Nan Xue

TL;DR
This paper introduces SR3D, a novel framework for real-time 3D object detection in indoor point clouds that aligns training with inference behavior, significantly improving accuracy.
Contribution
SR3D proposes a spatial-prioritized assignment and rank-aware self-distillation to bridge the training-inference gap in 3D detection models.
Findings
Outperforms prior methods in accuracy on ScanNet V2 and SUN RGB-D datasets.
Maintains real-time detection speed.
Effectively bridges the training-inference gap.
Abstract
Real-time 3D object detection from point clouds is essential for dynamic scene understanding in applications such as augmented reality, robotics and navigation. We introduce a novel Spatial-prioritized and Rank-aware 3D object detection (SR3D) framework for indoor point clouds, to bridge the gap between how detectors are trained and how they are evaluated. This gap stems from the lack of spatial reliability and ranking awareness during training, which conflicts with the ranking-based prediction selection used at inference. Such a training-inference gap hampers the model's ability to learn representations aligned with inference-time behavior. To address this limitation, SR3D consists of two components tailored to the spatial nature of point clouds during training: a novel spatial-prioritized optimal transport assignment that dynamically emphasizes well-located and spatially reliable…
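The training-inference gap described above can be illustrated with a small sketch. It assumes that "ranking-based prediction selection" means keeping the top-k predictions by confidence score; the function name, the dictionary fields, and all numbers are hypothetical and are not taken from the paper. The point is only that a detector trained without ranking awareness may give its best-localized box a lower score than a poorly localized one, so the better box is discarded at inference.

```python
def select_top_k(predictions, k):
    """Keep the k predictions with the highest confidence score."""
    return sorted(predictions, key=lambda p: p["score"], reverse=True)[:k]


# Hypothetical predictions for one object. "iou" is the overlap with the
# ground-truth box: available during training, unknown at inference time.
predictions = [
    {"name": "a", "score": 0.90, "iou": 0.30},  # confident, poorly localized
    {"name": "b", "score": 0.60, "iou": 0.85},  # less confident, well localized
    {"name": "c", "score": 0.40, "iou": 0.55},
]

# Inference only sees the scores, so it ranks by score alone.
kept_names = [p["name"] for p in select_top_k(predictions, k=1)]

# The box that training would ideally want to survive the ranking.
best_localized = max(predictions, key=lambda p: p["iou"])["name"]

print("kept by ranking:", kept_names)
print("best localized:", best_localized)
```

In this toy case the ranking keeps prediction "a" while "b" has the highest overlap, which is the kind of mismatch that aligning training with inference is meant to reduce.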
Taxonomy
Topics: 3D Shape Modeling and Analysis · Advanced Neural Network Applications · Robotics and Sensor-Based Localization
