Suppress-and-Refine Framework for End-to-End 3D Object Detection
Zili Liu, Guodong Xu, Honghui Yang, Minghao Chen, Kuoliang Wu, Zheng Yang, Haifeng Liu, Deng Cai

TL;DR
This paper introduces SRDet, a fully end-to-end 3D object detection framework that eliminates handcrafted components, utilizes feature points and proposals efficiently, and achieves state-of-the-art accuracy with real-time speed.
Contribution
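The paper proposes SRDet, presented as the first fully end-to-end 3D detector. It improves detection accuracy and speed by replacing the handcrafted components used to eliminate redundant boxes with a suppress-and-refine design that operates directly on feature points and 3D proposals.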
Findings
Achieves state-of-the-art performance on ScanNetV2 and SUN RGB-D datasets.
Runs in real time and is reported as the fastest among 3D detectors.
Produces high-quality predictions with low computational cost.
Abstract
3D object detectors based on Hough voting have achieved great success and inspired many follow-up works. Despite steadily improving detection accuracy, these works rely on handcrafted components to eliminate redundant boxes, and are therefore non-end-to-end and time-consuming. In this work, we propose a suppress-and-refine framework to remove these handcrafted components. To fully utilize full-resolution information and achieve real-time speed, it directly consumes feature points and redundant 3D proposals. Specifically, it first suppresses noisy 3D feature points and then feeds them to 3D proposals for the subsequent RoI-aware refinement. With a gating mechanism to build fine proposal features and a self-attention mechanism to model relationships, our method can produce high-quality predictions with a small computation budget in an end-to-end manner. To this end, we present the…
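The suppress-and-refine flow described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the top-k suppression by a per-point score, the axis-aligned boxes, the sigmoid gate with a single weight vector `w_gate`, and the identity-projection self-attention are all assumptions chosen to keep the example short, and every function name here is hypothetical.

```python
import numpy as np

def suppress_points(points, feats, scores, keep):
    """Keep the `keep` highest-scoring feature points (assumed scoring rule)."""
    idx = np.argsort(-scores)[:keep]
    return points[idx], feats[idx]

def points_in_boxes(points, centers, sizes):
    """Boolean mask (num_boxes, num_points): point lies inside an axis-aligned box."""
    diff = np.abs(points[None, :, :] - centers[:, None, :])
    return np.all(diff <= sizes[:, None, :] / 2.0, axis=-1)

def gated_roi_pool(feats, mask, w_gate):
    """Gated average of the point features that fall inside each proposal."""
    gate = 1.0 / (1.0 + np.exp(-(feats @ w_gate)))   # (num_points,)
    weights = mask * gate[None, :]                    # (num_boxes, num_points)
    denom = weights.sum(axis=1, keepdims=True) + 1e-8 # avoid division by zero
    return (weights @ feats) / denom

def self_attention(x):
    """Single-head scaled dot-product self-attention with identity projections."""
    logits = x @ x.T / np.sqrt(x.shape[1])
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=1, keepdims=True)
    return attn @ x, attn

# Synthetic inputs, for illustration only.
rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(256, 3))   # feature point coordinates
feats = rng.normal(size=(256, 16))               # per-point features
scores = rng.uniform(size=256)                   # assumed per-point quality score
centers = rng.uniform(-0.5, 0.5, size=(8, 3))    # redundant 3D proposals
sizes = np.full((8, 3), 1.0)
w_gate = rng.normal(size=16)                     # hypothetical gate weights

kept_pts, kept_feats = suppress_points(points, feats, scores, keep=128)
mask = points_in_boxes(kept_pts, centers, sizes)
proposal_feats = gated_roi_pool(kept_feats, mask, w_gate)
refined, attn = self_attention(proposal_feats)
```

In a trained detector the scores, gate, and attention projections would be learned, and the refined proposal features would feed box and class prediction heads. The sketch only shows how the data flows from suppressed points to proposal-level features.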
Taxonomy
Topics: Advanced Neural Network Applications · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques
