Stereo Neural Vernier Caliper
Shichao Li, Zechun Liu, Zhiqiang Shen, Kwang-Ting Cheng

TL;DR
This paper introduces an instance-level (object-centric) framework for stereo 3D object detection that refines an initial 3D cuboid guess through local updates. It complements scene-centric models with added flexibility and achieves state-of-the-art results on the KITTI benchmark.
Contribution
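The paper presents an instance-level model for stereo 3D detection that predicts a refined update from an initial 3D cuboid guess. This complements scene-centric methods by enabling a coarse-to-fine multi-resolution system, model-agnostic object location refinement, and stereo 3D tracking-by-detection.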
Findings
Achieves state-of-the-art performance on the KITTI benchmark.
Effective for model-agnostic object location refinement and stereo 3D tracking-by-detection.
Demonstrates the benefits of instance-level modeling over scene-centric approaches.
Abstract
We propose a new object-centric framework for learning-based stereo 3D object detection. Previous studies build scene-centric representations that do not consider the significant variation among outdoor instances and thus lack the flexibility and functionalities that an instance-level model can offer. We build such an instance-level model by formulating and tackling a local update problem, i.e., how to predict a refined update given an initial 3D cuboid guess. We demonstrate how solving this problem can complement scene-centric approaches in (i) building a coarse-to-fine multi-resolution system, (ii) performing model-agnostic object location refinement, and (iii) conducting stereo 3D tracking-by-detection. Extensive experiments demonstrate the effectiveness of our approach, which achieves state-of-the-art performance on the KITTI benchmark. Code and pre-trained models are available at…
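The local update problem described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Cuboid` fields, the `refine` loop, and the toy predictor are hypothetical and are not taken from the paper or its code. In particular, the toy predictor stands in for the learned instance-level model and only updates the cuboid center, which may differ from the update the authors actually predict.

```python
from dataclasses import dataclass, replace
from typing import Callable, Tuple


@dataclass(frozen=True)
class Cuboid:
    """Hypothetical 3D box: center (x, y, z), size, and heading angle."""
    x: float
    y: float
    z: float
    length: float
    width: float
    height: float
    yaw: float


# Assumed update format: an additive offset on the cuboid center.
Update = Tuple[float, float, float]


def refine(initial: Cuboid,
           predict_update: Callable[[Cuboid], Update],
           steps: int = 3) -> Cuboid:
    """Apply a predicted local update to the current guess, several times."""
    box = initial
    for _ in range(steps):
        dx, dy, dz = predict_update(box)
        box = replace(box, x=box.x + dx, y=box.y + dy, z=box.z + dz)
    return box


def make_toy_predictor(target: Update, gain: float = 0.5):
    """Stand-in for a learned model: moves a fixed fraction toward a known target."""
    tx, ty, tz = target

    def predict(box: Cuboid) -> Update:
        return (gain * (tx - box.x), gain * (ty - box.y), gain * (tz - box.z))

    return predict


# A coarse guess (e.g. from a scene-centric detector) refined in three local steps.
coarse = Cuboid(x=1.0, y=1.5, z=20.0, length=4.0, width=1.6, height=1.5, yaw=0.0)
refined = refine(coarse, make_toy_predictor((2.0, 1.5, 24.0)), steps=3)
print(refined)
```

Because `refine` only needs an initial cuboid and a callable, the same loop could in principle sit behind any detector that outputs 3D boxes, which is the sense in which the abstract calls the refinement model-agnostic.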
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
