Next-Best-View Prediction for Active Stereo Cameras and Highly Reflective Objects
Jun Yang, Steven L. Waslander

TL;DR
This paper introduces a next-best-view framework for active stereo cameras that strategically selects viewpoints to complete depth data on highly reflective objects, where specular reflections degrade depth acquisition and slow acquisition speed limits how many views can be collected.
Contribution
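The work explicitly models specular reflection using the Phong reflection model and a photometric response function. Pose predictions from an RGB-based estimator, combined with the object CAD model, yield surface normal and depth hypotheses that guide viewpoint selection for depth completion.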
Findings
Outperforms baseline methods in depth completion accuracy
Enhances object pose estimation performance
Effective on real-world reflective object datasets
Abstract
Depth acquisition with an active stereo camera is a challenging task for highly reflective objects. When the setup permits, multi-view fusion can provide increased levels of depth completion. However, due to the slow acquisition speed of high-end active stereo cameras, collecting a large number of viewpoints for a single scene is generally not practical. In this work, we propose a next-best-view framework to strategically select camera viewpoints for completing depth data on reflective objects. In particular, we explicitly model the specular reflection of reflective surfaces based on the Phong reflection model and a photometric response function. Given the object CAD model and grayscale image, we employ an RGB-based pose estimator to obtain current pose predictions from the existing data, which are used to form predicted surface normal and depth hypotheses, and allow us to then assess the…
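The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the projector is co-located with the camera, uses a linear response with clipping as a stand-in for the photometric response function, and scores each candidate view by how many currently missing surface points it is predicted to image without saturation. All function names, coefficients (`k_d`, `k_s`, `shininess`, `exposure`, `floor`), and the scoring rule are hypothetical choices made for illustration.

```python
import numpy as np


def normalize(v):
    """Scale vectors along the last axis to unit length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v, axis=-1, keepdims=True)


def phong_radiance(points, normals, cam_pos, k_d=0.2, k_s=0.8, shininess=50.0):
    """Phong diffuse + specular term per surface point (ambient term omitted).

    Assumption: the light source (projector) sits at the camera position.
    """
    points = np.asarray(points, dtype=float)
    n = normalize(normals)
    l = normalize(np.asarray(cam_pos, dtype=float) - points)  # surface -> light
    v = l                                                     # surface -> camera
    n_dot_l = np.sum(n * l, axis=-1)
    r = 2.0 * n_dot_l[..., None] * n - l                      # mirror of l about n
    diffuse = k_d * np.clip(n_dot_l, 0.0, None)
    specular = k_s * np.clip(np.sum(r * v, axis=-1), 0.0, None) ** shininess
    return np.where(n_dot_l > 0.0, diffuse + specular, 0.0)


def camera_response(radiance, exposure=1.5):
    """Hypothetical photometric response: linear gain, clipped to [0, 1]."""
    return np.clip(exposure * radiance, 0.0, 1.0)


def predicted_valid(points, normals, cam_pos, floor=0.05):
    """True where the pixel is predicted neither saturated nor too dark."""
    pixel = camera_response(phong_radiance(points, normals, cam_pos))
    return (pixel > floor) & (pixel < 1.0)


def pick_next_view(points, normals, missing, candidates):
    """Index of the candidate that recovers the most currently missing points."""
    missing = np.asarray(missing, dtype=bool)
    scores = [np.sum(predicted_valid(points, normals, c) & missing)
              for c in candidates]
    return int(np.argmax(scores))


# Toy scene: one upward-facing surface point whose depth is currently missing.
points = np.array([[0.0, 0.0, 0.0]])
normals = np.array([[0.0, 0.0, 1.0]])
missing = np.array([True])
candidates = [np.array([0.0, 0.0, 1.0]),   # head-on view: specular highlight
              np.array([1.0, 0.0, 1.0])]   # 45-degree oblique view
best = pick_next_view(points, normals, missing, candidates)
```

In this toy scene the head-on candidate places the specular lobe directly on the camera, so the clipped response saturates, while the oblique candidate sees only the diffuse term and is selected. In the setting described in the abstract, the normals and depths would come from the CAD model posed by the RGB-based estimator rather than being given directly.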
Taxonomy
Topics: Advanced Vision and Imaging · Optical measurement and interference techniques · Image Processing Techniques and Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
