The Implicit Values of A Good Hand Shake: Handheld Multi-Frame Neural Depth Refinement
Ilya Chugunov, Yuxuan Zhang, Zhihao Xia, Xuaner (Cecilia) Zhang, Jiawen Chen, and Felix Heide

TL;DR
This paper presents a method that uses natural hand shake captured during smartphone photography to refine low-resolution LiDAR depth maps into high-fidelity, high-resolution depth estimates without extra hardware.
Contribution
It introduces a test-time optimization approach using a coordinate MLP to combine micro-baseline parallax cues with LiDAR depth for improved depth estimation.
Findings
Achieves high-resolution depth maps from simple smartphone captures.
No additional hardware or user interaction needed beyond pressing a button.
Enhances depth accuracy for close-range tabletop photography.
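As a rough illustration of why millimeter-scale hand shake is most informative at close range, the sketch below applies the standard pinhole-stereo relation, disparity = focal length × baseline / depth. The focal length (1500 px) and baseline (5 mm) are assumed values chosen only for illustration; they are not taken from the paper, and the function name is hypothetical.

```python
def disparity_px(focal_px: float, baseline_m: float, depth_m: float) -> float:
    """Pixel disparity from a lateral camera shift, under a pinhole model."""
    if depth_m <= 0:
        raise ValueError("depth_m must be positive")
    return focal_px * baseline_m / depth_m


# Assumed values, not from the paper: 1500 px focal length, 5 mm of hand shake.
FOCAL_PX = 1500.0
BASELINE_M = 0.005

for depth_m in (0.25, 0.5, 1.0, 5.0):
    shift = disparity_px(FOCAL_PX, BASELINE_M, depth_m)
    print(f"depth {depth_m:4.2f} m -> disparity {shift:5.2f} px")
```

Under these assumptions, an object at 0.5 m shifts by about 15 px between views, while one at 5 m shifts by only about 1.5 px, so the parallax signal falls off quickly with distance.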
Abstract
Modern smartphones can continuously stream multi-megapixel RGB images at 60Hz, synchronized with high-quality 3D pose information and low-resolution LiDAR-driven depth estimates. During a snapshot photograph, the natural unsteadiness of the photographer's hands offers millimeter-scale variation in camera pose, which we can capture along with RGB and depth in a circular buffer. In this work we explore how, from a bundle of these measurements acquired during viewfinding, we can combine dense micro-baseline parallax cues with kilopixel LiDAR depth to distill a high-fidelity depth map. We take a test-time optimization approach and train a coordinate MLP to output photometrically and geometrically consistent depth estimates at the continuous coordinates along the path traced by the photographer's natural hand shake. With no additional hardware, artificial hand motion, or user interaction…
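The circular buffer mentioned in the abstract can be pictured with a minimal sketch like the one below. It assumes a fixed-capacity buffer that keeps only the most recent frames during viewfinding; the `Frame` fields and the `FrameRingBuffer` class are hypothetical names for illustration and do not reflect the authors' capture code.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Frame:
    """One synchronized measurement (field names are illustrative)."""
    timestamp: float
    rgb: Any    # full-resolution color image
    depth: Any  # low-resolution LiDAR depth map
    pose: Any   # camera pose reported by the device


class FrameRingBuffer:
    """Keeps the most recent `capacity` frames, discarding the oldest."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # A deque with maxlen drops its oldest item on overflow.
        self._frames: deque = deque(maxlen=capacity)

    def push(self, frame: Frame) -> None:
        self._frames.append(frame)

    def snapshot(self) -> List[Frame]:
        """Return the buffered bundle, oldest first, e.g. at shutter press."""
        return list(self._frames)

    def __len__(self) -> int:
        return len(self._frames)
```

Calling `snapshot()` when the shutter button is pressed would yield the bundle of RGB, depth, and pose measurements that a later refinement step consumes.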
Taxonomy
Topics: Advanced Vision and Imaging · Optical measurement and interference techniques · Robotics and Sensor-Based Localization
