Depth Estimation via Affinity Learned with Convolutional Spatial Propagation Network
Xinjing Cheng, Peng Wang, and Ruigang Yang

TL;DR
This paper introduces a convolutional spatial propagation network (CSPN) that learns pixel affinities to refine and densify depth maps estimated from a single image, reporting a 30% reduction in depth error and a 2 to 5 times speedup over prior state-of-the-art methods.
Contribution
The paper proposes CSPN, which learns the affinity among neighboring pixels with a deep CNN and propagates depth through a recurrent convolutional operation. It is applied in two ways: to refine the depth output of existing methods, and to convert sparse depth samples into a dense depth map.
Findings
30% reduction in depth error on benchmarks
2 to 5 times faster than prior SOTA methods
Effective refinement and densification of depth maps
Abstract
Depth estimation from a single image is a fundamental problem in computer vision. In this paper, we propose a simple yet effective convolutional spatial propagation network (CSPN) to learn the affinity matrix for depth prediction. Specifically, we adopt an efficient linear propagation model, where the propagation is performed as a recurrent convolutional operation, and the affinity among neighboring pixels is learned through a deep convolutional neural network (CNN). We apply the designed CSPN to two depth estimation tasks given a single image: (1) to refine the depth output from existing state-of-the-art (SOTA) methods; and (2) to convert sparse depth samples to a dense depth map by embedding the depth samples within the propagation procedure. The second task is inspired by the availability of LIDARs that provide sparse but accurate depth measurements. We experimented the…
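
The propagation described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation. The function names (normalize_affinity, shift, propagate), the 3x3 neighborhood, the absolute-sum normalization, the edge padding, and the use of the current estimate for the center term are all assumptions made for illustration. In the paper the affinities come from a CNN; here any array of shape (8, H, W) stands in for that output.

```python
import numpy as np

# Offsets (dy, dx) of the 8 neighbors in an assumed 3x3 neighborhood.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def normalize_affinity(raw):
    """Turn raw affinities of shape (8, H, W) into propagation weights.

    Assumption: neighbor weights are divided by the sum of their absolute
    values, and the center weight is chosen so all weights sum to one.
    """
    denom = np.abs(raw).sum(axis=0, keepdims=True) + 1e-8
    neighbors = raw / denom
    center = 1.0 - neighbors.sum(axis=0)
    return neighbors, center


def shift(x, dy, dx):
    """Return y with y[i, j] = x[i + dy, j + dx], replicating border values."""
    h, w = x.shape
    padded = np.pad(x, 1, mode="edge")  # assumed border handling
    return padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]


def propagate(depth0, raw_affinity, steps=8, sparse=None, mask=None):
    """Run a linear, recurrent propagation over a depth map.

    depth0:       (H, W) initial depth, e.g. from an existing estimator.
    raw_affinity: (8, H, W) stand-in for CNN-predicted affinities.
    sparse, mask: optional (H, W) sparse depth values and a 0/1 validity mask;
                  when given, valid samples are re-embedded after every step.
    """
    neighbors, center = normalize_affinity(raw_affinity)
    depth = depth0.astype(float)
    for _ in range(steps):
        updated = center * depth
        for k, (dy, dx) in enumerate(OFFSETS):
            updated = updated + neighbors[k] * shift(depth, dy, dx)
        if sparse is not None and mask is not None:
            # Keep the measured depth wherever a sparse sample exists.
            updated = (1.0 - mask) * updated + mask * sparse
        depth = updated
    return depth
```

Because the weights at each pixel sum to one, a constant depth map is left unchanged, and pixels carrying a sparse sample keep their measured value throughout propagation.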
Taxonomy
Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Advanced Image Processing Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
