Scale Propagation Network for Generalizable Depth Completion

Haotian Wang, Meng Yang, Xinhu Zheng, and Gang Hua

arXiv:2410.18408 · cs.CV · October 25, 2024

TL;DR

This paper introduces a novel scale propagation normalization (SP-Norm) for depth completion, enabling models to generalize better across unseen scenes by propagating scale information and leveraging a new architecture based on ConvNeXt V2.

Contribution

The authors propose SP-Norm to improve depth completion generalization and develop a new architecture with ConvNeXt V2, achieving superior performance on diverse unseen datasets.

Findings

1. Outperforms state-of-the-art methods in accuracy
2. Faster inference with lower memory usage
3. Effective across various sparse depth map types

Abstract

Depth completion, the task of inferring dense depth maps from sparse measurements, is crucial for robust 3D perception. Although deep-learning-based methods have made tremendous progress on this problem, these models do not generalize well to scenes unobserved in training, a fundamental limitation that has yet to be overcome. A careful analysis of existing deep neural network architectures for depth completion, which are largely borrowed from successful backbones for image analysis tasks, reveals that a key design bottleneck resides in the conventional normalization layers. These layers are designed, on the one hand, to make training more stable and, on the other hand, to build more visual invariance across scene scales. However, in depth completion, the scale is precisely what we want to estimate robustly in order to generalize better to unseen scenes. To…
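The abstract's point about conventional normalization can be illustrated with a small sketch. It is not the paper's SP-Norm, whose definition does not appear in this excerpt; it only shows, under assumed toy values, that standardizing features to zero mean and unit variance maps the same scene layout at two different metric scales to identical outputs, so the scale can no longer be recovered downstream. The function name `standardize` and the sample values are hypothetical.

```python
import math


def standardize(values, eps=0.0):
    """Zero-mean, unit-variance standardization over a list of numbers.

    This mimics the statistic-based step of conventional normalization
    layers. Practical layers add a small eps for numerical stability;
    eps defaults to 0 here so the scale invariance is exact.
    """
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


# Hypothetical depth-like features: the same layout at two metric scales.
near_scene = [1.0, 2.0, 4.0, 5.0]
far_scene = [10.0 * v for v in near_scene]

norm_near = standardize(near_scene)
norm_far = standardize(far_scene)

# After standardization the two scenes are indistinguishable,
# i.e. the factor of 10 between them has been discarded.
scale_is_lost = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
    for a, b in zip(norm_near, norm_far)
)
print(scale_is_lost)
```

This invariance helps image recognition, where absolute intensity scale is a nuisance, but it works against depth completion, where the scale is the quantity to be estimated.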

Code & Models

Repositories

Wang-xjtu/SPNet
PyTorch · Official

Taxonomy

Topics: 3D Shape Modeling and Analysis · Image Processing and 3D Reconstruction · Computer Graphics and Visualization Techniques

Methods: ConvNeXt · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings