SPADE: Sparsity Adaptive Depth Estimator for Zero-Shot, Real-Time, Monocular Depth Estimation in Underwater Environments

Hongjie Zhang; Gideon Billings; Stefan B. Williams

arXiv:2510.25463·cs.CV·October 30, 2025

TL;DR

SPADE is a monocular depth estimation method for underwater vehicles that combines a pre-trained relative depth estimator with sparse depth priors and a transformer-based refinement stage, producing dense, metric scale depth maps in real time for autonomous underwater inspection.

Contribution

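The paper introduces a sparsity adaptive, two-stage depth estimation pipeline: the first stage scales a relative depth map with sparse depth points, and the second refines the metric prediction with Cascade Conv-Deformable Transformer blocks, improving accuracy and efficiency over existing methods.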

Findings

01. Achieves over 15 FPS on embedded hardware.

02. Outperforms state-of-the-art baselines in accuracy and generalization.

03. Provides dense, metric scale depth maps for underwater environments.

Abstract

Underwater infrastructure requires frequent inspection and maintenance due to harsh marine conditions. Current reliance on human divers or remotely operated vehicles is limited by perceptual and operational challenges, especially around complex structures or in turbid water. Enhancing the spatial awareness of underwater vehicles is key to reducing piloting risks and enabling greater autonomy. To address these challenges, we present SPADE: SParsity Adaptive Depth Estimator, a monocular depth estimation pipeline that combines a pre-trained relative depth estimator with sparse depth priors to produce dense, metric scale depth maps. Our two-stage approach first scales the relative depth map with the sparse depth points, then refines the final metric prediction with our proposed Cascade Conv-Deformable Transformer blocks. Our approach achieves improved accuracy and generalisation over…
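
As an illustration of the first stage described in the abstract, the sketch below scales a relative depth map using sparse metric depth points. It is not the authors' implementation: it assumes a single global scale and shift fitted by least squares, and it assumes the relative values are in depth space rather than inverse depth space, neither of which is stated in this excerpt. The function names `fit_scale_shift` and `scale_relative_depth` and the `(row, col, depth)` layout of the sparse points are hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_scale_shift(rel: Sequence[float], metric: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of metric ~= s * rel + t over matched sparse points.

    Assumption: one global scale s and shift t are enough for the whole image.
    """
    n = len(rel)
    if n < 2 or n != len(metric):
        raise ValueError("need at least two matched sparse points")
    mean_r = sum(rel) / n
    mean_m = sum(metric) / n
    var_r = sum((r - mean_r) ** 2 for r in rel)
    if var_r == 0.0:
        raise ValueError("relative depths at the sparse points are all equal")
    cov = sum((r - mean_r) * (m - mean_m) for r, m in zip(rel, metric))
    s = cov / var_r
    t = mean_m - s * mean_r
    return s, t


def scale_relative_depth(
    rel_map: List[List[float]],
    sparse: Sequence[Tuple[int, int, float]],
) -> List[List[float]]:
    """Turn a relative depth map into a coarse metric map.

    sparse holds hypothetical (row, col, metric_depth) samples, for example
    triangulated features or range sensor returns.
    """
    rel_vals = [rel_map[r][c] for r, c, _ in sparse]
    metric_vals = [d for _, _, d in sparse]
    s, t = fit_scale_shift(rel_vals, metric_vals)
    # Apply the same affine correction to every pixel.
    return [[s * v + t for v in row] for row in rel_map]


# Usage: two sparse points pin down the scale and shift of a 2x2 relative map.
rel_map = [[0.0, 0.5], [1.0, 0.25]]
sparse = [(0, 0, 2.0), (1, 0, 6.0)]
dense = scale_relative_depth(rel_map, sparse)
```

A global fit like this leaves spatially varying errors in place, which is the kind of residual a learned refinement stage would be expected to correct.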
