UMono: Physical Model Informed Hybrid CNN-Transformer Framework for Underwater Monocular Depth Estimation
Jian Wang, Jing Wang, Shenghui Rong, Bo He

TL;DR
UMono is a novel hybrid CNN-Transformer framework that integrates underwater image formation models to improve monocular depth estimation accuracy and generalization in underwater environments.
Contribution
The paper introduces UMono, an end-to-end learning framework that incorporates characteristics of the underwater image formation model into the network architecture and fuses local and global features for depth estimation.
Findings
UMono outperforms existing methods on quantitative metrics.
UMono produces qualitatively better depth maps than existing methods.
Incorporating physical models enhances underwater depth estimation accuracy.
Abstract
Underwater monocular depth estimation serves as the foundation for tasks such as 3D reconstruction of underwater scenes. However, due to the influence of light and the medium, underwater imaging follows a distinctive process, which makes it difficult to estimate depth accurately from a single image. Existing methods fail to consider the unique characteristics of underwater environments, leading to inadequate estimation results and limited generalization performance. Furthermore, underwater depth estimation requires extracting and fusing both local and global features, which has not been fully explored in existing methods. This paper presents UMono, an end-to-end learning framework for underwater monocular depth estimation that incorporates underwater image formation model characteristics into the network architecture and effectively utilizes both local and…
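The abstract does not spell out which image formation model UMono uses. As a rough illustration of why such a model carries depth information, the sketch below assumes the widely used simplified underwater model, in which the observed intensity is I = J·t + B·(1 − t) with transmission t = exp(−β·d). The function names and the attenuation coefficients are hypothetical, chosen only for this example, and are not taken from the paper.

```python
import math


def transmission(depth_m, beta):
    """Fraction of scene light surviving a water path of depth_m metres.

    beta is an attenuation coefficient in 1/m (assumed, not from the paper).
    """
    return math.exp(-beta * depth_m)


def observed_intensity(scene_radiance, depth_m, beta, backscatter):
    """Simplified formation model: I = J * t + B * (1 - t)."""
    t = transmission(depth_m, beta)
    return scene_radiance * t + backscatter * (1.0 - t)


def depth_from_transmission(t, beta):
    """Invert t = exp(-beta * d) to recover the path length d."""
    if not 0.0 < t <= 1.0:
        raise ValueError("transmission must lie in (0, 1]")
    return -math.log(t) / beta


if __name__ == "__main__":
    # Illustrative per-channel coefficients (1/m); red is attenuated fastest.
    betas = {"red": 0.60, "green": 0.12, "blue": 0.08}
    for channel, beta in betas.items():
        near = observed_intensity(0.8, 1.0, beta, backscatter=0.3)
        far = observed_intensity(0.8, 10.0, beta, backscatter=0.3)
        print(f"{channel}: near={near:.3f} far={far:.3f}")
```

Under this assumed model, distant pixels drift toward the backscatter value and transmission decays monotonically with distance, which is the kind of physical cue a depth network can exploit.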
Taxonomy
Topics: Image Enhancement Techniques · Advanced Vision and Imaging · Optical measurement and interference techniques
