Amodal Depth Anything: Amodal Depth Estimation in the Wild

Zhenyu Li; Mykola Lavreniuk; Jian Shi; Shariq Farooq Bhat; Peter Wonka

arXiv:2412.02336 · cs.CV · December 4, 2024

TL;DR

This paper introduces an approach to amodal depth estimation in natural scenes that uses a large-scale dataset and two new models to predict the depth of occluded object parts with better generalization and accuracy.

Contribution

The paper presents a new formulation based on relative depth prediction, a large-scale dataset (ADIW), and two models, Amodal-DAV2 and Amodal-DepthFM, for amodal depth estimation.

Findings

1. Achieved a 69.5% improvement over the previous state of the art.

2. Developed a scalable dataset-creation pipeline built on pre-trained models.

3. Demonstrated that the models generate plausible depth structures for occluded regions.
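As a rough illustration of the dataset-creation idea, the abstract describes building ADIW from segmentation datasets with compositing techniques. The sketch below shows one generic way compositing can yield occlusion labels: pasting an occluder over an object whose full mask is known gives both the visible and the hidden part of that object. The function name `composite_occluder`, its parameters, and the toy arrays are hypothetical, and this is not presented as the authors' pipeline.

```python
import numpy as np

def composite_occluder(background, object_mask, occluder, occluder_mask):
    """Paste an occluder over a background image (illustrative sketch only).

    background, occluder: (H, W, 3) images of the same shape.
    object_mask: (H, W) full (amodal) mask of an object in the background.
    occluder_mask: (H, W) mask of the pixels covered by the occluder.
    Returns the composite image plus the visible and occluded parts of the object.
    """
    occ = occluder_mask.astype(bool)
    amodal = object_mask.astype(bool)
    # Occluder pixels replace background pixels wherever the occluder mask is set.
    composite = np.where(occ[..., None], occluder, background)
    visible = amodal & ~occ    # object pixels still seen after compositing
    occluded = amodal & occ    # object pixels hidden by the occluder
    return composite, visible, occluded

# Hypothetical toy data: a 4x4 scene with a horizontal object and a vertical occluder.
background = np.zeros((4, 4, 3), dtype=np.uint8)
occluder = np.full((4, 4, 3), 255, dtype=np.uint8)
object_mask = np.zeros((4, 4), dtype=bool)
object_mask[1:3, :] = True       # object spans rows 1-2
occluder_mask = np.zeros((4, 4), dtype=bool)
occluder_mask[:, 2:] = True      # occluder covers columns 2-3

composite, visible, occluded = composite_occluder(
    background, object_mask, occluder, occluder_mask
)
print(int(visible.sum()), int(occluded.sum()))
```

Because the object's full mask is known before the occluder is pasted, the occluded region comes for free, which is what makes this kind of labeling scalable.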

Abstract

Amodal depth estimation aims to predict the depth of occluded (invisible) parts of objects in a scene. This task addresses the question of whether models can effectively perceive the geometry of occluded regions based on visible cues. Prior methods primarily rely on synthetic datasets and focus on metric depth estimation, limiting their generalization to real-world settings due to domain shifts and scalability challenges. In this paper, we propose a novel formulation of amodal depth estimation in the wild, focusing on relative depth prediction to improve model generalization across diverse natural images. We introduce a new large-scale dataset, Amodal Depth In the Wild (ADIW), created using a scalable pipeline that leverages segmentation datasets and compositing techniques. Depth maps are generated using large pre-trained depth models, and a scale-and-shift alignment strategy is…
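The abstract mentions a scale-and-shift alignment strategy, but the excerpt is cut off before it explains how the alignment is applied. The sketch below shows only the generic technique, under the assumption that it means a least-squares fit: relative depth is defined up to an unknown scale s and shift t, so one prediction can be matched to a reference by solving for the s and t that minimize the squared error over a set of valid pixels. The function name `align_scale_shift` and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def align_scale_shift(pred, ref, mask):
    """Fit s, t minimizing sum over mask of (s * pred + t - ref)^2.

    pred, ref: (H, W) depth maps; mask: (H, W) boolean array of pixels to fit on.
    Generic least-squares sketch, not necessarily the authors' exact procedure.
    """
    x = pred[mask].astype(np.float64)
    y = ref[mask].astype(np.float64)
    if x.size < 2:
        raise ValueError("need at least two valid pixels to fit scale and shift")
    # Design matrix with columns [pred, 1], so the solution vector is (s, t).
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(s), float(t)

# Hypothetical toy data: the reference is an exact affine function of the prediction.
rng = np.random.default_rng(0)
ref = rng.uniform(1.0, 5.0, size=(4, 6))
pred = (ref - 0.5) / 2.0          # so ref = 2.0 * pred + 0.5
mask = np.ones_like(ref, dtype=bool)
mask[:, :2] = False               # pretend the first two columns are unreliable

s, t = align_scale_shift(pred, ref, mask)
aligned = s * pred + t            # apply the fitted transform to the whole map
print(round(s, 6), round(t, 6))
```

Fitting on a mask and then applying the transform to the full map is the usual pattern: the mask marks pixels where both maps are trusted, and the recovered s and t carry over to the remaining pixels.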

Taxonomy

Topics: Image Processing and 3D Reconstruction

Methods: Focus