Believing is Seeing: Unobserved Object Detection using Generative Models

Subhransu S. Bhattacharjee; Dylan Campbell; Rahul Shome

arXiv:2410.05869·cs.CV·March 25, 2025

TL;DR

This paper explores the novel task of detecting objects that are nearby but not visible in images, using adapted generative models to infer their presence and location, with promising results on indoor scene datasets.

Contribution

It introduces unobserved object detection as a new task and adapts state-of-the-art generative models to address it, providing a benchmark and evaluation metrics.

Findings

01

Generative models can infer unobserved objects in images.

02

Proposed metrics effectively evaluate unobserved object detection.

03

Empirical results show potential for generative models in this task.

Abstract

Can objects that are not visible in an image -- but are in the vicinity of the camera -- be detected? This study introduces the novel tasks of 2D, 2.5D and 3D unobserved object detection for predicting the location of nearby objects that are occluded or lie outside the image frame. We adapt several state-of-the-art pre-trained generative models to address this task, including 2D and 3D diffusion models and vision-language models, and show that they can be used to infer the presence of objects that are not directly observed. To benchmark this task, we propose a suite of metrics that capture different aspects of performance. Our empirical evaluation on indoor scenes from the RealEstate10k and NYU Depth v2 datasets demonstrates results that motivate the use of generative models for the unobserved object detection task.

Code & Models

Repositories

1ssb/UOD
none · Official

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Robotic Path Planning Algorithms

Methods: Diffusion