OPDMulti: Openable Part Detection for Multiple Objects

Xiaohao Sun; Hanxiao Jiang; Manolis Savva; Angel Xuan Chang

arXiv:2303.14087 · cs.CV · March 27, 2023 · 1 citation

TL;DR

This paper generalizes openable part detection from single-object images to scenes with multiple objects, builds a corresponding dataset from real-world scenes, and introduces OPDFormer, a part-aware transformer architecture that outperforms prior work.

Contribution

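The work extends openable part detection, including prediction of the corresponding motion parameters, to multi-object scenes; creates a dataset based on real-world scenes; and proposes OPDFormer, a part-aware transformer architecture that significantly outperforms prior methods.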

Findings

01. OPDFormer significantly outperforms prior methods.
02. Multi-object scenarios remain challenging for current approaches.
03. A new dataset based on real-world scenes was created.

Abstract

Openable part detection is the task of detecting the openable parts of an object in a single-view image, and predicting corresponding motion parameters. Prior work investigated the unrealistic setting where all input images only contain a single openable object. We generalize this task to scenes with multiple objects each potentially possessing openable parts, and create a corresponding dataset based on real-world scenes. We then address this more challenging scenario with OPDFormer: a part-aware transformer architecture. Our experiments show that the OPDFormer architecture significantly outperforms prior work. The more realistic multiple-object scenarios we investigated remain challenging for all methods, indicating opportunities for future work.
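To make the task's output concrete, here is a minimal Python sketch of what one prediction per image might look like: a list of detected parts, each carrying its own motion parameters. This is an illustration only, not the OPDMulti data format or the OPDFormer output. The class name, field names, part labels, and the choice of motion type, axis, and origin as the "motion parameters" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class OpenablePart:
    """One detected openable part. Hypothetical schema, not the paper's format."""
    label: str                                   # assumed category, e.g. "door" or "drawer"
    box_xyxy: Tuple[float, float, float, float]  # 2D bounding box in pixels
    score: float                                 # detection confidence in [0, 1]
    motion_type: str                             # assumed: "rotation" or "translation"
    motion_axis: Vec3                            # assumed: direction of the motion axis
    motion_origin: Vec3                          # assumed: a point on the axis (rotation only)


def normalize(v: Vec3) -> Vec3:
    """Scale a 3D vector to unit length so axes are comparable across parts."""
    norm = (v[0] ** 2 + v[1] ** 2 + v[2] ** 2) ** 0.5
    if norm == 0.0:
        raise ValueError("motion axis must be non-zero")
    return (v[0] / norm, v[1] / norm, v[2] / norm)


def keep_confident(parts: List[OpenablePart], min_score: float = 0.5) -> List[OpenablePart]:
    """Drop low-confidence detections; a multi-object scene may yield many parts."""
    return [p for p in parts if p.score >= min_score]


# A scene with two objects: a cabinet door (rotates) and a desk drawer (slides).
scene = [
    OpenablePart("door", (40.0, 60.0, 180.0, 300.0), 0.91,
                 "rotation", normalize((0.0, 2.0, 0.0)), (0.4, 0.0, 1.2)),
    OpenablePart("drawer", (220.0, 210.0, 380.0, 260.0), 0.34,
                 "translation", normalize((0.0, 0.0, 3.0)), (0.0, 0.0, 0.0)),
]
confident = keep_confident(scene)
```

The point of the sketch is the shape of the problem: in the multi-object setting the detector returns a variable-length list of parts that may belong to different objects, and each part needs its own motion estimate.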

Code & Models

Repositories

3dlg-hcvc/OPDMulti (PyTorch, official)

Models

3dlg-hcvc/opdmulti-motion-state-rgb-model (Hugging Face model, ♡ 1)

Taxonomy

Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Human Pose and Action Recognition