Hyper-Transformer for Amodal Completion

Jianxiong Gao; Xuelin Qian; Longfei Liang; Junwei Han; Yanwei Fu

arXiv:2405.19949 · cs.CV · May 31, 2024

TL;DR

The paper introduces the Hyper-Transformer Amodal Network (H-TAN), a hyper-transformer framework that learns shape priors directly for amodal object completion and outperforms existing methods on the KINS, COCOA-cls, and D2SA benchmarks.

Contribution

It proposes a hyper transformer with a dynamic convolution head that learns shape priors and predicts amodal masks directly, avoiding the two-stage pipelines and additional inputs that traditional methods rely on.

Findings

01. H-TAN outperforms existing methods on KINS, COCOA-cls, and D2SA datasets.

02. The hyper transformer effectively learns shape priors and enhances mask accuracy.

03. The model demonstrates robustness and stability across different benchmarks.

Abstract

Amodal object completion is a complex task that involves predicting the invisible parts of an object based on visible segments and background information. Learning shape priors is crucial for effective amodal completion, but traditional methods often rely on two-stage processes or additional information, leading to inefficiencies and potential error accumulation. To address these shortcomings, we introduce a novel framework named the Hyper-Transformer Amodal Network (H-TAN). This framework utilizes a hyper transformer equipped with a dynamic convolution head to directly learn shape priors and accurately predict amodal masks. Specifically, H-TAN uses a dual-branch structure to extract multi-scale features from both images and masks. The multi-scale features from the image branch guide the hyper transformer in learning shape priors and in generating the weights for dynamic convolution…
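
The idea of a hyper network generating the weights of a dynamic convolution can be illustrated with a small sketch. This is not the authors' implementation: the function names, the single linear projection standing in for the hyper transformer, the 1x1 kernel size, and the choice of which feature map the generated kernel is applied to are all assumptions made for illustration.

```python
import numpy as np

def generate_dynamic_weights(query, proj_w, proj_b, channels):
    """Map one query embedding to the parameters of a 1x1 convolution.

    Hypothetical stand-in for the hyper transformer output: a single
    linear layer that emits `channels` kernel weights plus one bias.
    """
    params = proj_w @ query + proj_b          # shape: (channels + 1,)
    return params[:channels], params[channels]

def dynamic_conv_mask(features, kernel, bias):
    """Apply a generated 1x1 kernel to a (C, H, W) feature map.

    Returns per-pixel mask probabilities of shape (H, W).
    """
    logits = np.tensordot(kernel, features, axes=([0], [0])) + bias
    return 1.0 / (1.0 + np.exp(-logits))      # sigmoid

rng = np.random.default_rng(0)
C, D, H, W = 8, 16, 4, 4                      # illustrative sizes only

mask_features = rng.standard_normal((C, H, W))  # assumed feature map
query = rng.standard_normal(D)                  # assumed per-object embedding
proj_w = rng.standard_normal((C + 1, D)) * 0.1  # hypothetical weight generator
proj_b = np.zeros(C + 1)

# The kernel is produced per input rather than stored as a fixed parameter.
kernel, bias = generate_dynamic_weights(query, proj_w, proj_b, C)
amodal_prob = dynamic_conv_mask(mask_features, kernel, bias)
amodal_mask = amodal_prob > 0.5
```

The point of the sketch is only that the convolution kernel depends on the input: a different query embedding yields a different kernel and therefore a different predicted mask from the same feature map.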

Taxonomy

Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Domain Adaptation and Few-Shot Learning

Methods: Convolution