AIMS: All-Inclusive Multi-Level Segmentation

Lu Qi; Jason Kuen; Weidong Guo; Jiuxiang Gu; Zhe Lin; Bo Du; Yu Xu; Ming-Hsuan Yang

arXiv:2305.17768·cs.CV·May 30, 2023·1 cites

TL;DR

This paper introduces AIMS, a unified multi-level image segmentation model that segments visual regions at three levels (part, entity, and relation). It is trained across multiple datasets and tasks to handle annotation inconsistency and task correlation, with region selection for image editing as the target application.

Contribution

The paper proposes a novel AIMS task and a unified multi-task model that effectively segments multi-level regions, handling annotation inconsistency and task correlation.

Findings

01 Outperforms state-of-the-art methods on multiple datasets.

02 Demonstrates strong generalization across different segmentation tasks.

03 Effective in complex image editing scenarios.

Abstract

Despite the progress of image segmentation for accurate visual entity segmentation, meeting the diverse requirements of image editing applications for region-of-interest selection at different levels remains an open problem. In this paper, we propose a new task, All-Inclusive Multi-Level Segmentation (AIMS), which segments visual regions into three levels: part, entity, and relation (two entities that share a semantic relationship). We also build a unified AIMS model through multi-dataset multi-task training to address the two major challenges of annotation inconsistency and task correlation. Specifically, we propose task complementarity, association, and prompt mask encoder for three-level predictions. Extensive experiments demonstrate the effectiveness and generalization capacity of our method compared to other state-of-the-art methods on a single dataset or the concurrent work on segmenting…
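The three levels named in the abstract can be pictured with a small data-structure sketch. This is illustrative only and is not taken from the paper or its repository: the `Entity` class, the `relation_mask` helper, the pixel-set mask representation, and the choice to model a relation region as the union of two entity masks are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for a binary mask: a set of (row, col) pixel coordinates.
Mask = frozenset


@dataclass
class Entity:
    """One entity-level region plus its (optional) part-level regions."""
    name: str
    mask: Mask
    parts: dict = field(default_factory=dict)  # part name -> Mask


def relation_mask(a: Entity, b: Entity) -> Mask:
    # Assumption for illustration: a relation-level region covers both
    # related entities, so it is modeled here as the union of their masks.
    return a.mask | b.mask


person = Entity(
    name="person",
    mask=Mask({(0, 0), (0, 1), (1, 0), (1, 1)}),
    parts={"head": Mask({(0, 0), (0, 1)})},
)
horse = Entity(name="horse", mask=Mask({(2, 0), (2, 1), (3, 0)}))

# One selectable region per level for the same image.
levels = {
    "part": person.parts["head"],
    "entity": person.mask,
    "relation": relation_mask(person, horse),
}
```

In this toy layout each level nests inside the next (part within entity, entity within relation), which is the kind of coarse-to-fine region selection an image editing tool might expose.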

Code & Models

Repositories

dvlab-research/Entity
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques