High-Quality Entity Segmentation

Lu Qi; Jason Kuen; Weidong Guo; Tiancheng Shen; Jiuxiang Gu; Jiaya Jia; Zhe Lin; Ming-Hsuan Yang

arXiv:2211.05776·cs.CV·April 4, 2023

TL;DR

This paper introduces a new high-quality entity segmentation dataset and a novel query-based Transformer method, CropFormer, that effectively fuses multi-view image information to improve dense segmentation accuracy in diverse, high-resolution images.

Contribution

The paper presents a new dataset focused on high-quality dense segmentation in the wild and a novel Transformer architecture, CropFormer, for improved multi-view mask fusion in high-resolution images.

Findings

01. CropFormer achieves a 1.9 AP improvement on entity segmentation.

02. The dataset enables better generalization across diverse domains.

03. CropFormer enhances traditional segmentation tasks.

Abstract

Dense image segmentation tasks (e.g., semantic, panoptic) are useful for image editing, but existing methods can hardly generalize well in an in-the-wild setting where there are unrestricted image domains, classes, and image resolution and quality variations. Motivated by these observations, we construct a new entity segmentation dataset, with a strong focus on high-quality dense segmentation in the wild. The dataset contains images spanning diverse image domains and entities, along with plentiful high-resolution images and high-quality mask annotations for training and testing. Given the high-quality and -resolution nature of the dataset, we propose CropFormer which is designed to tackle the intractability of instance-level segmentation on high-resolution images. It improves mask prediction by fusing high-res image crops that provide more fine-grained image details and the full image.…
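To make the idea of combining a full-image view with a high-resolution crop concrete, here is a minimal sketch. It is not CropFormer: the paper describes a query-based Transformer, whereas this toy example simply averages per-pixel mask scores from the two views. All names (`upsample_nearest`, `fuse_crop_into_full`, `binarize`, `crop_weight`) and the averaging rule are assumptions made for illustration.

```python
from typing import List, Tuple

Grid = List[List[float]]
Box = Tuple[int, int, int, int]  # (top, left, height, width) in full-image pixels


def upsample_nearest(scores: Grid, out_h: int, out_w: int) -> Grid:
    """Nearest-neighbour upsampling of a low-resolution score map."""
    in_h, in_w = len(scores), len(scores[0])
    return [
        [scores[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def fuse_crop_into_full(
    full_scores: Grid, crop_scores: Grid, box: Box, crop_weight: float = 0.5
) -> Grid:
    """Blend crop scores into the full-image score map inside the crop box.

    Hypothetical fusion rule: a weighted average of the two views.
    Pixels outside the box keep the full-image score unchanged.
    """
    top, left, height, width = box
    fused = [row[:] for row in full_scores]  # copy so the input is not mutated
    for r in range(height):
        for c in range(width):
            full_value = full_scores[top + r][left + c]
            fused[top + r][left + c] = (
                (1.0 - crop_weight) * full_value + crop_weight * crop_scores[r][c]
            )
    return fused


def binarize(scores: Grid, threshold: float = 0.5) -> List[List[int]]:
    """Turn per-pixel scores into a binary mask."""
    return [[1 if s >= threshold else 0 for s in row] for row in scores]


# Toy data: a coarse 2x2 full-image prediction, upsampled to 4x4.
low_res_full = [[0.6, 0.2], [0.2, 0.2]]
full_scores = upsample_nearest(low_res_full, 4, 4)

# A crop covering the top-left 2x2 region, predicted at full resolution.
# It disagrees with the coarse view on one pixel (bottom-right of the crop).
crop_scores = [[0.9, 0.9], [0.9, 0.1]]
fused_mask = binarize(fuse_crop_into_full(full_scores, crop_scores, (0, 0, 2, 2)))
```

In this toy case the coarse full-image view marks the whole top-left 2x2 block as foreground, while the crop view removes one pixel from it after fusion. That is the kind of fine-grained correction a higher-resolution crop can supply.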

Code & Models

Repositories

qqlu/Entity/tree/main/Entityv2
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Dense Connections · Position-Wise Feed-Forward Layer · Linear Layer · Label Smoothing · Softmax · Adam · Absolute Position Encodings · Layer Normalization