Universal Segmentation at Arbitrary Granularity with Language Instruction

Yong Liu; Cairong Zhang; Yitong Wang; Jiahao Wang; Yujiu Yang; Yansong Tang

arXiv:2312.01623 · cs.CV · November 27, 2024 · 2 cites

TL;DR

This paper introduces UniLSeg, a universal segmentation model guided by language instructions, capable of segmenting at any semantic level across diverse tasks without task-specific retraining.

Contribution

The paper presents UniLSeg, a versatile segmentation model trained on a unified data format, enabling accurate segmentation at arbitrary granularity guided by language instructions.

Findings

1. Outperforms specialist segmentation models on various tasks
2. Utilizes an automatic annotation engine for unlabeled data
3. Achieves high accuracy across diverse segmentation scenarios

Abstract

This paper aims to achieve universal segmentation at arbitrary semantic levels. Despite significant progress in recent years, specialist segmentation approaches are limited to specific tasks and data distributions. Retraining a new model to adapt to new scenarios or settings is expensive in both computation and time, which raises the demand for a versatile and universal segmentation model that can cater to various granularities. Although some attempts have been made to unify different segmentation tasks or to generalize to various scenarios, limitations in the definition of paradigms and input-output spaces make it difficult for them to achieve an accurate understanding of content at arbitrary granularity. To this end, we present UniLSeg, a universal segmentation model that can perform segmentation at any semantic level with the guidance of language instructions. For training UniLSeg,…

Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning