SegGPT: Segmenting Everything In Context

Xinlong Wang; Xiaosong Zhang; Yue Cao; Wen Wang; Chunhua Shen; Tiejun Huang

arXiv:2304.03284·cs.CV·April 7, 2023·61 cites

Open Access · 3 Repos · 2 Models

TL;DR

SegGPT is a versatile generalist model capable of performing a wide range of image and video segmentation tasks through in-context learning, unifying various data formats into a single framework.

Contribution

It introduces a novel in-context learning approach for segmentation that unifies multiple tasks and data formats into a single model, enabling flexible and diverse segmentation capabilities.

Findings

01. Strong performance on in-domain and out-of-domain segmentation tasks
02. Effective few-shot and video segmentation capabilities
03. Unified framework for various segmentation tasks

Abstract

We present SegGPT, a generalist model for segmenting everything in context. We unify various segmentation tasks into a generalist in-context learning framework that accommodates different kinds of segmentation data by transforming them into the same format of images. The training of SegGPT is formulated as an in-context coloring problem with random color mapping for each data sample. The objective is to accomplish diverse tasks according to the context, rather than relying on specific colors. After training, SegGPT can perform arbitrary segmentation tasks in images or videos via in-context inference, such as object instance, stuff, part, contour, and text. SegGPT is evaluated on a broad range of tasks, including few-shot semantic segmentation, video object segmentation, semantic segmentation, and panoptic segmentation. Our results show strong capabilities in segmenting in-domain and…
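
The random color mapping described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `random_color_mapping`, the use of NumPy, and the choice to draw one uniformly random RGB color per label id are all assumptions made for illustration. What the sketch does show is the stated idea that the prompt mask and the target mask of one data sample share a palette that is re-sampled for every sample, so a model cannot rely on a fixed color meaning a fixed class and has to read the mapping from the context.

```python
import numpy as np


def random_color_mapping(label_maps, rng):
    """Color all label maps of ONE data sample with a shared random palette.

    Hypothetical helper for illustration only (not from the SegGPT code base).
    label_maps: list of integer arrays of shape (H, W) holding label ids.
    rng: a numpy.random.Generator.
    Returns (list of uint8 RGB arrays of shape (H, W, 3), {label_id: (r, g, b)}).
    """
    # Collect every label id that occurs anywhere in this sample.
    ids = np.unique(np.concatenate([m.ravel() for m in label_maps]))

    # Draw distinct 24-bit color codes, one per label id (assumption: uniform sampling).
    codes = rng.choice(256 ** 3, size=len(ids), replace=False)
    palette = np.stack(
        [(codes >> 16) & 255, (codes >> 8) & 255, codes & 255], axis=1
    ).astype(np.uint8)

    # searchsorted turns each label id into its row index in the palette,
    # so prompt and target are colored with the same mapping.
    colored = [palette[np.searchsorted(ids, m)] for m in label_maps]
    mapping = dict(zip(ids.tolist(), map(tuple, palette.tolist())))
    return colored, mapping
```

Calling the helper once per training sample with the prompt and target label maps together, for example `random_color_mapping([prompt, target], np.random.default_rng())`, gives a fresh palette each time while keeping the two masks consistent with each other.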

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications