OMG-Seg: Is One Model Good Enough For All Segmentation?

Xiangtai Li; Haobo Yuan; Wei Li; Henghui Ding; Size Wu; Wenwei Zhang; Yining Li; Kai Chen; Chen Change Loy

arXiv:2401.10229 · cs.CV · October 2, 2024 · 1 citation

TL;DR

OMG-Seg is a unified transformer-based model capable of handling a wide range of segmentation tasks, including image, video, open vocabulary, and interactive segmentation, with reduced computational costs.

Contribution

This work introduces OMG-Seg, the first unified model that efficiently addresses multiple segmentation tasks with a single architecture.

Findings

01. Supports over ten segmentation tasks simultaneously

02. Reduces computational and parameter overhead

03. Achieves satisfactory performance across diverse tasks

Abstract

In this work, we address various segmentation tasks, each traditionally tackled by distinct or partially unified models. We propose OMG-Seg, One Model that is Good enough to efficiently and effectively handle all the segmentation tasks, including image semantic, instance, and panoptic segmentation, as well as their video counterparts, open vocabulary settings, prompt-driven, interactive segmentation like SAM, and video object segmentation. To our knowledge, this is the first model to handle all these tasks in one model and achieve satisfactory performance. We show that OMG-Seg, a transformer-based encoder-decoder architecture with task-specific queries and outputs, can support over ten distinct segmentation tasks and yet significantly reduce computational and parameter overhead across various tasks and datasets. We rigorously evaluate the inter-task influences and correlations during…
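The abstract describes one encoder-decoder shared across tasks, with task-specific queries selecting what gets segmented. The toy sketch below illustrates only that general idea and is not the authors' implementation: the names `predict_masks`, `point_prompt_query`, `segment`, and `OBJECT_QUERIES`, the dot-product mask head, and all numeric values are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Vector = List[float]
Mask = List[List[int]]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def predict_masks(queries: List[Vector], pixels: List[List[Vector]]) -> List[Mask]:
    """Shared mask head (assumed): one binary mask per query,
    obtained by thresholding query-pixel dot products at zero."""
    return [
        [[1 if dot(q, px) > 0.0 else 0 for px in row] for row in pixels]
        for q in queries
    ]


def point_prompt_query(pixels: List[List[Vector]], row: int, col: int) -> Vector:
    """Hypothetical prompt encoder: reuse the embedding at the clicked pixel."""
    return list(pixels[row][col])


# Toy 2x2 "image" with 2-dimensional pixel embeddings (made-up values).
PIXELS = [
    [[1.0, -1.0], [1.0, -1.0]],
    [[-1.0, 1.0], [-1.0, 1.0]],
]

# Stand-in for learned object queries used by non-interactive tasks.
OBJECT_QUERIES = [[1.0, 0.0], [0.0, 1.0]]


def segment(
    task: str,
    pixels: List[List[Vector]],
    prompt: Optional[Tuple[int, int]] = None,
) -> List[Mask]:
    """Only the source of the queries depends on the task;
    the mask head is the same for every task."""
    if task == "interactive":
        if prompt is None:
            raise ValueError("interactive segmentation needs a (row, col) prompt")
        queries = [point_prompt_query(pixels, *prompt)]
    else:
        queries = OBJECT_QUERIES
    return predict_masks(queries, pixels)


panoptic_masks = segment("panoptic", PIXELS)
clicked_masks = segment("interactive", PIXELS, prompt=(1, 0))
```

In this sketch, switching from a panoptic-style call to an interactive one changes only how the queries are built, which is one plausible reading of how a single set of weights could serve many tasks.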

Code & Models

Repositories

lxtgh/omg-seg (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning

Methods: Segment Anything Model