General and Task-Oriented Video Segmentation

Mu Chen; Liulei Li; Wenguan Wang; Ruijie Quan; Yi Yang

arXiv:2407.06540·cs.CV·July 10, 2024

TL;DR

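GvSeg is a general video segmentation framework that handles four tasks (instance, semantic, panoptic, and exemplar-guided) with one identical architecture. It combines disentangled modeling of segment targets with task-specific query strategies, and it outperforms existing solutions across seven benchmark datasets.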

Contribution

The paper introduces a general, architecture-agnostic framework that models segmentation targets in terms of appearance, position, and shape, and on that basis adapts the query initialization, matching, and sampling strategies to each task.

Findings

01 Outperforms existing solutions on four video segmentation tasks (instance, semantic, panoptic, and exemplar-guided)

02 Effective across seven benchmark datasets

03 Unifies the four segmentation tasks in a single framework with an identical architectural design

Abstract

We present GvSeg, a general video segmentation framework for addressing four different video segmentation tasks (i.e., instance, semantic, panoptic, and exemplar-guided) while maintaining an identical architectural design. Currently, there is a trend towards developing general video segmentation solutions that can be applied across multiple tasks. This streamlines research endeavors and simplifies deployment. However, such a highly homogenized framework in current design, where each element maintains uniformity, could overlook the inherent diversity among different tasks and lead to suboptimal performance. To tackle this, GvSeg: i) provides a holistic disentanglement and modeling for segment targets, thoroughly examining them from the perspective of appearance, position, and shape, and on this basis, ii) reformulates the query initialization, matching and sampling strategies in…

Code & Models

Repositories

kagawa588/gvseg (Official)

Taxonomy

Topics: Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques