Joint Inductive and Transductive Learning for Video Object Segmentation

Yunyao Mao; Ning Wang; Wengang Zhou; Houqiang Li

arXiv:2108.03679 · cs.CV · August 10, 2021

TL;DR

This paper introduces a unified framework combining transductive and inductive learning for semi-supervised video object segmentation, leveraging their complementarity to improve accuracy and robustness without synthetic data.

Contribution

The paper proposes an integrated approach with two branches (a transduction branch and an induction branch) and a label encoder, strengthening spatio-temporal reasoning and discriminative target learning for video object segmentation.

Findings

01. Achieves new state-of-the-art results on multiple benchmarks.
02. Does not require synthetic training data.
03. Effectively combines transductive and inductive methods.

Abstract

Semi-supervised video object segmentation is the task of segmenting the target object in a video sequence given only a mask annotation in the first frame. The limited information available makes it an extremely challenging task. Most previous best-performing methods adopt matching-based transductive reasoning or online inductive learning. Nevertheless, they are either less discriminative for similar instances or make insufficient use of spatio-temporal information. In this work, we propose to integrate transductive and inductive learning into a unified framework to exploit the complementarity between them for accurate and robust video object segmentation. The proposed approach consists of two functional branches. The transduction branch adopts a lightweight transformer architecture to aggregate rich spatio-temporal cues while the induction branch performs online inductive…
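The two ideas named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it replaces the transformer with a single softmax attention over labelled memory pixels, replaces the online learner with a small logistic regression fitted by gradient descent, and fuses the two predictions by simple averaging, which is an assumption made only for illustration. All function names, the 2-D features, and the hyperparameters are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def transduce(query_feats, memory_feats, memory_labels, temperature=0.1):
    """Transductive step (toy): each query pixel attends over labelled
    memory pixels and receives the attention-weighted average label."""
    out = []
    for q in query_feats:
        weights = softmax([dot(q, m) / temperature for m in memory_feats])
        out.append(sum(w * y for w, y in zip(weights, memory_labels)))
    return out


def fit_target_model(feats, labels, steps=200, lr=0.5):
    """Inductive step (toy): fit a logistic-regression target model on the
    labelled pixels with plain gradient descent."""
    w = [0.0] * len(feats[0])
    b = 0.0
    n = len(feats)
    for _ in range(steps):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(feats, labels):
            err = sigmoid(dot(w, x) + b) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi / n
            grad_b += err / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def induce(query_feats, model):
    """Apply the learned target model to unlabelled query pixels."""
    w, b = model
    return [sigmoid(dot(w, x) + b) for x in query_feats]


# Hypothetical first-frame pixels: two target pixels, two background pixels.
memory_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
memory_labels = [1.0, 1.0, 0.0, 0.0]

# Hypothetical pixels from a later frame: one target-like, one background-like.
query_feats = [[0.95, 0.05], [0.05, 0.95]]

p_trans = transduce(query_feats, memory_feats, memory_labels)
p_ind = induce(query_feats, fit_target_model(memory_feats, memory_labels))

# Assumed fusion rule for this sketch only: average the two probabilities.
fused = [0.5 * (a + b) for a, b in zip(p_trans, p_ind)]
mask = [p > 0.5 for p in fused]
print(mask)
```

In this toy setting both branches agree, so the fused mask marks the first query pixel as target and the second as background. The sketch only shows the complementary roles the abstract describes: propagating labels from reference frames by matching, and learning a target-specific model online.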

Code & Models

Repositories

maoyunyao/joint (PyTorch, official)

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications