Hybrid-S2S: Video Object Segmentation with Recurrent Networks and Correspondence Matching

Fatemeh Azimi, Stanislav Frolov, Federico Raue, Joern Hees, and Andreas Dengel

arXiv:2010.05069·cs.CV·November 10, 2020·1 cites

TL;DR

This paper introduces HS2S, a hybrid RNN and correspondence matching architecture for one-shot video object segmentation, significantly improving accuracy and robustness over previous RNN-based methods, especially in challenging scenarios.

Contribution

The paper proposes a novel hybrid sequence-to-sequence model combining RNNs with correspondence matching to address drift and error propagation in VOS.

Findings

01. Achieves an improvement of 11.2 percentage points on Youtube-VOS over previous RNN-based methods.

02. Reduces drift and error propagation in RNN-based VOS.

03. Improves segmentation quality under occlusion and in long sequences.

Abstract

One-shot Video Object Segmentation (VOS) is the task of pixel-wise tracking an object of interest within a video sequence, where the segmentation mask of the first frame is given at inference time. In recent years, Recurrent Neural Networks (RNNs) have been widely used for VOS tasks, but they often suffer from limitations such as drift and error propagation. In this work, we study an RNN-based architecture and address some of these issues by proposing a hybrid sequence-to-sequence architecture named HS2S, utilizing a dual mask propagation strategy that allows incorporating the information obtained from correspondence matching. Our experiments show that augmenting the RNN with correspondence matching is a highly effective solution to reduce the drift problem. The additional information helps the model to predict more accurate masks and makes it robust against error propagation. We…
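To make the idea of correspondence matching concrete, the following is a minimal sketch and not the HS2S implementation. It assumes per-pixel feature maps are already available for a reference frame and the current frame, and it scores each current-frame pixel by its highest cosine similarity to any foreground pixel of the reference frame. The function name `correspondence_scores`, the array shapes, and the toy features are all hypothetical choices made for illustration.

```python
import numpy as np


def correspondence_scores(ref_feats, ref_mask, cur_feats, eps=1e-8):
    """Score current-frame pixels by similarity to reference foreground.

    ref_feats, cur_feats: arrays of shape (H, W, C) with per-pixel features.
    ref_mask: array of shape (H, W), nonzero where the object is located.
    Returns an (H, W) map holding, for each current pixel, the maximum
    cosine similarity to any foreground pixel of the reference frame.
    """
    ref_feats = np.asarray(ref_feats, dtype=float)
    cur_feats = np.asarray(cur_feats, dtype=float)
    ref_mask = np.asarray(ref_mask, dtype=bool)

    h, w, c = cur_feats.shape
    fg = ref_feats[ref_mask]              # (N_fg, C) foreground features
    if fg.shape[0] == 0:                  # empty mask: nothing to match
        return np.zeros((h, w))

    cur = cur_feats.reshape(-1, c)        # (H*W, C)
    # L2-normalise so the dot product equals cosine similarity.
    fg = fg / (np.linalg.norm(fg, axis=1, keepdims=True) + eps)
    cur = cur / (np.linalg.norm(cur, axis=1, keepdims=True) + eps)

    sim = cur @ fg.T                      # (H*W, N_fg) pairwise similarities
    return sim.max(axis=1).reshape(h, w)  # best match per current pixel


# Toy example: the object (feature [1, 0]) moves from the top-left
# pixel of the reference frame to the bottom-right pixel of the
# current frame; background pixels carry the feature [0, 1].
ref_feats = [[[1, 0], [0, 1]], [[0, 1], [0, 1]]]
ref_mask = [[1, 0], [0, 0]]
cur_feats = [[[0, 1], [0, 1]], [[0, 1], [1, 0]]]

scores = correspondence_scores(ref_feats, ref_mask, cur_feats)
best = np.unravel_index(np.argmax(scores), scores.shape)
```

Because such a score map is computed against a frame with a known mask rather than against the previous prediction, it does not accumulate errors over time. This is one intuition for why a matching cue can counteract the drift of a purely recurrent model. How HS2S feeds this information into its recurrent architecture is not specified in the excerpt above.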

Code & Models

Repositories

fatemehazimi990/HS2S (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Video Surveillance and Tracking Methods

Methods: VOS