Consistent Video Instance Segmentation with Inter-Frame Recurrent Attention

Quanzeng You; Jiang Wang; Peng Chu; Andre Abrantes; Zicheng Liu

arXiv:2206.07011 · cs.CV · June 15, 2022 · 1 citation

Open Access

TL;DR

This paper introduces an end-to-end video instance segmentation framework built on Inter-Frame Recurrent Attention. The framework improves the temporal consistency of instance masks while maintaining high segmentation quality, and it reports state-of-the-art accuracy on the YouTubeVIS datasets.

Contribution

The paper proposes an Inter-Frame Recurrent Attention mechanism that explicitly models temporal instance consistency between adjacent frames, together with the global temporal context, in video instance segmentation.

Findings

01 Significantly improves temporal instance consistency.

02 Achieves state-of-the-art accuracy on YouTubeVIS datasets.

03 Maintains high-quality segmentation masks.

Abstract

Video instance segmentation aims at predicting object segmentation masks for each frame, as well as associating the instances across multiple frames. Recent end-to-end video instance segmentation methods are capable of performing object segmentation and instance association together in a direct parallel sequence decoding/prediction framework. Although these methods generally predict higher quality object segmentation masks, they can fail to associate instances in challenging cases because they do not explicitly model the temporal instance consistency for adjacent frames. We propose a consistent end-to-end video instance segmentation framework with Inter-Frame Recurrent Attention to model both the temporal instance consistency for adjacent frames and the global temporal context. Our extensive experiments demonstrate that the Inter-Frame Recurrent Attention significantly improves temporal…
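The idea of keeping instances consistent across adjacent frames can be illustrated with a toy sketch. This is not the paper's Inter-Frame Recurrent Attention, whose details are not given on this page. It only assumes a generic setup in which a fixed set of instance queries is carried from one frame to the next and refreshed by scaled dot-product attention over that frame's features, so that query slot i is treated as the same instance throughout the clip. The names `attend` and `propagate_queries`, and the tiny 2-D features, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(queries, features):
    """Scaled dot-product attention with features used as both keys and values.

    Returns the updated queries and the attention weights per query.
    """
    scale = math.sqrt(len(queries[0]))
    dim = len(features[0])
    outputs, all_weights = [], []
    for q in queries:
        weights = softmax([dot(q, f) / scale for f in features])
        mixed = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


def propagate_queries(frames, init_queries):
    """Carry instance queries frame by frame (hypothetical helper).

    frames: list of frames, each a list of feature vectors (one per object).
    Returns, for every frame, the feature index each query slot attends to most.
    """
    queries = init_queries
    assignments = []
    for features in frames:
        # The queries from the previous frame decide what to look at in this frame.
        queries, weights = attend(queries, features)
        assignments.append([w.index(max(w)) for w in weights])
    return assignments


# Two objects whose order in the feature list swaps between frame 0 and frame 1.
frames = [
    [[4.0, 0.0], [0.0, 4.0]],
    [[0.0, 4.0], [4.0, 0.0]],
]
init_queries = [[4.0, 0.0], [0.0, 4.0]]
assignments = propagate_queries(frames, init_queries)
print(assignments)
```

In this toy clip each query slot keeps following the same object even though the objects change position in the per-frame feature list, which is the kind of instance association across frames that the abstract describes. A real model would learn the queries and features and predict masks from them; none of that is sketched here.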

Taxonomy

Topics: Video Analysis and Summarization · Visual Attention and Saliency Detection · Multimodal Machine Learning Applications