Pixel-Level Bijective Matching for Video Object Segmentation

Suhwan Cho; Heansung Lee; Minjung Kim; Sungjun Jang; Sangyoun Lee

arXiv:2110.01644 · cs.CV · November 15, 2021

TL;DR

This paper introduces a bijective pixel-level matching mechanism for video object segmentation that suppresses background distractors by keeping only mutual best matches between the query and reference frames, along with a mask embedding module that improves mask propagation.

Contribution

The bijective matching approach enforces strict mutual correspondence, which improves accuracy when the query frame contains background distractors that resemble the target object. The mask embedding module captures target position information more effectively.

Findings

01. Reduces background distractors in VOS tasks.

02. Improves mask propagation accuracy with historic mask embedding.

03. Demonstrates superior performance over existing methods.

Abstract

Semi-supervised video object segmentation (VOS) aims to track the designated objects present in the initial frame of a video at the pixel level. To fully exploit the appearance information of an object, pixel-level feature matching is widely used in VOS. Conventional feature matching runs in a surjective manner, i.e., only the best matches from the query frame to the reference frame are considered. Each location in the query frame refers to the optimal location in the reference frame regardless of how often each reference frame location is referenced. This works well in most cases and is robust against rapid appearance variations, but may cause critical errors when the query frame contains background distractors that look similar to the target object. To mitigate this concern, we introduce a bijective matching mechanism to find the best matches from the query frame to the reference…
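The difference between the two matching schemes can be illustrated with a toy similarity matrix. The following is a minimal sketch, not the authors' implementation (see the official repository for that). It assumes a precomputed similarity matrix `sim` of shape (reference pixels, query pixels), and it interprets "bijective" as a mutual-best-match check, following the TL;DR above. The function names `surjective_match` and `mutual_best_match` and the example values are hypothetical.

```python
import numpy as np


def surjective_match(sim):
    """Each query pixel picks its best reference pixel.

    sim has shape (num_ref, num_query). A reference pixel may be
    picked by any number of query pixels.
    """
    return sim.argmax(axis=0)


def mutual_best_match(sim):
    """Keep a query-to-reference match only if it is mutual.

    Illustrative assumption: a match survives when the chosen reference
    pixel also ranks that query pixel as its own best match.
    Unmatched query pixels are marked with -1.
    """
    best_ref = sim.argmax(axis=0)    # best reference pixel for each query pixel
    best_query = sim.argmax(axis=1)  # best query pixel for each reference pixel
    queries = np.arange(sim.shape[1])
    is_mutual = best_query[best_ref] == queries
    return np.where(is_mutual, best_ref, -1)


# Toy example: reference pixel 0 belongs to the target.
# Query pixel 0 is the true target, query pixel 1 is a look-alike distractor.
sim = np.array([
    [0.9, 0.8, 0.1],
    [0.2, 0.3, 0.7],
])

print(surjective_match(sim))   # distractor (query 1) also maps to reference 0
print(mutual_best_match(sim))  # distractor is left unmatched (-1)
```

In the surjective case the distractor at query pixel 1 is assigned to the target's reference pixel because nothing limits how often a reference location is referenced. The mutual check rejects it, since reference pixel 0 prefers query pixel 0.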

Code & Models

Repositories

suhwan-cho/bmvos (PyTorch, official)

Videos

Pixel-Level Bijective Matching for Video Object Segmentation · YouTube

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection · Advanced Neural Network Applications

Methods: VOS