Pixel-Level Bijective Matching for Video Object Segmentation
Suhwan Cho, Heansung Lee, Minjung Kim, Sungjun Jang, Sangyoun Lee

TL;DR
This paper introduces a bijective pixel-level matching mechanism for video object segmentation that reduces background distractors by ensuring mutual best matches, along with a mask embedding module to enhance mask propagation.
Contribution
The novel bijective matching approach enforces strict mutual correspondence, improving accuracy in challenging scenarios, and the mask embedding module captures target position information more effectively.
Findings
Reduces background distractors in VOS tasks.
Improves mask propagation accuracy by embedding historic (previous-frame) masks.
Demonstrates superior performance over existing methods.
Abstract
Semi-supervised video object segmentation (VOS) aims to track the designated objects present in the initial frame of a video at the pixel level. To fully exploit the appearance information of an object, pixel-level feature matching is widely used in VOS. Conventional feature matching runs in a surjective manner, i.e., only the best matches from the query frame to the reference frame are considered. Each location in the query frame refers to the optimal location in the reference frame regardless of how often each reference frame location is referenced. This works well in most cases and is robust against rapid appearance variations, but may cause critical errors when the query frame contains background distractors that look similar to the target object. To mitigate this concern, we introduce a bijective matching mechanism to find the best matches from the query frame to the reference…
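The difference between one-directional matching and matching that requires mutual agreement can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes cosine similarity between per-pixel feature vectors and uses a plain mutual-best-match filter as a stand-in for the bijective constraint described in the TL;DR. The function names (`cosine_similarity`, `surjective_match`, `mutual_match`) and the toy features are hypothetical.

```python
import numpy as np

def cosine_similarity(query, reference):
    """Return an (Nq, Nr) matrix of cosine similarities between pixel features."""
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    r = reference / np.linalg.norm(reference, axis=1, keepdims=True)
    return q @ r.T

def surjective_match(sim):
    """Each query pixel picks its best reference pixel, however often
    that reference pixel has already been picked."""
    return sim.argmax(axis=1)

def mutual_match(sim):
    """Keep a query-to-reference match only if the reference pixel also
    picks that query pixel as its best match; otherwise return -1.
    (Assumed simplification of a bijective constraint.)"""
    best_ref = sim.argmax(axis=1)    # best reference pixel per query pixel
    best_query = sim.argmax(axis=0)  # best query pixel per reference pixel
    query_idx = np.arange(sim.shape[0])
    keep = best_query[best_ref] == query_idx
    return np.where(keep, best_ref, -1)

# Toy features: reference pixel 0 is the target, reference pixel 1 is background.
reference = np.array([[1.0, 0.0],
                      [0.0, 1.0]])
# Query pixel 0 is the target, pixel 1 is a look-alike distractor, pixel 2 is background.
query = np.array([[1.0, 0.1],
                  [0.9, 0.3],
                  [0.0, 1.0]])

sim = cosine_similarity(query, reference)
print(surjective_match(sim))  # distractor (pixel 1) also maps to the target
print(mutual_match(sim))      # distractor is left unmatched (-1)
```

In this toy case the one-directional matcher assigns both the true target pixel and the distractor to the same reference target pixel, while the mutual filter keeps only the pair that agrees in both directions.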
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection · Advanced Neural Network Applications
Methods: VOS
