Video Mask Transfiner for High-Quality Video Instance Segmentation
Lei Ke, Henghui Ding, Martin Danelljan, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

TL;DR
This paper introduces the Video Mask Transfiner (VMT), a transformer-based method for high-quality, temporally consistent video instance segmentation, along with a new dataset for benchmarking detailed masks.
Contribution
The paper makes three contributions: the VMT architecture, which leverages high-resolution features; an automated annotation refinement approach; and the HQ-YTVIS dataset for benchmarking detailed masks.
Findings
VMT achieves more detailed and stable masks.
Automated annotation refinement improves training data quality.
VMT outperforms recent methods on multiple benchmarks.
Abstract
While Video Instance Segmentation (VIS) has seen rapid progress, current approaches struggle to predict high-quality masks with accurate boundary details. Moreover, the predicted segmentations often fluctuate over time, suggesting that temporal consistency cues are neglected or not fully utilized. In this paper, we set out to tackle these issues, with the aim of achieving highly detailed and more temporally stable mask predictions for VIS. We first propose the Video Mask Transfiner (VMT) method, capable of leveraging fine-grained high-resolution features thanks to a highly efficient video transformer structure. Our VMT detects and groups sparse error-prone spatio-temporal regions of each tracklet in the video segment, which are then refined using both local and instance-level cues. Second, we identify that the coarse boundary annotations of the popular YouTube-VIS dataset constitute a…
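The idea of detecting sparse error-prone spatio-temporal regions per tracklet can be illustrated with a small sketch. This is not the authors' implementation: the selection rule (a foreground probability within a margin of 0.5), the function name `find_uncertain_points`, and the `margin` parameter are all assumptions made for illustration, since this summary does not describe the actual detector. The sketch only shows how a few uncertain positions across frames can be gathered into one sparse set for later refinement.

```python
from typing import List, Tuple

Frame = List[List[float]]      # H x W foreground probabilities for one frame
Point = Tuple[int, int, int]   # (t, y, x) spatio-temporal coordinate


def find_uncertain_points(tracklet: List[Frame], margin: float = 0.2) -> List[Point]:
    """Collect (t, y, x) positions whose probability lies within `margin` of 0.5.

    Hypothetical criterion for illustration only; the paper's own
    error-prone region detector is not specified in this summary.
    """
    points: List[Point] = []
    for t, frame in enumerate(tracklet):
        for y, row in enumerate(frame):
            for x, p in enumerate(row):
                if abs(p - 0.5) <= margin:
                    points.append((t, y, x))
    return points


# Toy tracklet: 2 frames, each a 2 x 3 grid of coarse mask probabilities.
tracklet = [
    [[0.95, 0.55, 0.05],
     [0.90, 0.40, 0.02]],
    [[0.97, 0.90, 0.45],
     [0.99, 0.65, 0.10]],
]

points = find_uncertain_points(tracklet)
total = sum(len(row) for frame in tracklet for row in frame)
print(points)
print(f"{len(points)} of {total} positions selected for refinement")
```

In this toy input only 4 of the 12 positions are selected, which is the point of a sparse approach: a refinement module would need to process only the selected positions rather than every pixel of every frame.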
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Image Enhancement Techniques
Methods: Test
