Anchor Diffusion for Unsupervised Video Object Segmentation
Zhao Yang, Qiang Wang, Luca Bertinetto, Weiming Hu, Song Bai, Philip H.S. Torr

TL;DR
This paper introduces Anchor Diffusion, a simple method for unsupervised video object segmentation that models long-term dependencies via dense pixel correspondence to improve accuracy and consistency over time.
Contribution
The paper proposes an anchor-based diffusion technique that models long-term pixel dependencies without online supervision and reports the top result among unsupervised methods on DAVIS-2016.
Findings
Achieves 81.7% mean IoU on DAVIS-2016, ranking first among unsupervised methods.
Outperforms traditional recurrent and optical flow-based approaches.
Demonstrates competitive results on FBMS and ViSal datasets.
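The mean IoU figure above is an average of per-frame intersection-over-union (Jaccard) scores between predicted and ground-truth masks. The following is a minimal sketch of that metric, not the official DAVIS evaluation code: the function name `mean_iou` is hypothetical, and scoring a frame as 1.0 when both masks are empty is an assumed convention.

```python
import numpy as np

def mean_iou(pred_masks, gt_masks):
    """Average per-frame IoU between binary masks (illustrative helper)."""
    ious = []
    for pred, gt in zip(pred_masks, gt_masks):
        pred = np.asarray(pred, dtype=bool)
        gt = np.asarray(gt, dtype=bool)
        inter = np.logical_and(pred, gt).sum()
        union = np.logical_or(pred, gt).sum()
        # Assumed convention: two empty masks count as a perfect match.
        ious.append(1.0 if union == 0 else inter / union)
    return float(np.mean(ious))

# Frame 1: prediction covers 2 pixels, ground truth 1 of them (IoU = 1/2).
# Frame 2: prediction equals ground truth (IoU = 1).
pred = [np.array([[1, 1], [0, 0]]), np.array([[0, 1], [0, 1]])]
gt = [np.array([[1, 0], [0, 0]]), np.array([[0, 1], [0, 1]])]
score = mean_iou(pred, gt)
```

Benchmark toolkits typically aggregate per sequence before averaging over the dataset, so this flat per-frame mean is only an approximation of how a leaderboard number is computed.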
Abstract
Unsupervised video object segmentation has often been tackled by methods based on recurrent neural networks and optical flow. Despite their complexity, these kinds of approaches tend to favour short-term temporal dependencies and are thus prone to accumulating inaccuracies, which cause drift over time. Moreover, simple (static) image segmentation models, alone, can perform competitively against these methods, which further suggests that the way temporal dependencies are modelled should be reconsidered. Motivated by these observations, in this paper we explore simple yet effective strategies to model long-term temporal dependencies. Inspired by the non-local operators of [70], we introduce a technique to establish dense correspondences between pixel embeddings of a reference "anchor" frame and the current one. This allows the learning of pairwise dependencies at arbitrarily long…
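The dense correspondence idea in the abstract can be illustrated with a generic non-local (attention-style) operator: every pixel embedding of the current frame is compared with every pixel embedding of the anchor frame, and the resulting similarities are used to aggregate anchor features. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function name `anchor_diffusion`, the dot-product similarity, the `1/sqrt(C)` scaling, and the row-wise softmax are illustrative choices.

```python
import numpy as np

def anchor_diffusion(anchor_feat, current_feat):
    """Aggregate anchor-frame features for each current-frame pixel.

    anchor_feat, current_feat: arrays of shape (C, H, W).
    Returns the aggregated features (C, H, W) and the (N, N) weights,
    where N = H * W.
    """
    c, h, w = current_feat.shape
    a = np.asarray(anchor_feat, dtype=float).reshape(c, -1)   # (C, N)
    x = np.asarray(current_feat, dtype=float).reshape(c, -1)  # (C, N)

    # Pairwise similarity: rows index current pixels, columns anchor pixels.
    sim = x.T @ a / np.sqrt(c)                                # (N, N)

    # Numerically stable softmax over anchor pixels (assumed normalisation).
    sim = sim - sim.max(axis=1, keepdims=True)
    weights = np.exp(sim)
    weights /= weights.sum(axis=1, keepdims=True)

    # Each current pixel receives a weighted sum of anchor embeddings.
    out = a @ weights.T                                       # (C, N)
    return out.reshape(c, h, w), weights

rng = np.random.default_rng(0)
anchor = rng.standard_normal((8, 4, 5))
current = rng.standard_normal((8, 4, 5))
diffused, weights = anchor_diffusion(anchor, current)
```

Because the weights depend only on the anchor frame and the current frame, the cost and the quality of the correspondence do not depend on how many frames separate them, which is the property the abstract contrasts with recurrent and optical-flow pipelines.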
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
