Decoupling Features in Hierarchical Propagation for Video Object Segmentation
Zongxin Yang, Yi Yang

TL;DR
This paper introduces DeAOT, a novel hierarchical propagation method that decouples object-agnostic and object-specific features, significantly improving accuracy and efficiency in semi-supervised video object segmentation.
Contribution
DeAOT decouples feature propagation into two independent branches and proposes an efficient Gated Propagation Module, advancing state-of-the-art in video object segmentation.
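As a rough illustration of what propagating features in two branches can look like, the sketch below runs one propagation step in plain Python. It is not the authors' implementation. It assumes that the object-specific (ID) branch reuses the single-head attention weights computed on the object-agnostic (visual) features, and that a simple sigmoid gate scales the ID output. All function and variable names (`attention_weights`, `propagate_decoupled`, `gate_logits_id`, and so on) are hypothetical and chosen only for this example.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys):
    """Single-head scaled dot-product attention weights (one row per query)."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    return [
        softmax([scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys])
        for q in queries
    ]


def apply_weights(weights, values):
    """Weighted sum of value vectors for each row of attention weights."""
    dim = len(values[0])
    return [
        [sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
        for row in weights
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gate(features, gate_logits):
    """Element-wise sigmoid gating (an assumed, simplified gate)."""
    return [
        [sigmoid(g) * f for f, g in zip(f_row, g_row)]
        for f_row, g_row in zip(features, gate_logits)
    ]


def propagate_decoupled(cur_visual, mem_visual, mem_id, gate_logits_id):
    """One illustrative propagation step with separate visual and ID outputs.

    Assumption: attention is computed on visual features only, and the
    same weights are reused to read ID features from memory.
    """
    weights = attention_weights(cur_visual, mem_visual)
    new_visual = apply_weights(weights, mem_visual)  # object-agnostic output
    new_id = gate(apply_weights(weights, mem_id), gate_logits_id)  # object-specific output
    return weights, new_visual, new_id


# Toy example: one current-frame location, two memory locations.
cur_visual = [[1.0, 0.0]]
mem_visual = [[1.0, 0.0], [0.0, 1.0]]
mem_id = [[1.0, 0.0], [0.0, 1.0]]
gate_logits_id = [[0.0, 0.0]]  # sigmoid(0) = 0.5

weights, new_visual, new_id = propagate_decoupled(
    cur_visual, mem_visual, mem_id, gate_logits_id
)
```

In this toy setup the visual output stays free of ID information, while the ID output is read from memory using the same attention weights; a real model would learn the projections and gates rather than take them as inputs.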
Findings
DeAOT outperforms AOT in accuracy and speed.
Achieves state-of-the-art results on multiple benchmarks.
Maintains high performance without test-time augmentation.
Abstract
This paper focuses on developing a more effective method of hierarchical propagation for semi-supervised Video Object Segmentation (VOS). Based on vision transformers, the recently developed Associating Objects with Transformers (AOT) approach introduces hierarchical propagation into VOS and has shown promising results. Hierarchical propagation gradually propagates information from past frames to the current frame and transfers the current-frame feature from object-agnostic to object-specific. However, the increase in object-specific information inevitably leads to a loss of object-agnostic visual information in deep propagation layers. To solve this problem and further facilitate the learning of visual embeddings, this paper proposes a Decoupling Features in Hierarchical Propagation (DeAOT) approach. Firstly, DeAOT decouples the hierarchical propagation of object-agnostic…
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods
Methods: VOS
