Holistic Prototype Attention Network for Few-Shot VOS
Yin Tang, Tao Chen, Xiruo Jiang, Yazhou Yao, Guo-Sen Xie, and Heng-Tao Shen

TL;DR
This paper introduces HPAN, a network for few-shot video object segmentation that uses a prototype graph attention module and a bidirectional prototype attention module to improve segmentation accuracy by capturing inter-frame and inter-class correlations.
Contribution
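The paper proposes a holistic prototype attention network (HPAN) built from a prototype graph attention module (PGAM) and a bidirectional prototype attention module (BPAM), enhancing feature representation and inter-frame consistency for few-shot video object segmentation (FSVOS).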
Findings
HPAN outperforms existing methods on the YouTube-FSVOS dataset.
The prototype graph attention module improves local prototype representations.
The bidirectional prototype attention module enhances support-query and intra-frame consistency.
Abstract
Few-shot video object segmentation (FSVOS) aims to segment dynamic objects of unseen classes by resorting to a small set of support images that contain pixel-level object annotations. Existing methods have demonstrated that the domain agent-based attention mechanism is effective in FSVOS by learning the correlation between support images and query frames. However, the agent frame contains redundant pixel information and background noise, resulting in inferior segmentation performance. Moreover, existing methods tend to ignore inter-frame correlations in query videos. To alleviate the above dilemma, we propose a holistic prototype attention network (HPAN) for advancing FSVOS. Specifically, HPAN introduces a prototype graph attention module (PGAM) and a bidirectional prototype attention module (BPAM), transferring informative knowledge from seen to unseen classes. PGAM generates local…
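The abstract is cut off before it describes how PGAM and BPAM are built, so the following is only a minimal sketch of the general idea of prototype-based attention in few-shot segmentation, not the authors' method. It assumes masked average pooling to turn annotated support features into a compact prototype and plain scaled dot-product attention from query pixels to prototypes; the function names `masked_average_prototype` and `prototype_attention` are hypothetical and do not come from the paper.

```python
import numpy as np


def masked_average_prototype(features, mask):
    """Pool support features inside a binary mask into one prototype.

    features: array of shape (C, H, W)
    mask:     array of shape (H, W), 1 inside the object, 0 elsewhere
    returns:  array of shape (C,)
    """
    weights = mask.astype(features.dtype)
    total = weights.sum()
    if total == 0:
        # Empty mask: fall back to a zero prototype instead of dividing by zero.
        return np.zeros(features.shape[0], dtype=features.dtype)
    # (H, W) weights broadcast over the channel axis of (C, H, W).
    return (features * weights).sum(axis=(1, 2)) / total


def prototype_attention(query, prototypes):
    """Let each query pixel attend to a small set of prototypes.

    query:      array of shape (N, C), one row per query pixel
    prototypes: array of shape (K, C)
    returns:    (attended features of shape (N, C), weights of shape (N, K))
    """
    scale = np.sqrt(query.shape[1])
    logits = query @ prototypes.T / scale
    # Subtract the row-wise maximum for a numerically stable softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ prototypes, weights


# Toy example: a 2-channel, 2x2 support feature map with a top-row mask.
support = np.array([[[1.0, 2.0], [3.0, 4.0]],
                    [[10.0, 20.0], [30.0, 40.0]]])
support_mask = np.array([[1, 1], [0, 0]])

foreground = masked_average_prototype(support, support_mask)
background = masked_average_prototype(support, 1 - support_mask)
prototypes = np.stack([foreground, background])

# Reuse the same map as stand-in query features, flattened to (N, C).
query_pixels = support.reshape(2, -1).T
attended, attention = prototype_attention(query_pixels, prototypes)
```

Attending to a handful of prototypes rather than to every pixel of an agent frame keeps the attention matrix at N x K instead of N x N, which is one way to avoid the redundant pixel information and background noise the abstract mentions.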
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Visual Attention and Saliency Detection · Advanced Neural Network Applications
