Global Spectral Filter Memory Network for Video Object Segmentation
Yong Liu, Ran Yu, Jiahao Wang, Xinyuan Zhao, Yitong Wang, Yansong Tang, Yujiu Yang

TL;DR
This paper introduces GSFM, a novel spectral domain approach for video object segmentation that enhances intra-frame spatial dependencies, leading to improved accuracy and state-of-the-art performance on key benchmarks.
Contribution
The paper proposes the Global Spectral Filter Memory network (GSFM), which leverages spectral domain learning to improve intra-frame spatial interaction in video segmentation.
Findings
GSFM outperforms baseline methods on DAVIS and YouTube-VOS benchmarks.
Spectral domain learning enhances intra-frame spatial dependencies.
Low- and high-frequency modules improve encoder and decoder performance, respectively.
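The last finding can be illustrated with a small NumPy sketch that amplifies either the low or the high band of a feature map in the 2D Fourier domain. This is an assumption-laden illustration, not the paper's modules: the fixed radial mask and the names `radial_frequency`, `enhance_band`, `cutoff`, and `gain` are hypothetical, and the actual low- and high-frequency modules in GSFM are presumably learned rather than hand-set.

```python
import numpy as np

def radial_frequency(h, w):
    """Radial frequency of every 2D DFT bin, in cycles per sample (0 at DC)."""
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequencies, shape (h, 1)
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequencies, shape (1, w)
    return np.sqrt(fy ** 2 + fx ** 2)

def enhance_band(feat, cutoff=0.1, gain=2.0, band="low"):
    """Scale one frequency band of a real (C, H, W) feature map by `gain`.

    Hypothetical fixed mask: bins with radius <= cutoff count as "low",
    all other bins count as "high". Bins outside the chosen band are kept.
    """
    h, w = feat.shape[-2:]
    r = radial_frequency(h, w)
    mask = (r <= cutoff) if band == "low" else (r > cutoff)
    weights = np.where(mask, gain, 1.0)          # real, symmetric in frequency
    spec = np.fft.fft2(feat, axes=(-2, -1))      # to the spectral domain
    out = np.fft.ifft2(spec * weights, axes=(-2, -1))
    return out.real                              # imaginary part is round-off only
```

Under these assumptions, an encoder-side call would use `band="low"` to emphasize smooth, semantic content, and a decoder-side call would use `band="high"` to emphasize edges and fine detail.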
Abstract
This paper studies semi-supervised video object segmentation through boosting intra-frame interaction. Recent memory network-based methods focus on exploiting inter-frame temporal reference while paying little attention to intra-frame spatial dependency. Specifically, these segmentation models tend to be susceptible to interference from unrelated non-target objects in a given frame. To this end, we propose the Global Spectral Filter Memory network (GSFM), which improves intra-frame interaction by learning long-term spatial dependencies in the spectral domain. The key component of GSFM is the 2D (inverse) discrete Fourier transform for spatial information mixing. In addition, we empirically find that low-frequency features should be enhanced in the encoder (backbone) and high-frequency features in the decoder (segmentation head). We attribute this to the semantic information extracting role of the encoder and…
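The spatial mixing step named in the abstract can be sketched in a few lines of NumPy: transform a feature map with the 2D DFT, multiply it element-wise by a filter, and transform back. This is a minimal sketch under stated assumptions, not the authors' implementation. The function name `global_spectral_filter` and the filter layout (one complex weight per channel and frequency bin, which would be a learnable parameter in a trained network) are hypothetical.

```python
import numpy as np

def global_spectral_filter(feat, filt):
    """Mix spatial information of a real (C, H, W) feature map in the spectral domain.

    feat: real array of shape (C, H, W).
    filt: complex array of shape (C, H, W // 2 + 1), one weight per frequency bin
          (assumed layout; in a trained network this would be learned).
    """
    h, w = feat.shape[-2:]
    spec = np.fft.rfft2(feat, axes=(-2, -1))   # 2D DFT of a real input
    spec = spec * filt                         # element-wise spectral filtering
    return np.fft.irfft2(spec, s=(h, w), axes=(-2, -1))  # inverse 2D DFT

# Usage: an all-ones filter leaves the features unchanged.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))
identity = np.ones((3, 8, 8 // 2 + 1), dtype=complex)
y = global_spectral_filter(x, identity)
```

Because element-wise multiplication in the frequency domain corresponds to a circular convolution over the whole spatial grid, every output position can depend on every input position, which is what makes the interaction global within a frame.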
Taxonomy
Topics: Visual Attention and Saliency Detection · Video Surveillance and Tracking Methods · Image Enhancement Techniques
Methods: Memory Network
