SWEM: Towards Real-Time Video Object Segmentation with Sequential Weighted Expectation-Maximization
Zhihui Lin, Tianyu Yang, Maomao Li, Ziyu Wang, Chun Yuan, Wenhao Jiang, and Wei Liu

TL;DR
SWEM introduces a sequential weighted EM approach to reduce memory redundancy and improve efficiency in real-time semi-supervised video object segmentation, achieving high speed and accuracy.
Contribution
The paper proposes SWEM, a novel method that merges intra- and inter-frame features using weighted EM, maintaining fixed memory size for efficient and accurate VOS.
Findings
Achieves 36 FPS in VOS tasks.
Attains 84.3% J&F score on DAVIS 2017.
Reduces memory redundancy while maintaining high performance.
Abstract
Matching-based methods, especially those based on space-time memory, are significantly ahead of other solutions in semi-supervised video object segmentation (VOS). However, continuously growing and redundant template features lead to an inefficient inference. To alleviate this, we propose a novel Sequential Weighted Expectation-Maximization (SWEM) network to greatly reduce the redundancy of memory features. Different from the previous methods which only detect feature redundancy between frames, SWEM merges both intra-frame and inter-frame similar features by leveraging the sequential weighted EM algorithm. Further, adaptive weights for frame features endow SWEM with the flexibility to represent hard samples, improving the discrimination of templates. Besides, the proposed method maintains a fixed number of template features in memory, which ensures the stable inference complexity of the…
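The idea of folding each new frame's features into a fixed number of templates with a weighted EM update can be sketched roughly as follows. This is a minimal NumPy illustration and not the authors' implementation: the class name `SequentialMemory`, the helper names, the softmax `temperature`, the iteration count, and the random initialisation are all assumptions made for the example, and the per-feature weights are simply passed in rather than computed adaptively as the abstract describes.

```python
import numpy as np


def l2norm(x, eps=1e-8):
    """Normalise vectors along the last axis to unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def weighted_em_stats(feats, weights, bases, temperature=10.0):
    """One E-step plus the weighted sufficient statistics for the M-step.

    feats: (N, C) frame features, weights: (N,) positive feature weights,
    bases: (K, C) current templates. `temperature` is a hypothetical knob.
    """
    logits = temperature * feats @ bases.T            # (N, K) similarities
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    resp = np.exp(logits)
    resp /= resp.sum(axis=1, keepdims=True)           # soft assignments
    weighted = resp * weights[:, None]                # apply feature weights
    return weighted.T @ feats, weighted.sum(axis=0)   # (K, C), (K,)


class SequentialMemory:
    """Keeps K templates and running statistics; size is fixed over time."""

    def __init__(self, num_bases, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.bases = l2norm(rng.standard_normal((num_bases, dim)))
        self.num = np.zeros((num_bases, dim))  # accumulated weighted sums
        self.den = np.zeros(num_bases)         # accumulated weights

    def update(self, feats, weights, iters=4):
        """Merge one frame into the memory; assumes iters >= 1."""
        feats = l2norm(feats)
        for _ in range(iters):
            num, den = weighted_em_stats(feats, weights, self.bases)
            total_num = self.num + num          # past frames + this frame
            total_den = self.den + den
            self.bases = l2norm(total_num / (total_den[:, None] + 1e-8))
        self.num, self.den = total_num, total_den
```

Calling `update` once per frame leaves `bases` at shape `(K, C)` regardless of how many frames have been processed, which is the property that keeps matching cost constant in this sketch, in contrast to a memory that appends every frame's features.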
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications · Visual Attention and Saliency Detection
Methods: VOS
