RMem: Restricted Memory Banks Improve Video Object Segmentation
Junbao Zhou, Ziqi Pang, Yu-Xiong Wang

TL;DR
This paper introduces RMem, a simple approach that restricts the size of the memory bank in video object segmentation (VOS). Reducing the confusion caused by redundant information improves accuracy and yields state-of-the-art results.
Contribution
The paper shows that limiting memory banks to a small number of essential frames improves VOS performance. It also introduces a temporal positional embedding that improves temporal reasoning.
Findings
Restricted memory banks improve VOS accuracy.
RMem achieves state-of-the-art results on the VOST and Long Videos datasets.
Restricting memory reduces training-inference discrepancy.
Abstract
With recent video object segmentation (VOS) benchmarks evolving to challenging scenarios, we revisit a simple but overlooked strategy: restricting the size of memory banks. This diverges from the prevalent practice of expanding memory banks to accommodate extensive historical information. Our specially designed "memory deciphering" study offers a pivotal insight underpinning such a strategy: expanding memory banks, while seemingly beneficial, actually increases the difficulty for VOS modules to decode relevant features due to the confusion from redundant information. By restricting memory banks to a limited number of essential frames, we achieve a notable improvement in VOS accuracy. This process balances the importance and freshness of frames to maintain an informative memory bank within a bounded capacity. Additionally, restricted memory banks reduce the training-inference discrepancy…
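The abstract describes keeping the memory bank within a bounded capacity by balancing the importance and freshness of frames. The following is a minimal Python sketch of that idea, not the authors' implementation. The class names, the `importance` score, the `freshness_weight` parameter, the reciprocal-age freshness term, and the choice to never evict the first frame are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryEntry:
    frame_idx: int
    importance: float  # hypothetical relevance score supplied by the caller


@dataclass
class RestrictedMemoryBank:
    capacity: int
    freshness_weight: float = 1.0  # assumed trade-off knob, not from the paper
    entries: list = field(default_factory=list)

    def __post_init__(self):
        # One slot is reserved for the first frame, so at least two are needed.
        if self.capacity < 2:
            raise ValueError("capacity must be at least 2")

    def _score(self, entry, current_idx):
        # Assumed freshness term: 1.0 for the newest frame, decaying with age.
        age = current_idx - entry.frame_idx
        freshness = 1.0 / (1.0 + age)
        return entry.importance + self.freshness_weight * freshness

    def add(self, frame_idx, importance):
        self.entries.append(MemoryEntry(frame_idx, importance))
        if len(self.entries) > self.capacity:
            # Assumption: the first stored frame is always kept.
            candidates = self.entries[1:]
            victim = min(candidates, key=lambda e: self._score(e, frame_idx))
            self.entries.remove(victim)

    def frame_indices(self):
        return [e.frame_idx for e in self.entries]


if __name__ == "__main__":
    bank = RestrictedMemoryBank(capacity=3)
    for idx, imp in [(0, 1.0), (1, 0.1), (2, 0.9), (3, 0.2), (4, 0.5)]:
        bank.add(idx, imp)
    print(bank.frame_indices())  # prints [0, 2, 4]
```

In this sketch the bank never holds more than `capacity` frames. Each time a new frame pushes it over the limit, the entry with the lowest combined importance and freshness score is dropped, so old frames survive only if their importance is high.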
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection · Advanced Neural Network Applications
Methods: VOS
