Adaptive Memory Management for Video Object Segmentation

Ali Pourganjalikhan; Charalambos Poullis

arXiv:2204.06626·cs.CV·April 15, 2022

TL;DR

This paper introduces an adaptive memory management strategy for video object segmentation that maintains high accuracy while significantly improving inference speed by discarding obsolete features based on their importance.

Contribution

It proposes a novel adaptive memory bank approach that dynamically manages stored features, enabling efficient segmentation of videos of arbitrary length without sacrificing performance.

Findings

1. Outperforms fixed-size memory strategies on the DAVIS and YouTube-VOS datasets.

2. Increases inference speed by up to 80%.

3. Achieves accuracy comparable to that of growing memory banks.

Abstract

Matching-based networks have achieved state-of-the-art performance on video object segmentation (VOS) tasks by storing every k-th frame in an external memory bank for future inference. Storing the predictions for these intermediate frames gives the network richer cues for segmenting an object in the current frame. However, the memory bank grows with the length of the video, which slows down inference and makes it impractical to handle videos of arbitrary length. This paper proposes an adaptive memory bank strategy for matching-based networks for semi-supervised VOS that can handle videos of arbitrary length by discarding obsolete features. Features are indexed by their importance in segmenting the objects in previous frames. Based on this index, unimportant features are discarded to make room for new ones. We present…
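The eviction idea described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the excerpt does not say how the importance index is computed, so the sketch assumes, purely for illustration, that an entry's importance is the softmax matching weight it has accumulated over past queries. The names `MemoryEntry`, `AdaptiveMemoryBank`, `read`, and `write` are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class MemoryEntry:
    key: list                 # feature vector used for matching
    importance: float = 0.0   # ASSUMPTION: accumulated matching weight


@dataclass
class AdaptiveMemoryBank:
    capacity: int             # fixed upper bound on stored features
    entries: list = field(default_factory=list)

    def read(self, query):
        """Match a query against all keys and credit each entry with its weight."""
        if not self.entries:
            return []
        scores = [sum(q * k for q, k in zip(query, e.key)) for e in self.entries]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [x / total for x in exps]
        for entry, weight in zip(self.entries, weights):
            entry.importance += weight  # hypothetical importance update
        return weights

    def write(self, key):
        """Store a new feature, evicting the least important entry when full."""
        if len(self.entries) >= self.capacity:
            victim = min(range(len(self.entries)),
                         key=lambda i: self.entries[i].importance)
            self.entries.pop(victim)
        self.entries.append(MemoryEntry(key=list(key)))


bank = AdaptiveMemoryBank(capacity=2)
bank.write([1.0, 0.0])
bank.write([0.0, 1.0])
bank.read([5.0, 0.0])    # the first entry matches this query far better
bank.write([1.0, 1.0])   # bank is full, so the weakly matched entry is dropped
remaining_keys = [e.key for e in bank.entries]
```

Because the bank never exceeds `capacity`, the cost of each `read` stays bounded regardless of video length, which is the property the abstract contrasts with a growing memory bank. In this sketch a newly written entry starts at zero importance, so it relies on at least one `read` happening between writes; a real system would need some policy for protecting fresh features.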

Code & Models

Repositories

alipga/AMM_VOS (PyTorch, official)

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings