SAM3-DMS: Decoupled Memory Selection for Multi-target Video Segmentation of SAM3

Ruiqi Shen; Chang Liu; Henghui Ding

arXiv:2601.09699·cs.CV·January 15, 2026

TL;DR

SAM3-DMS is a training-free, decoupled memory selection method for multi-target video segmentation with SAM3. It selects memory per object instead of per group, which improves identity preservation and tracking stability, with the advantage growing as the number of targets in the scene increases.

Contribution

It proposes a training-free, decoupled memory selection strategy for SAM3, enhancing multi-object segmentation performance in complex scenarios.

Findings

01 Improves identity preservation in multi-target segmentation.

02 Achieves more stable tracking with increased target density.

03 Outperforms original SAM3 in complex multi-object scenes.

Abstract

Segment Anything 3 (SAM3) has established a powerful foundation that robustly detects, segments, and tracks specified targets in videos. However, in its original implementation, its group-level collective memory selection is suboptimal for complex multi-object scenarios, as it employs a synchronized decision across all concurrent targets conditioned on their average performance, often overlooking individual reliability. To this end, we propose SAM3-DMS, a training-free decoupled strategy that utilizes fine-grained memory selection on individual objects. Experiments demonstrate that our approach achieves robust identity preservation and tracking stability. Notably, our advantage becomes more pronounced with increased target density, establishing a solid foundation for simultaneous multi-target video segmentation in the wild.
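
The contrast between a group-level decision and a decoupled, per-object decision can be illustrated with a minimal sketch. This is not the authors' implementation or SAM3's API: the per-object reliability scores, the `threshold` value, and both function names are hypothetical, chosen only to show why a decision conditioned on the average can overlook an unreliable individual target.

```python
from statistics import mean

# Hypothetical per-object reliability scores for one frame (not from the paper).
frame_scores = {"obj_1": 0.9, "obj_2": 0.8, "obj_3": 0.1}


def group_level_select(scores, threshold=0.5):
    """One synchronized decision: every object inherits the verdict of the average score."""
    keep = mean(scores.values()) >= threshold
    return {obj_id: keep for obj_id in scores}


def decoupled_select(scores, threshold=0.5):
    """Independent decisions: each object is judged on its own score."""
    return {obj_id: score >= threshold for obj_id, score in scores.items()}


group = group_level_select(frame_scores)
decoupled = decoupled_select(frame_scores)

print("group-level:", group)
print("decoupled:  ", decoupled)
```

In this toy frame the average score clears the assumed threshold, so the group-level rule would add the frame to memory for all three objects, including `obj_3`, whose own score is low. The decoupled rule keeps the frame for `obj_1` and `obj_2` only.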

Taxonomy

Topics: Video Surveillance and Tracking Methods · Visual Attention and Saliency Detection · Advanced Neural Network Applications