Segment Anything Meets Point Tracking

Frano Rajič; Lei Ke; Yu-Wing Tai; Chi-Keung Tang; Martin Danelljan; Fisher Yu

arXiv:2307.01197 · cs.CV · December 5, 2023 · 38 cites

TL;DR

SAM-PT introduces a novel point-centric approach for interactive video segmentation, leveraging long-term point tracking with SAM to improve zero-shot performance and interaction efficiency across multiple benchmarks.

Contribution

The paper proposes SAM-PT, a new method that uses point propagation for video segmentation, exploiting local structure information independently of object semantics.

Findings

01. Outperforms traditional mask propagation methods on multiple benchmarks.

02. Achieves better zero-shot performance in open-world video object segmentation.

03. Provides an efficient point-based tracking framework with publicly available code.

Abstract

The Segment Anything Model (SAM) has established itself as a powerful zero-shot image segmentation model, enabled by efficient point-centric annotation and prompt-based models. While click and brush interactions are both well explored in interactive image segmentation, the existing methods on videos focus on mask annotation and propagation. This paper presents SAM-PT, a novel method for point-centric interactive video segmentation, empowered by SAM and long-term point tracking. SAM-PT leverages robust and sparse point selection and propagation techniques for mask generation. Compared to traditional object-centric mask propagation strategies, we uniquely use point propagation to exploit local structure information agnostic to object semantics. We highlight the merits of point-based tracking through direct evaluation on the zero-shot open-world Unidentified Video Objects (UVO) benchmark.…
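The point-centric pipeline described in the abstract (select sparse points on the object, propagate them through the video, then generate a mask per frame from the propagated points) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: farthest-point sampling is a stand-in for whatever selection scheme the paper uses, and `track_points` and `segment_from_points` are hypothetical callables standing in for a long-term point tracker and a point-prompted segmenter such as SAM.

```python
import numpy as np


def select_query_points(mask, k, seed=0):
    """Pick up to k spread-out (x, y) points inside a binary mask.

    Uses farthest-point sampling, an assumed stand-in for the paper's
    point selection step.
    """
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        raise ValueError("mask is empty")
    coords = np.stack([xs, ys], axis=1).astype(np.float64)
    k = min(k, len(coords))
    rng = np.random.default_rng(seed)
    # Start from a random mask pixel, then repeatedly add the pixel that is
    # farthest from everything chosen so far.
    chosen = [int(rng.integers(len(coords)))]
    dist = np.linalg.norm(coords - coords[chosen[0]], axis=1)
    while len(chosen) < k:
        nxt = int(dist.argmax())
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(coords - coords[nxt], axis=1))
    return coords[chosen]


def propagate_masks(frames, first_mask, track_points, segment_from_points, k=8):
    """Propagate a first-frame mask through a video using points only.

    Hypothetical interfaces (assumptions, not a real library API):
      track_points(frames, points) -> (trajectories [T, K, 2], visible [T, K] bool)
      segment_from_points(frame, points) -> binary mask [H, W]
    """
    queries = select_query_points(first_mask, k)
    trajectories, visible = track_points(frames, queries)
    masks = []
    for t, frame in enumerate(frames):
        # Only points the tracker reports as visible are used as prompts.
        pts = trajectories[t][visible[t]]
        if len(pts) == 0:
            masks.append(np.zeros(first_mask.shape, dtype=bool))
        else:
            masks.append(segment_from_points(frame, pts))
    return masks
```

Because the mask in each frame is derived only from tracked point locations, nothing in this loop depends on what kind of object is being segmented, which is the sense in which the approach is agnostic to object semantics.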

Code & Models

Repositories

syscv/sam-pt
jax · Official

Taxonomy

Topics: Visual Attention and Saliency Detection · Image and Video Quality Assessment · Advanced Image and Video Retrieval Techniques

Methods: Segment Anything Model · Focus