When SAM2 Meets Video Shadow and Mirror Detection

Leiping Jie

arXiv:2412.19293·cs.CV·December 30, 2024

TL;DR

This paper evaluates SAM2's performance on video shadow and mirror detection tasks, revealing its limitations in segmenting rare objects and highlighting areas for future improvement.

Contribution

It is the first to assess SAM2 on video shadow and mirror detection, exposing its current shortcomings in these specialized segmentation tasks.

Findings

01

SAM2 performs suboptimally on video shadow and mirror detection tasks.

02

Point prompts yield lower performance than mask prompts.

03

The study provides insights into SAM2's limitations in rare object segmentation.

Abstract

As the successor to the Segment Anything Model (SAM), the Segment Anything Model 2 (SAM2) not only improves performance in image segmentation but also extends its capabilities to video segmentation. However, its effectiveness in segmenting rare objects that seldom appear in videos remains underexplored. In this study, we evaluate SAM2 on two distinct video segmentation tasks: Video Shadow Detection (VSD) and Video Mirror Detection (VMD). Specifically, we use ground truth point or mask prompts to initialize the first frame and then predict corresponding masks for subsequent frames. Experimental results show, both quantitatively and qualitatively, that SAM2's performance on these tasks is suboptimal, especially when point prompts are used. Code is available at https://github.com/LeipingJie/SAM2Video.
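
The prompting protocol described in the abstract can be illustrated with a small sketch. It is not taken from the paper's repository: the helper names `sample_point_prompt` and `iou`, the choice of a single randomly sampled foreground pixel as the point prompt, and the use of intersection-over-union as the score are all assumptions made for illustration. The SAM2 call that would turn the prompt into per-frame predictions is deliberately left out.

```python
import random
from typing import List, Tuple

# A binary mask as rows of 0/1 values (hypothetical representation).
Mask = List[List[int]]


def sample_point_prompt(mask: Mask, rng: random.Random) -> Tuple[int, int]:
    """Pick one foreground pixel (x, y) from a ground-truth mask.

    Assumption: the point prompt is a single positive click drawn
    uniformly from the annotated region of the first frame.
    """
    foreground = [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value
    ]
    if not foreground:
        raise ValueError("mask has no foreground pixels")
    return rng.choice(foreground)


def iou(pred: Mask, gt: Mask) -> float:
    """Intersection-over-union between two binary masks of equal size."""
    inter = sum(p and g for pr, gr in zip(pred, gt) for p, g in zip(pr, gr))
    union = sum(p or g for pr, gr in zip(pred, gt) for p, g in zip(pr, gr))
    # Two empty masks agree perfectly by convention.
    return inter / union if union else 1.0


# First-frame ground truth and a predicted mask for a later frame (toy data).
first_frame_gt = [[0, 1, 1],
                  [0, 1, 0]]
later_frame_pred = [[0, 1, 0],
                    [0, 1, 1]]

point = sample_point_prompt(first_frame_gt, random.Random(0))
score = iou(later_frame_pred, first_frame_gt)
```

A mask prompt would pass `first_frame_gt` itself instead of `point`, which gives the model the full object extent rather than a single location.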

Code & Models

Repositories

leipingjie/sam2video (PyTorch, official)

Taxonomy

Topics: Digital Media Forensic Detection · Infrared Target Detection Methodologies · Face recognition and analysis