Benchmarking Human and Automated Prompting in the Segment Anything Model

Jorge Quesada, Zoe Fowler, Mohammad Alotaibi, Mohit Prabhushankar, and Ghassan AlRegib

arXiv:2410.22048 · cs.CV · November 1, 2024

TL;DR

This paper benchmarks human versus automated prompts in the Segment Anything Model, revealing a performance gap and exploring how finetuning and prompt features can enhance segmentation accuracy.

Contribution

It introduces a comprehensive benchmarking framework comparing human and automated prompts, and demonstrates how finetuning improves automated prompt effectiveness.

Findings

01 Humans outperform automated prompts by approximately 29%.

02 Finetuning can improve automated prompt performance by up to 68%.

03 The paper identifies features with $R^2$ scores over 0.5 that influence prompting performance.

Abstract

The remarkable capabilities of the Segment Anything Model (SAM) for tackling image segmentation tasks in an intuitive and interactive manner have sparked interest in the design of effective visual prompts. Such interest has led to the creation of automated point prompt selection strategies, typically motivated from a feature extraction perspective. However, there is still very little understanding of how appropriate these automated visual prompting strategies are, particularly when compared to humans, across diverse image domains. Additionally, the performance benefits of including such automated visual prompting strategies within the finetuning process of SAM remain unexplored, as does the effect of interpretable factors, like the distance between the prompt points, on segmentation performance. To bridge these gaps, we leverage a recently released visual prompting dataset, PointPrompt,…
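The abstract names the distance between prompt points as an interpretable factor, and the findings report $R^2$ scores for such features. As a rough illustration of what that kind of analysis could look like, the sketch below computes a mean pairwise distance for a set of point prompts and the $R^2$ of a simple linear fit against a per-image segmentation score. Everything here is an assumption made for illustration: the function names are hypothetical, IoU is assumed as the score, the linear fit is a stand-in, and the demo data is synthetic. It is not the authors' pipeline and it reproduces none of the paper's numbers.

```python
import numpy as np


def mean_pairwise_distance(points):
    """Mean Euclidean distance over all unordered pairs of (x, y) prompt points."""
    pts = np.asarray(points, dtype=float)
    if len(pts) < 2:
        return 0.0  # a single point has no pairwise distance
    diffs = pts[:, None, :] - pts[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    upper = np.triu_indices(len(pts), k=1)  # count each pair once
    return float(dists[upper].mean())


def iou(pred, target):
    """Intersection over union of two binary masks (assumed metric)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return float(np.logical_and(pred, target).sum() / union)


def r_squared(feature, score):
    """R^2 of a least-squares line fitted from one feature to a score."""
    x = np.asarray(feature, dtype=float)
    y = np.asarray(score, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    ss_res = ((y - (slope * x + intercept)) ** 2).sum()
    ss_tot = ((y - y.mean()) ** 2).sum()
    if ss_tot == 0:
        return 0.0  # constant score: no variance to explain
    return float(1.0 - ss_res / ss_tot)


if __name__ == "__main__":
    # Synthetic stand-in data, NOT taken from PointPrompt or the paper.
    rng = np.random.default_rng(0)
    prompt_sets = [rng.uniform(0, 256, size=(5, 2)) for _ in range(50)]
    spread = [mean_pairwise_distance(p) for p in prompt_sets]
    fake_scores = rng.uniform(0.3, 0.9, size=50)
    print("R^2 (synthetic):", r_squared(spread, fake_scores))
```

With real data, `spread` would be computed from each image's recorded prompt points and `fake_scores` would be replaced by the IoU of the resulting SAM mask against the ground truth. A single-feature linear fit is the simplest choice; a multi-feature regression would be a natural extension.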

Code & Models

Repositories

olivesgatech/pointprompt (PyTorch, official)

Datasets

gOLIVES/SAM_PointPrompt_Dataset

Taxonomy

Topics: Human-Automation Interaction and Safety

Methods: Segment Anything Model