Evaluation of Segment Anything Model 2: The Role of SAM2 in the Underwater Environment
Shijie Lian, Hua Li

TL;DR
This paper evaluates Segment Anything Model 2 (SAM2) on underwater instance segmentation. SAM2 is strong when prompted with ground truth bounding boxes and weaker when run in automatic mode with point prompts.
Contribution
The paper provides the first comprehensive assessment of SAM2's effectiveness in underwater segmentation, emphasizing how strongly its results depend on the prompt type and its potential for marine science applications.
Findings
SAM2 performs well with ground truth bounding box prompts
Automatic mode with point prompts shows degraded performance underwater
Evaluation code and results are publicly available
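
The ground truth bounding box prompt in the first finding can be illustrated with a minimal sketch. It assumes binary masks stored as nested lists and uses plain mask IoU as a stand-in score; the helper names `mask_to_box` and `mask_iou` are hypothetical, and this is not the authors' evaluation code or the metric reported on UIIS and USIS10K. The SAM2 call itself is left out: the box would be handed to the model as the prompt, and the returned mask compared with the ground truth.

```python
from typing import List, Tuple

Mask = List[List[int]]  # binary mask: rows of 0/1 values (assumed layout)


def mask_to_box(mask: Mask) -> Tuple[int, int, int, int]:
    """Return the tight (x_min, y_min, x_max, y_max) box around foreground pixels."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, v in enumerate(row) if v]
    if not ys:
        raise ValueError("mask has no foreground pixels")
    return min(xs), min(ys), max(xs), max(ys)


def mask_iou(pred: Mask, gt: Mask) -> float:
    """Intersection over union of two binary masks of the same shape."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    return inter / union if union else 0.0


# Toy ground truth instance (an L-shaped object in a 4x4 image).
gt_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
box_prompt = mask_to_box(gt_mask)  # this box would be passed to the model as the prompt

# Hypothetical model output: here, simply the box filled in.
pred_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
score = mask_iou(pred_mask, gt_mask)
```

Because the box is derived from the annotation, this setting measures segmentation quality given a correct localization. Automatic mode has no such hint, which is consistent with the gap between the two findings above.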
Abstract
With breakthroughs in large-scale modeling, the Segment Anything Model (SAM) and its extensions have been applied to various underwater visualization tasks in marine sciences, and have had a significant impact on the academic community. Recently, Meta has further developed the Segment Anything Model 2 (SAM2), which significantly improves running speed and segmentation accuracy compared to its predecessor. This report aims to explore the potential of SAM2 in marine science by evaluating it on the underwater instance segmentation benchmark datasets UIIS and USIS10K. The experiments show that the performance of SAM2 is extremely dependent on the type of user-provided prompts. When using the ground truth bounding box as the prompt, SAM2 performs excellently in the underwater instance segmentation domain. However, when running in automatic mode, SAM2's ability with point…
Taxonomy
Topics: Human-Automation Interaction and Safety
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Segment Anything Model
