Prompting Foundation Models for Zero-Shot Ship Instance Segmentation in SAR Imagery
Islam Mansour, Francescopaolo Sica, and Michael Schmitt

TL;DR
This paper presents a zero-shot approach to ship instance segmentation in SAR imagery: a SAR-trained YOLOv11 detector supplies bounding-box prompts to SAM2, which produces instance masks without any pixel-level annotations and reaches a mean IoU of 0.637 on the SSDD benchmark.
Contribution
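The paper introduces a method that uses bounding boxes from a SAR-trained detector to prompt a general-purpose segmentation foundation model, eliminating the need for mask annotations and, unlike prior SAM-based SAR approaches, for fine-tuning or adapters.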
Findings
Achieves a mean IoU of 0.637 on the SSDD benchmark
Achieves an 89.2% ship detection rate
Eliminates the need for pixel-level mask annotations
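For readers unfamiliar with the headline metric, the following is a minimal sketch of how a mean IoU over predicted and reference masks can be computed. The function names, the nested-list mask representation, and the handling of empty masks are assumptions made for illustration; this summary does not describe the paper's exact evaluation protocol (for example, how predicted instances are matched to reference instances).

```python
from typing import Iterable, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # binary mask stored as rows of 0/1 values


def mask_iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union of two binary masks of equal shape."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Assumption: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def mean_iou(pairs: Iterable[Tuple[Mask, Mask]]) -> float:
    """Average IoU over already-matched (predicted, reference) mask pairs."""
    scores = [mask_iou(pred, truth) for pred, truth in pairs]
    if not scores:
        raise ValueError("mean_iou needs at least one mask pair")
    return sum(scores) / len(scores)


# Toy 2x2 masks: one shared pixel, three pixels in the union.
pred = [[1, 1], [0, 0]]
truth = [[1, 0], [1, 0]]
```

With these toy masks, `mask_iou(pred, truth)` is one third, since one pixel is shared and three pixels are covered by either mask.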
Abstract
Synthetic Aperture Radar (SAR) plays a critical role in maritime surveillance, yet deep learning for SAR analysis is limited by the lack of pixel-level annotations. This paper explores how general-purpose vision foundation models can enable zero-shot ship instance segmentation in SAR imagery, eliminating the need for pixel-level supervision. A YOLOv11-based detector trained on open SAR datasets localizes ships via bounding boxes, which then prompt the Segment Anything Model 2 (SAM2) to produce instance masks without any mask annotations. Unlike prior SAM-based SAR approaches that rely on fine-tuning or adapters, our method demonstrates that spatial constraints from a SAR-trained detector alone can effectively regularize foundation model predictions. This design partially mitigates the optical-SAR domain gap and enables downstream applications such as vessel classification, size…
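The detect-then-prompt flow described in the abstract can be sketched as below. This is an illustrative outline under stated assumptions, not the authors' implementation: `segment_ships`, `Instance`, and the `min_score` threshold are hypothetical names, and `toy_detect` and `toy_segment` are simple stand-ins for the YOLOv11 detector and SAM2 so that the example runs without either model installed.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]    # (x_min, y_min, x_max, y_max), inclusive pixels
Image = Sequence[Sequence[float]]  # single-channel amplitude image, row-major
Mask = List[List[int]]             # binary mask with the same shape as the image


@dataclass
class Instance:
    box: Box
    score: float
    mask: Mask


def segment_ships(
    image: Image,
    detect: Callable[[Image], List[Tuple[Box, float]]],
    segment_with_box: Callable[[Image, Box], Mask],
    min_score: float = 0.25,  # hypothetical confidence threshold
) -> List[Instance]:
    """Run a box detector, then prompt a segmenter once per detected box."""
    instances = []
    for box, score in detect(image):
        if score < min_score:
            continue  # drop low-confidence detections before prompting
        instances.append(Instance(box, score, segment_with_box(image, box)))
    return instances


def toy_detect(image: Image) -> List[Tuple[Box, float]]:
    """Stand-in for the SAR-trained detector: fixed boxes with scores."""
    return [((1, 1, 2, 2), 0.9), ((0, 0, 0, 0), 0.1)]


def toy_segment(image: Image, box: Box) -> Mask:
    """Stand-in for the promptable segmenter: bright pixels inside the box."""
    x0, y0, x1, y1 = box
    return [
        [1 if (x0 <= x <= x1 and y0 <= y <= y1 and v > 0.5) else 0
         for x, v in enumerate(row)]
        for y, row in enumerate(image)
    ]


image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.8, 0.0],
    [0.0, 0.7, 0.2, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
instances = segment_ships(image, toy_detect, toy_segment)
```

Passing the detector and segmenter in as callables keeps the sketch independent of any particular model API; in a real setup they would wrap the trained detector and the box-prompted segmentation model.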
