Semantic-Aware Ship Detection with Vision-Language Integration
Jiahao Li, Jiancheng Pan, Yuze Sun, Xiaomeng Huang

TL;DR
This paper introduces a semantic-aware ship detection framework that combines vision-language models with a multi-scale adaptive sliding window strategy, supported by the new ShipSem-VL dataset, to improve detection in complex remote sensing scenarios.
Contribution
The paper presents a framework that combines vision-language models with a multi-scale adaptive sliding window strategy, and introduces the ShipSem-VL dataset for fine-grained ship attribute detection.
Findings
Enhanced detection accuracy in complex scenes
Effective semantic attribute recognition
Comprehensive evaluation across multiple tasks
Abstract
Ship detection in remote sensing imagery is a critical task with wide-ranging applications, such as maritime activity monitoring, shipping logistics, and environmental studies. However, existing methods often struggle to capture fine-grained semantic information, limiting their effectiveness in complex scenarios. To address these challenges, we propose a novel detection framework that combines Vision-Language Models (VLMs) with a multi-scale adaptive sliding window strategy. To facilitate Semantic-Aware Ship Detection (SASD), we introduce ShipSem-VL, a specialized Vision-Language dataset designed to capture fine-grained ship attributes. We evaluate our framework through three well-defined tasks, providing a comprehensive analysis of its performance and demonstrating its effectiveness in advancing SASD from multiple perspectives.
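The abstract does not spell out how the sliding window strategy works, so the following is only a minimal sketch of generic multi-scale window generation over a large image, not the authors' method. The function names, window sizes, and overlap ratio are assumptions chosen for illustration, and the sketch uses fixed scales rather than the adaptive selection the paper describes.

```python
from typing import Iterator, List, Sequence, Tuple

# A window is (x0, y0, x1, y1) in pixel coordinates, with x1 and y1 exclusive.
Window = Tuple[int, int, int, int]


def axis_starts(length: int, size: int, stride: int) -> List[int]:
    """Start offsets along one axis, adding a final window flush with the edge."""
    if size >= length:
        # The window is at least as large as the image along this axis.
        return [0]
    starts = list(range(0, length - size + 1, stride))
    if starts[-1] != length - size:
        # Make sure the last pixels of the image are covered.
        starts.append(length - size)
    return starts


def multi_scale_windows(
    width: int,
    height: int,
    sizes: Sequence[int] = (256, 512, 1024),  # hypothetical scales
    overlap: float = 0.25,                    # hypothetical overlap ratio
) -> Iterator[Window]:
    """Yield square windows at each scale, clipped to the image bounds."""
    for size in sizes:
        stride = max(1, int(size * (1.0 - overlap)))
        for y0 in axis_starts(height, size, stride):
            for x0 in axis_starts(width, size, stride):
                yield (x0, y0, min(x0 + size, width), min(y0 + size, height))


if __name__ == "__main__":
    for window in multi_scale_windows(1000, 600, sizes=(400,), overlap=0.5):
        print(window)
```

In a pipeline of this kind, each window would be cropped from the image and passed to a vision-language model together with a text prompt, and the per-window outputs would then be mapped back to full-image coordinates and merged. Those steps depend on the specific model and are left out here.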
