Semantic-Aware Ship Detection with Vision-Language Integration

Jiahao Li; Jiancheng Pan; Yuze Sun; Xiaomeng Huang

arXiv:2508.15930·cs.CV·August 25, 2025

TL;DR

This paper introduces a semantic-aware ship detection framework that combines vision-language models with a multi-scale adaptive sliding window strategy, together with a new dataset, ShipSem-VL, aimed at detecting ships and their fine-grained attributes in complex remote sensing scenes.

Contribution

The paper presents a framework that combines vision-language models with a multi-scale adaptive sliding window strategy, and introduces the ShipSem-VL dataset for fine-grained ship attribute detection.

Findings

01 Enhanced detection accuracy in complex scenes

02 Effective semantic attribute recognition

03 Comprehensive evaluation across multiple tasks

Abstract

Ship detection in remote sensing imagery is a critical task with wide-ranging applications, such as maritime activity monitoring, shipping logistics, and environmental studies. However, existing methods often struggle to capture fine-grained semantic information, limiting their effectiveness in complex scenarios. To address these challenges, we propose a novel detection framework that combines Vision-Language Models (VLMs) with a multi-scale adaptive sliding window strategy. To facilitate Semantic-Aware Ship Detection (SASD), we introduce ShipSem-VL, a specialized Vision-Language dataset designed to capture fine-grained ship attributes. We evaluate our framework through three well-defined tasks, providing a comprehensive analysis of its performance and demonstrating its effectiveness in advancing SASD from multiple perspectives.
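
The abstract does not say how the adaptive windows are chosen, so the following is only a generic sketch of multi-scale sliding-window tiling, not the authors' method. It enumerates overlapping crop boxes at several fixed window sizes over a large image; each box could then be cropped and passed to a detector or VLM. The function names, the window sizes, and the overlap ratio are all illustrative assumptions.

```python
from typing import List, Tuple

# (x0, y0, x1, y1) in pixels; x1 and y1 are exclusive.
Window = Tuple[int, int, int, int]


def _starts(length: int, size: int, stride: int) -> List[int]:
    """Start offsets so that windows of `size` cover the range [0, length)."""
    if size >= length:
        # Window is larger than the image along this axis: one clipped window.
        return [0]
    starts = list(range(0, length - size + 1, stride))
    # Add a final window flush with the border so no pixels are skipped.
    if starts[-1] != length - size:
        starts.append(length - size)
    return starts


def multiscale_windows(
    width: int,
    height: int,
    sizes: Tuple[int, ...] = (256, 512, 1024),  # assumed scales, not from the paper
    overlap: float = 0.25,                      # assumed overlap ratio
) -> List[Window]:
    """Enumerate square crop boxes at each scale, row by row."""
    windows: List[Window] = []
    for size in sizes:
        stride = max(1, int(size * (1.0 - overlap)))
        for y in _starts(height, size, stride):
            for x in _starts(width, size, stride):
                windows.append((x, y, min(x + size, width), min(y + size, height)))
    return windows


if __name__ == "__main__":
    for box in multiscale_windows(1000, 600, sizes=(512,), overlap=0.5):
        print(box)
```

Smaller windows help with small vessels and larger windows preserve context for large ones; an adaptive variant would presumably pick sizes or strides per image, but how that is done is not stated in the abstract.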
