IFShip: Interpretable Fine-grained Ship Classification with Domain Knowledge-Enhanced Vision-Language Models
Mingning Guo, Mengwei Wu, Yuxiang Shen, Haifeng Li, Chao Tao

TL;DR
IFShip introduces an interpretable vision-language model for fine-grained ship classification that leverages domain knowledge and step-by-step reasoning, outperforming existing methods in accuracy and interpretability.
Contribution
The paper presents a novel domain knowledge-enhanced Chain-of-Thought prompt mechanism and a semi-automatic dataset for adapting vision-language models to fine-grained ship classification.
Findings
IFShip outperforms state-of-the-art FGSC algorithms in accuracy and interpretability.
The model provides natural language explanations for its classifications.
IFShip demonstrates superior performance compared to other vision-language models like LLaVA and MiniGPT-4.
Abstract
End-to-end interpretation currently dominates the remote sensing fine-grained ship classification (RS-FGSC) task. However, the inference process remains uninterpretable, leading to criticisms of these models as "black box" systems. To address this issue, we propose a domain knowledge-enhanced Chain-of-Thought (CoT) prompt generation mechanism, which is used to semi-automatically construct a task-specific instruction-following dataset, TITANIC-FGS. By training on TITANIC-FGS, we adapt general-domain vision-language models (VLMs) to the FGSC task, resulting in a model named IFShip. Building upon IFShip, we develop an FGSC visual chatbot that redefines the FGSC problem as a step-by-step reasoning task and conveys the reasoning process in natural language. Experimental results show that IFShip outperforms state-of-the-art FGSC algorithms in both interpretability and classification accuracy.…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsOil and Gas Production Techniques · Maritime Navigation and Safety · Advanced Data Processing Techniques
