Dual-Granularity Semantic Prompting for Language Guidance Infrared Small Target Detection
Zixuan Wang, Haoran Sun, Jiaming Lu, Wenxuan Wang, Zhongling Huang, Dingwen Zhang, Xuelin Qian, Junwei Han

TL;DR
This paper introduces DGSPNet, a novel language prompt-driven framework for infrared small target detection that leverages dual-granularity semantic prompts and attention mechanisms to improve accuracy without manual annotations.
Contribution
The paper proposes a new end-to-end framework integrating dual-granularity semantic prompts and attention mechanisms, enabling effective language-guided infrared small target detection without manual annotations.
Findings
Significantly improves detection accuracy on benchmark datasets.
Achieves state-of-the-art performance in infrared small target detection.
Effectively leverages language prompts for enhanced feature sensitivity.
Abstract
Infrared small target detection remains challenging due to limited feature representation and severe background interference, resulting in sub-optimal performance. While recent CLIP-inspired methods attempt to leverage textual guidance for detection, they are hindered by inaccurate text descriptions and reliance on manual annotations. To overcome these limitations, we propose DGSPNet, an end-to-end language prompt-driven framework. Our approach integrates dual-granularity semantic prompts: coarse-grained textual priors (e.g., 'infrared image', 'small target') and fine-grained personalized semantic descriptions derived through visual-to-textual mapping within the image space. This design not only facilitates learning fine-grained semantic information but can also inherently leverage language prompts during inference without requiring any annotations. By fully leveraging the…
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Infrared Target Detection Methodologies
