DSAA: Dual-Stage Attribute Activation for Fine-grained Open Vocabulary Detection

Donghong Jiang, Endian Lin, Hanqing Liu, Mingjie Liu, Luoping Cui, Zhao Yang, Chuang Zhu

arXiv:2605.18023·cs.CV·May 19, 2026



TL;DR

This paper introduces DSAA, a dual-stage framework that enhances fine-grained attribute detection in open-vocabulary object detection models by strengthening attribute semantics during inference and training.

Contribution

The paper proposes the DSAA framework, which combines novel modules with a new loss to improve attribute recognition in open-vocabulary detection models.

Findings

01. Improved attribute detection accuracy on the FG-OVD benchmark.

02. Enhanced discrimination among instances with different attributes.

03. Effective across various open-vocabulary models.

Abstract

Open-Vocabulary Object Detection (OVD) models break the limitations of closed-set detection, enabling the identification of unseen categories through natural language prompts. However, they exhibit notable limitations in fine-grained detection tasks involving attributes such as color, material, and texture. We attribute this performance bottleneck in OVD models to a core issue: when category signals dominate, OVD models tend to marginalize attribute information during inference. This leads to incorrect binding between attributes and target objects. To address this, we propose the Dual-Stage Attribute Activation (DSAA) framework, which enhances fine-grained detection capabilities by strengthening attribute semantics at two critical stages. In the text embedding stage, we employ an Attribute Prefix Adapter (APA) module to generate attribute prefixes that inject explicit…
