Dual Relation Mining Network for Zero-Shot Learning
Jinwei Han, Yingguo Gao, Zhiwen Lin, Ke Yan, Shouhong Ding, Yuan Gao, Gui-Song Xia

TL;DR
This paper introduces a Dual Relation Mining Network (DRMN) that improves zero-shot learning by enhancing visual-semantic interactions and modeling attribute relationships, achieving state-of-the-art results on standard benchmarks.
Contribution
The paper proposes the DRMN framework, which combines a dual attention block with a semantic interaction transformer to better capture visual-semantic and attribute relationships in ZSL.
Findings
Achieves new state-of-the-art results on the CUB, SUN, and AwA2 benchmarks.
Effectively models semantic attribute relationships with a transformer.
Enhances visual-semantic alignment through dual attention mechanisms.
Abstract
Zero-shot learning (ZSL) aims to recognize novel classes by transferring shared semantic knowledge (e.g., attributes) from seen classes to unseen classes. Recently, attention-based methods, which align visual features and attributes via a spatial attention mechanism, have shown significant progress. However, these methods explore the visual-semantic relationship only in the spatial dimension, which can lead to classification ambiguity when different attributes share similar attention regions, and the semantic relationship between attributes is rarely discussed. To alleviate these problems, we propose a Dual Relation Mining Network (DRMN) to enable more effective visual-semantic interactions and to learn the semantic relationships among attributes for knowledge transfer. Specifically, we introduce a Dual Attention Block (DAB) for visual-semantic relationship mining, which enriches visual…
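To make the baseline setting concrete, the following is a minimal sketch of the kind of attribute-guided spatial attention the abstract attributes to earlier methods. It is not the DRMN architecture or the authors' code. All names (`attribute_scores`, `predict`, the toy region features, the two attribute embeddings, and the classes `class_a` and `class_b`) are hypothetical and chosen only for illustration. Each attribute embedding attends over image regions, the attended feature is scored against the attribute, and the resulting attribute scores are matched against per-class attribute vectors, which is what allows a class with no training images to be predicted.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attribute_scores(regions, attr_embeds):
    """Attend over spatial regions once per attribute and score the pooled feature.

    regions:     list of R region feature vectors, each of length D
    attr_embeds: list of A attribute embeddings, each of length D
    Returns (scores, attention); attention[a] is a distribution over the R regions.
    """
    scores, attention = [], []
    for v in attr_embeds:
        # Spatial attention: how strongly each region matches this attribute.
        weights = softmax([dot(v, f) for f in regions])
        # Attention-weighted average of the region features.
        pooled = [sum(w * f[d] for w, f in zip(weights, regions))
                  for d in range(len(v))]
        scores.append(dot(v, pooled))
        attention.append(weights)
    return scores, attention


def predict(scores, class_attributes):
    """Pick the class whose attribute vector is most compatible with the scores."""
    return max(class_attributes, key=lambda c: dot(scores, class_attributes[c]))


# Toy data (hypothetical): 3 image regions, 2 attributes, feature dimension 2.
regions = [[1.0, 0.0], [0.2, 0.1], [0.1, 0.3]]
attr_embeds = [[2.0, 0.0], [0.0, 2.0]]
# Per-class attribute vectors; class_b could be an unseen class described only
# by its attributes.
class_attributes = {"class_a": [1.0, 0.0], "class_b": [0.0, 1.0]}

scores, attention = attribute_scores(regions, attr_embeds)
prediction = predict(scores, class_attributes)
print(scores, prediction)
```

In this sketch the only interaction between image and attributes is the per-attribute distribution over regions. If two attributes produce nearly identical attention weights, their pooled features are nearly identical too, which is one way to read the classification ambiguity the abstract describes.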
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and ELM
Methods: Attention Is All You Need · Dense Connections · Dropout · Label Smoothing · Residual Connection · Softmax · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Absolute Position Encodings · Linear Layer
