Region Semantically Aligned Network for Zero-Shot Learning
Ziyang Wang, Yunhao Gou, Jingjing Li, Yu Zhang, Yang Yang

TL;DR
This paper introduces RSAN, a novel zero-shot learning model that aligns local image regions with semantic attributes, improving recognition of unseen classes by leveraging region-specific features rather than global features.
Contribution
The paper proposes a region-based semantic alignment approach for ZSL that directly maps local image features to attributes, enhancing knowledge transfer to unseen classes.
Findings
RSAN outperforms state-of-the-art ZSL methods on standard datasets.
Utilizing local features improves recognition accuracy for unseen classes.
Attribute regression regularizes the encoder for more robust features.
Abstract
Zero-shot learning (ZSL) aims to recognize unseen classes based on the knowledge of seen classes. Previous methods focused on learning direct embeddings from global features to the semantic space, in the hope of transferring knowledge from seen classes to unseen classes. However, an unseen class shares local visual features with a set of seen classes, and relying on global visual features makes this knowledge transfer ineffective. To tackle this problem, we propose a Region Semantically Aligned Network (RSAN), which maps local features of unseen classes to their semantic attributes. Instead of using global features, which are obtained by an average pooling layer after an image encoder, we directly utilize the output of the image encoder, which maintains local information of the image. Concretely, we obtain each attribute from a specific region of the output and exploit these attributes for…
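The contrast the abstract draws between average-pooled global features and region-level features can be illustrated with a small sketch. This is not the authors' implementation: the toy feature map, the attribute weight vectors, the function names, and the rule of keeping the highest-scoring region per attribute are all assumptions made for illustration.

```python
# Illustrative sketch only. Shapes, weights, and the max-over-regions rule are
# assumptions, not the paper's actual architecture.

def global_average_pool(feature_map):
    """Collapse R region vectors (each with C channels) into one C-dim vector."""
    num_regions = len(feature_map)
    channels = len(feature_map[0])
    return [sum(region[c] for region in feature_map) / num_regions
            for c in range(channels)]

def score_attributes(feature, attr_weights):
    """Dot product of one feature vector with each attribute's weight vector."""
    return [sum(f * w for f, w in zip(feature, weights))
            for weights in attr_weights]

def region_aligned_attributes(feature_map, attr_weights):
    """For each attribute, return (best score, index of the best region)."""
    per_region = [score_attributes(region, attr_weights) for region in feature_map]
    aligned = []
    for k in range(len(attr_weights)):
        column = [scores[k] for scores in per_region]
        best = max(range(len(column)), key=column.__getitem__)
        aligned.append((column[best], best))
    return aligned

# Toy encoder output: 4 regions (a flattened 2x2 grid), 3 channels each.
feature_map = [
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 3.0],
    [0.0, 0.0, 0.0],
]
# Two hypothetical attribute detectors, one weight vector per attribute.
attr_weights = [
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
]

# Global route: pool first, then score. Local evidence is diluted.
global_feature = global_average_pool(feature_map)
global_scores = score_attributes(global_feature, attr_weights)

# Region route: score every region, then keep the strongest one per attribute.
aligned = region_aligned_attributes(feature_map, attr_weights)
```

In this toy case the pooled route averages each attribute's evidence over all four regions, while the region route keeps the full response and records which region produced it. How RSAN actually selects regions and trains the mapping is not specified in the truncated abstract above.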
Taxonomy
Methods: Average Pooling
