TL;DR
This paper introduces VSGMN, a graph-based model that leverages class relationships to improve zero-shot learning by aligning visual and semantic embeddings through a two-stage graph matching process.
Contribution
It proposes a novel two-stage graph matching network that incorporates class relationships for more robust visual-semantic embedding in zero-shot learning.
Findings
VSGMN outperforms existing methods on benchmark datasets.
The two-stage graph matching improves alignment accuracy.
Incorporating class relationships enhances zero-shot recognition performance.
Abstract
Zero-shot learning (ZSL) aims to leverage additional semantic information to recognize unseen classes. To transfer knowledge from seen to unseen classes, most ZSL methods learn a shared embedding space by simply aligning visual embeddings with semantic prototypes. However, methods trained under this paradigm often struggle to learn a robust embedding space because they align the two modalities for each class in isolation, which ignores the crucial class relationships during the alignment process. To address these challenges, this paper proposes a Visual-Semantic Graph Matching Net, termed VSGMN, which leverages semantic relationships among classes to aid visual-semantic embedding. VSGMN employs a Graph Build Network (GBN) and a Graph Matching Network (GMN) to achieve two-stage visual-semantic alignment. Specifically, GBN first utilizes an embedding-based…
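To make the baseline paradigm described in the abstract concrete, here is a minimal sketch of per-class visual-semantic alignment, followed by one simple way class relationships could be represented as a graph. This is not the VSGMN implementation: the toy attribute vectors, the function names (`cosine`, `predict`, `class_graph`), and the use of cosine similarity as the edge weight are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def predict(visual_embedding, prototypes):
    """Baseline alignment: score the image against each class prototype
    independently and return the best-matching class name."""
    scores = {name: cosine(visual_embedding, proto)
              for name, proto in prototypes.items()}
    return max(scores, key=scores.get)


def class_graph(prototypes):
    """Hypothetical class-relationship graph: one weighted edge per ordered
    pair of distinct classes, weighted by prototype similarity."""
    names = sorted(prototypes)
    return {(a, b): cosine(prototypes[a], prototypes[b])
            for a in names for b in names if a != b}


# Toy semantic prototypes (assumed attributes: striped, four-legged, aquatic).
prototypes = {
    "zebra": [1.0, 1.0, 0.0],
    "horse": [0.0, 1.0, 0.0],
    "whale": [0.0, 0.0, 1.0],
}

# Assumed visual embedding, already projected into the semantic space.
image_embedding = [0.9, 0.8, 0.1]

predicted = predict(image_embedding, prototypes)
graph = class_graph(prototypes)
```

In this sketch `predict` never looks at how classes relate to one another, which is the isolation the abstract criticizes. The edge weights in `graph` are the kind of inter-class structure that a graph-based method could additionally try to match across the visual and semantic spaces.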
Taxonomy
Methods: ALIGN
