VGSE: Visually-Grounded Semantic Embeddings for Zero-Shot Learning

Wenjia Xu; Yongqin Xian; Jiuniu Wang; Bernt Schiele; Zeynep Akata

arXiv:2203.10444 · cs.CV · May 29, 2023 · 6 cites

TL;DR

This paper introduces a method for generating visually-grounded semantic embeddings for zero-shot learning without human annotation. The embeddings align semantic similarity more closely with visual similarity and improve zero-shot performance.

Contribution

The authors propose a novel unsupervised approach to create semantic embeddings that incorporate visual properties, outperforming traditional word embeddings in zero-shot learning tasks.

Findings

01 Our embeddings better reflect visual similarities of classes.

02 The method improves zero-shot learning accuracy across three benchmarks.

03 Visual clustering enhances semantic representations for unseen classes.

Abstract

Human-annotated attributes serve as powerful semantic embeddings in zero-shot learning. However, their annotation process is labor-intensive and needs expert supervision. Current unsupervised semantic embeddings, i.e., word embeddings, enable knowledge transfer between classes. However, word embeddings do not always reflect visual similarities and result in inferior zero-shot performance. We propose to discover semantic embeddings containing discriminative visual properties for zero-shot learning, without requiring any human annotation. Our model visually divides a set of images from seen classes into clusters of local image regions according to their visual similarity, and further imposes their class discrimination and semantic relatedness. To associate these clusters with previously unseen classes, we use external knowledge, e.g., word embeddings, and propose a novel class relation…
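The pipeline the abstract outlines can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes patch features are already extracted, uses plain k-means for the clustering step, and omits the class-discrimination and semantic-relatedness objectives. The abstract is truncated before it describes the class relation step, so a softmax-weighted average over word-embedding similarity stands in for it here. All function names and the `temperature` parameter are hypothetical.

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means over patch features; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Squared Euclidean distance from every patch to every center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = x[labels == j].mean(0)
    return centers, labels

def seen_class_embeddings(labels, patch_class, n_classes, k):
    """Each seen class becomes a normalized histogram of its patches over clusters.

    Assumes every class owns at least one patch.
    """
    emb = np.zeros((n_classes, k))
    np.add.at(emb, (patch_class, labels), 1.0)
    return emb / emb.sum(1, keepdims=True)

def unseen_class_embeddings(seen_emb, w_seen, w_unseen, temperature=0.1):
    """Stand-in for the class relation step (an assumption, not from the source):
    softmax over cosine similarity of word vectors, then a weighted average
    of the seen-class embeddings."""
    a = w_unseen / np.linalg.norm(w_unseen, axis=1, keepdims=True)
    b = w_seen / np.linalg.norm(w_seen, axis=1, keepdims=True)
    logits = a @ b.T / temperature
    logits -= logits.max(1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(1, keepdims=True)
    return weights @ seen_emb

# Synthetic stand-ins: 60 patch features from 3 seen classes, 2 unseen classes.
rng = np.random.default_rng(0)
patches = rng.normal(size=(60, 8))
patch_class = np.arange(60) % 3
_, labels = kmeans(patches, k=4)
seen = seen_class_embeddings(labels, patch_class, n_classes=3, k=4)
unseen = unseen_class_embeddings(seen, rng.normal(size=(3, 5)), rng.normal(size=(2, 5)))
print(seen.shape, unseen.shape)
```

Each row of `seen` and `unseen` is a distribution over visual clusters, so seen and unseen classes live in the same embedding space and can be compared directly by a downstream zero-shot model.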

Code & Models

Repositories

wenjiaxu/vgse (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Viral Infections and Outbreaks Research