Using Sentences as Semantic Representations in Large Scale Zero-Shot Learning

Yannick Le Cacheux; Hervé Le Borgne; Michel Crucianu

arXiv:2010.02959 · cs.CV · October 8, 2020

TL;DR

This paper investigates using short natural language sentences as semantic representations in zero-shot learning, demonstrating that combining sentences with word embeddings significantly improves recognition of unseen classes.

Contribution

The paper explores several ways to use short sentences as class descriptions in zero-shot learning (ZSL) and shows that they are effective when combined with word embeddings.

Findings

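01 Combining sentences with the usual word embeddings significantly outperforms the current state of the art.

02 Simple methods that use sentences alone do not achieve very good results.

03 Short sentences are proposed as a trade-off between attributes, which do not scale well to large datasets, and word embeddings, which lead to poorer performance.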

Abstract

Zero-shot learning (ZSL) aims to recognize instances of unseen classes, for which no visual instance is available during training, by learning multimodal relations between samples from seen classes and the corresponding class semantic representations. These class representations usually consist of either attributes, which do not scale well to large datasets, or word embeddings, which lead to poorer performance. A good trade-off could be to employ short sentences in natural language as class descriptions. We explore different solutions to use such short descriptions in a ZSL setting and show that while simple methods cannot achieve very good results with sentences alone, a combination of the usual word embeddings and sentences can significantly outperform the current state of the art.
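
The idea of combining a class-name word embedding with a short sentence description can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the 3-dimensional word vectors are toy values, the sentence embedding is a plain average of word vectors, the two representations are mixed with a hypothetical `weight` parameter, and the visual feature is assumed to have already been projected into the semantic space by some model trained on seen classes.

```python
import math

# Toy 3-d word vectors: hypothetical values standing in for a pretrained embedding table.
WORD_VECTORS = {
    "zebra": [0.9, 0.1, 0.0],
    "striped": [0.8, 0.0, 0.2],
    "horse": [0.7, 0.3, 0.0],
    "trout": [0.0, 0.9, 0.1],
    "freshwater": [0.1, 0.8, 0.3],
    "fish": [0.0, 1.0, 0.0],
}


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count for i in range(len(vectors[0]))]


def l2_normalize(vector):
    """Scale a vector to unit Euclidean norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def sentence_embedding(sentence):
    """Average the vectors of in-vocabulary words (a simple baseline, assumed here)."""
    words = [w for w in sentence.lower().split() if w in WORD_VECTORS]
    if not words:
        raise ValueError("no known words in sentence: %r" % sentence)
    return mean_vector([WORD_VECTORS[w] for w in words])


def class_prototype(name, description, weight=0.5):
    """Mix the class-name embedding and the sentence embedding into one prototype."""
    name_vec = l2_normalize(WORD_VECTORS[name])
    sent_vec = l2_normalize(sentence_embedding(description))
    mixed = [weight * a + (1.0 - weight) * b for a, b in zip(name_vec, sent_vec)]
    return l2_normalize(mixed)


def predict(projected_feature, prototypes):
    """Return the class whose prototype has the highest cosine similarity."""
    query = l2_normalize(projected_feature)
    return max(
        prototypes,
        key=lambda c: sum(a * b for a, b in zip(query, prototypes[c])),
    )


# Unseen classes, each with a short description (hypothetical examples).
UNSEEN_CLASSES = {"zebra": "striped horse", "trout": "freshwater fish"}
prototypes = {n: class_prototype(n, d) for n, d in UNSEEN_CLASSES.items()}

# A visual feature assumed to be already projected into the semantic space.
prediction = predict([0.85, 0.15, 0.05], prototypes)
print(prediction)
```

Setting `weight` to 1.0 falls back to word embeddings alone and 0.0 to sentences alone, which makes it easy to compare the three settings the abstract contrasts on this toy data.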

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Topic Modeling