Using Sentences as Semantic Representations in Large Scale Zero-Shot Learning
Yannick Le Cacheux, Hervé Le Borgne, Michel Crucianu

TL;DR
This paper investigates using short natural language sentences as semantic representations in zero-shot learning, demonstrating that combining sentences with word embeddings significantly improves recognition of unseen classes.
Contribution
It introduces methods to incorporate short sentences as class descriptions in ZSL and shows their effectiveness when combined with word embeddings.
Findings
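Combining sentences with the usual word embeddings significantly outperforms the state of the art at the time of publication.
Simple methods that use sentences alone do not achieve very good results.
Short sentences are proposed as a trade-off between attributes, which do not scale well to large datasets, and word embeddings, which lead to poorer performance.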
Abstract
Zero-shot learning aims to recognize instances of unseen classes, for which no visual instance is available during training, by learning multimodal relations between samples from seen classes and corresponding class semantic representations. These class representations usually consist of either attributes, which do not scale well to large datasets, or word embeddings, which lead to poorer performance. A good trade-off could be to employ short sentences in natural language as class descriptions. We explore different solutions to use such short descriptions in a ZSL setting and show that while simple methods cannot achieve very good results with sentences alone, a combination of usual word embeddings and sentences can significantly outperform current state-of-the-art.
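The setting described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: it assumes a generic linear ZSL baseline (ridge regression from visual features to the semantic space), an assumed fusion scheme (concatenating L2-normalized word and sentence embeddings), and random vectors standing in for real embeddings and image features. All function names and parameters below are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit Euclidean norm."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def combine_prototypes(word_emb, sent_emb):
    """Assumed fusion: concatenate normalized word and sentence embeddings per class."""
    return np.concatenate([l2_normalize(word_emb), l2_normalize(sent_emb)], axis=1)

def fit_ridge(X, S, lam=1.0):
    """Closed-form ridge regression from visual features X (n, d) to semantic targets S (n, k)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ S)

def predict(X, W, prototypes):
    """Project features into the semantic space and pick the closest prototype (cosine similarity)."""
    scores = l2_normalize(X @ W) @ l2_normalize(prototypes).T
    return np.argmax(scores, axis=1)

def run_demo(seed=0, n_seen=40, n_unseen=5, per_class=20,
             dim_word=8, dim_sent=8, dim_vis=32):
    rng = np.random.default_rng(seed)
    n_cls = n_seen + n_unseen
    # Random stand-ins for real word embeddings and sentence embeddings.
    protos = combine_prototypes(rng.normal(size=(n_cls, dim_word)),
                                rng.normal(size=(n_cls, dim_sent)))
    # Synthetic visual features: a hidden linear map of the class prototype plus noise.
    A = rng.normal(size=(protos.shape[1], dim_vis))
    labels = np.repeat(np.arange(n_cls), per_class)
    X = protos[labels] @ A + 0.05 * rng.normal(size=(labels.size, dim_vis))
    seen = labels < n_seen
    # Train on samples from seen classes only.
    W = fit_ridge(X[seen], protos[labels[seen]])
    # Evaluate on unseen classes, choosing among unseen prototypes only.
    preds = predict(X[~seen], W, protos[n_seen:])
    return preds, labels[~seen] - n_seen

preds, targets = run_demo()
print("unseen-class accuracy:", float(np.mean(preds == targets)))
```

The key point of the sketch is that no sample from an unseen class is used to fit `W`; unseen classes are recognized only through their semantic prototypes. Swapping `combine_prototypes` for a word-only or sentence-only prototype would be one way to compare representations under the same baseline.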
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Topic Modeling
