Webly Supervised Semantic Embeddings for Large Scale Zero-Shot Learning
Yannick Le Cacheux, Adrian Popescu, Hervé Le Borgne

TL;DR
This paper introduces a method for improving zero-shot learning by using noisy textual metadata from images to create more accurate semantic class prototypes, leading to state-of-the-art results on large datasets.
Contribution
It proposes a novel approach to design semantic prototypes from noisy text data, enhancing large-scale zero-shot learning performance.
Findings
Significant performance improvement over baselines.
Robust semantic prototypes via source-based voting.
State-of-the-art results on ImageNet.
Abstract
Zero-shot learning (ZSL) makes object recognition in images possible in the absence of visual training data for a part of the classes from a dataset. When the number of classes is large, classes are usually represented by semantic class prototypes learned automatically from unannotated text collections. This typically leads to much lower performance than with manually designed semantic prototypes such as attributes. While most ZSL works focus on the visual aspect and reuse standard semantic prototypes learned from generic text collections, we focus on the problem of semantic class prototype design for large scale ZSL. More specifically, we investigate the use of noisy textual metadata associated with photos as text collections, as we hypothesize they are likely to provide more plausible semantic embeddings for visual classes if exploited appropriately. We thus make use of a source-based…
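To illustrate the general setting the abstract describes, here is a minimal sketch of how a semantic class prototype can be used at test time: an image, already mapped into the semantic space, is assigned to the unseen class with the most similar prototype. This is a generic nearest-prototype scheme, not the authors' method. The function names, the toy 3-dimensional vectors, the class names, and the choice of cosine similarity are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def predict_unseen(projected_image, prototypes):
    """Return the class whose semantic prototype is most similar to the image.

    `projected_image` is assumed to be an image feature already mapped into
    the semantic space by some learned visual-to-semantic model (not shown).
    """
    return max(prototypes, key=lambda name: cosine(projected_image, prototypes[name]))


# Hypothetical prototypes for two unseen classes (made-up values).
prototypes = {
    "zebra": [0.9, 0.1, 0.8],
    "whale": [0.1, 0.9, 0.2],
}

# Hypothetical projected test image (made-up values).
projected_image = [0.8, 0.2, 0.7]

prediction = predict_unseen(projected_image, prototypes)
print(prediction)
```

In this scheme the quality of the prototypes directly bounds what the classifier can distinguish, which is why the choice of text collection used to learn them matters.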
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · COVID-19 diagnosis using AI
