What Remains of Visual Semantic Embeddings

Yue Jiao; Jonathon Hare; Adam Prügel-Bennett

arXiv:2107.11991·cs.CV·July 27, 2021

TL;DR

This paper evaluates how well current visual semantic embedding models encode semantic information in zero-shot learning, introducing a fair benchmark and revealing their limitations in capturing semantic relationships.

Contribution

It introduces a new ZSL benchmark using split tiered-ImageNet and a unified contrastive learning framework to fairly evaluate semantic encoding capabilities.
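As a rough illustration of what contrastive pre-training optimizes, the sketch below computes an InfoNCE-style loss over a batch of paired views. It is a generic formulation, not necessarily the objective used in the paper; the function names, the temperature value, and the assumption that every feature vector is non-zero are all illustrative choices.

```python
import math


def normalize(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def info_nce_loss(view_a, view_b, temperature=0.1):
    """Mean InfoNCE loss where view_a[i] and view_b[i] form the positive pair.

    All other rows of view_b act as negatives for view_a[i].
    The temperature of 0.1 is an assumed value for illustration only.
    """
    a = [normalize(v) for v in view_a]
    b = [normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity of the anchor to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(anchor, other)) / temperature for other in b]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching index as the target.
        total += log_denominator - logits[i]
    return total / len(a)


if __name__ == "__main__":
    aligned = [[1.0, 0.0], [0.0, 1.0]]
    shuffled = [[0.0, 1.0], [1.0, 0.0]]
    print(info_nce_loss(aligned, aligned))
    print(info_nce_loss(aligned, shuffled))
```

The loss is small when each feature is closest to its own pair and large when the pairs are mismatched. Because no class labels or word vectors enter the computation, pre-training of this kind uses no semantic information.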

Findings

01

Current ZSL models struggle with semantic relationships.

02

The new benchmark avoids structural flaws of standard ImageNet.

03

The paper encourages exploring contextual language representations in ZSL.

Abstract

Zero-shot learning (ZSL) has seen a surge of interest over the past decade because of its close links to the mechanism by which young children recognize novel objects. Although different paradigms of visual semantic embedding models are designed to align visual features with distributed word representations, it is unclear to what extent current ZSL models encode semantic information from those representations. In this work, we introduce the split of tiered-ImageNet to the ZSL task in order to avoid the structural flaws of the standard ImageNet benchmark. We build a unified framework for ZSL with contrastive learning as pre-training, which guarantees no semantic information leakage and encourages linearly separable visual features. Our work enables a fair evaluation of visual semantic embedding models in a ZSL setting in which semantic inference is decisive. With this framework, we show…
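To make the idea of aligning visual features with word representations concrete, here is a minimal sketch of zero-shot inference in a visual semantic embedding model: a visual feature is mapped into the word-vector space and assigned to the unseen class whose word vector is most similar. The linear projection, the cosine scoring, the class names, and the toy vectors are assumptions for illustration, not the specific models evaluated in the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def project(weights, feature):
    """Apply a linear map (list of rows) to a visual feature vector."""
    return [sum(w * x for w, x in zip(row, feature)) for row in weights]


def predict_unseen(feature, weights, class_embeddings):
    """Return the unseen class whose word vector is closest to the projected feature.

    `weights` stands in for a projection learned on seen classes only;
    `class_embeddings` maps unseen class names to their word vectors.
    """
    embedded = project(weights, feature)
    return max(class_embeddings, key=lambda name: cosine(embedded, class_embeddings[name]))


if __name__ == "__main__":
    # Hypothetical toy values: an identity projection and two unseen classes.
    weights = [[1.0, 0.0], [0.0, 1.0]]
    class_embeddings = {"zebra": [1.0, 0.1], "okapi": [0.1, 1.0]}
    print(predict_unseen([0.9, 0.2], weights, class_embeddings))
```

In this setup the only information about an unseen class comes from its word vector, so accuracy on unseen classes depends on how much semantic structure the embedding model actually exploits.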

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Topic Modeling

Methods: Contrastive Learning