Fast Zero-Shot Image Tagging

Yang Zhang; Boqing Gong; Mubarak Shah

arXiv:1605.09759·cs.CV·October 19, 2017


TL;DR

This paper introduces a fast, neural network-based method for zero-shot image tagging. For each image it estimates a principal direction in word vector space along which the word vectors of relevant tags rank ahead of those of irrelevant tags, and it is reported to perform well on tags unseen during training.

Contribution

The paper proposes estimating the principal direction for an image with linear mappings and nonlinear deep neural networks, so that tags, including unseen ones, can be ranked quickly along that direction.

Findings

01 Runs in constant time per test image with respect to the training set size

02 Achieves superior performance on the NUS-WIDE dataset

03 Outperforms baselines on unseen tags

Abstract

The well-known word analogy experiments show that the recent word vectors capture fine-grained linguistic regularities in words by linear vector offsets, but it is unclear how well the simple vector offsets can encode visual regularities over words. We study a particular image-word relevance relation in this paper. Our results show that the word vectors of relevant tags for a given image rank ahead of the irrelevant tags, along a principal direction in the word vector space. Inspired by this observation, we propose to solve image tagging by estimating the principal direction for an image. Particularly, we exploit linear mappings and nonlinear deep neural networks to approximate the principal direction from an input image. We arrive at a quite versatile tagging model. It runs fast given a test image, in constant time w.r.t. the training set size. It not only gives superior performance…
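The ranking idea in the abstract can be illustrated with a minimal sketch. It assumes the linear variant: a matrix maps an image feature to a direction in word vector space, and tags are ordered by the dot product of their word vectors with that direction. The function name `rank_tags`, the matrix `mapping`, and all numeric values are hypothetical and chosen for illustration; they are not the authors' code, features, or embeddings, and how the mapping is trained is not covered here.

```python
import numpy as np


def rank_tags(image_feature, mapping, tag_vectors, tag_names):
    """Rank tags by projecting their word vectors onto a predicted direction."""
    # Predicted principal direction in word-vector space (linear model, assumed).
    direction = mapping @ image_feature
    # One dot product per candidate tag; the cost depends on the tag
    # vocabulary and vector dimension, not on the training set size.
    scores = tag_vectors @ direction
    order = np.argsort(-scores)  # highest score first
    return [(tag_names[i], float(scores[i])) for i in order]


# Toy 2-D "word vectors" (made-up values, not real embeddings).
tag_names = ["beach", "ocean", "car"]
tag_vectors = np.array([[1.0, 0.2],
                        [0.8, 0.1],
                        [-0.5, 0.9]])

# Hypothetical 2x3 linear map from a 3-D image feature to word-vector space.
mapping = np.array([[1.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0]])
image_feature = np.array([2.0, 0.5, -1.0])

ranking = rank_tags(image_feature, mapping, tag_vectors, tag_names)
for tag, score in ranking:
    print(f"{tag}: {score:.2f}")
```

In this sketch, handling a tag that was unseen in training only requires appending its word vector to `tag_vectors`; the mapping itself is left unchanged, which is the sense in which the scheme is zero-shot.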
