TL;DR
This paper introduces a hierarchy-based method for embedding images into a semantic space using class hierarchies like WordNet, significantly improving the semantic relevance of image retrieval results.
Contribution
It proposes a deterministic algorithm to compute class centroids from hierarchical class knowledge, enhancing semantic image embeddings for retrieval and other tasks.
Findings
Improved semantic consistency in image retrieval results.
Effective use of class hierarchies like WordNet for embedding.
Significant performance gains on CIFAR-100, NABirds, and ImageNet.
Abstract
Deep neural networks trained for classification have been found to learn powerful image representations, which are also often used for other tasks such as comparing images w.r.t. their visual similarity. However, visual similarity does not imply semantic similarity. In order to learn semantically discriminative features, we propose to map images onto class embeddings whose pair-wise dot products correspond to a measure of semantic similarity between classes. Such an embedding does not only improve image retrieval results, but could also facilitate integrating semantics for other tasks, e.g., novelty detection or few-shot learning. We introduce a deterministic algorithm for computing the class centroids directly based on prior world-knowledge encoded in a hierarchy of classes such as WordNet. Experiments on CIFAR-100, NABirds, and ImageNet show that our learned semantic image embeddings…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
