Learning to Learn from Web Data through Deep Semantic Embeddings

Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas

arXiv:1808.06368·cs.CV·August 21, 2018

TL;DR

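This paper introduces a method for learning multimodal image and text embeddings from Web and Social Media data without supervision, enabling semantic image retrieval. The learned embeddings are competitive with supervised methods on text-based image retrieval and outperform the state of the art on MIRFlickr when trained on the target data.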

Contribution

The paper presents an approach to learning from web data for semantic image retrieval, an analysis of five different text embeddings on three benchmarks, and a new dataset.

Findings

01

Embeddings learned from Web and Social Media data are competitive with supervised methods in text-based image retrieval.

02

The approach clearly outperforms the state of the art on the MIRFlickr dataset when trained on the target data.

03

Semantic multimodal retrieval extends beyond classical instance-level retrieval.

Abstract

In this paper we propose to learn a multimodal image and text embedding from Web and Social Media data, aiming to leverage the semantic knowledge learnt in the text domain and transfer it to a visual model for semantic image retrieval. We demonstrate that the pipeline can learn from images with associated text without supervision, and we perform a thorough analysis of five different text embeddings on three different benchmarks. We show that the embeddings learnt with Web and Social Media data are competitive with supervised methods in the text-based image retrieval task, and we clearly outperform the state of the art on the MIRFlickr dataset when training on the target data. Further, we demonstrate how semantic multimodal image retrieval can be performed using the learnt embeddings, going beyond classical instance-level retrieval problems. Finally, we present a new dataset,…
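
As a rough illustration of the text-based image retrieval task the abstract describes, the sketch below ranks images by how close their embeddings are to a query's text embedding in a shared space. It is a minimal sketch under stated assumptions, not the authors' pipeline: the 3-dimensional vectors, the file names, and the `cosine` and `retrieve` helpers are hypothetical, and cosine similarity is assumed here because the abstract does not specify the ranking measure.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(query_vec: Sequence[float],
             image_vecs: Dict[str, Sequence[float]],
             k: int = 2) -> List[str]:
    """Return the names of the k images whose embeddings are most similar to the query."""
    ranked = sorted(
        image_vecs,
        key=lambda name: cosine(query_vec, image_vecs[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy data: in a real system these vectors would come from a
# visual model trained to map images into the text embedding space.
image_vecs = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "city.jpg": [0.1, 0.9, 0.1],
    "sunset_beach.jpg": [0.7, 0.0, 0.6],
}

# Hypothetical stand-in for the text embedding of a query such as "beach".
query = [1.0, 0.0, 0.2]

top = retrieve(query, image_vecs, k=2)
print(top)
```

Because both modalities live in the same space, the query can be any text that the text embedding can represent, which is what allows retrieval by semantic concept rather than by matching a specific image instance.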

Code & Models

Repositories

gombru/LearnFromWebData (Official)
