Towards Universal Image Embeddings: A Large-Scale Dataset and Challenge for Generic Image Representations

Nikolaos-Antonios Ypsilantis; Kaifeng Chen; Bingyi Cao; Mário Lipovský; Pelin Dogan-Schönberger; Grzegorz Makosa; Boris Bluntschli; Mojtaba Seyedhosseini; Ondřej Chum; André Araujo

arXiv:2309.01858·cs.CV·September 6, 2023

Open Access

TL;DR

This paper introduces a large-scale benchmark dataset and challenge for developing universal image embeddings capable of performing well across multiple domains, addressing the limitations of domain-specific models.

Contribution

It constructs a comprehensive dataset and evaluation protocol for universal image embeddings and provides extensive experimental analysis and a global research competition to advance this field.

Findings

01

Existing approaches underperform compared to domain-specific models.

02

Simple extensions of current methods do not significantly improve universal embedding performance.

03

The research competition attracted over 1,000 teams, fostering new ideas.

Abstract

Fine-grained and instance-level recognition methods are commonly trained and evaluated on specific domains, in a model-per-domain scenario. Such an approach, however, is impractical in real large-scale applications. In this work, we address the problem of universal image embedding, where a single universal model is trained and used in multiple domains. First, we leverage existing domain-specific datasets to carefully construct a new large-scale public benchmark for the evaluation of universal image embeddings, with 241k query images, 1.4M index images and 2.8M training images across 8 different domains and 349k classes. We define suitable metrics, training and evaluation protocols to foster future research in this area. Second, we provide a comprehensive experimental evaluation on the new dataset, demonstrating that existing approaches and simplistic extensions lead to worse performance…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · COVID-19 diagnosis using AI