INQUIRE: A Natural World Text-to-Image Retrieval Benchmark

Edward Vendrow; Omiros Pantazis; Alexander Shepard; Gabriel Brostow; Kate E. Jones; Oisin Mac Aodha; Sara Beery; Grant Van Horn

arXiv:2411.02537 · cs.CV · November 12, 2024

TL;DR

INQUIRE is a challenging natural-world text-to-image retrieval benchmark that pairs a five-million-image dataset with 250 expert-level queries, aiming to advance multimodal models for ecological research.

Contribution

The paper presents INQUIRE, a new benchmark built on the iNat24 dataset and expert-level queries with comprehensively labeled matches, for evaluating and improving multimodal models on ecological and biodiversity image retrieval.

Findings

01. Current models struggle with the benchmark, scoring below 50% mAP@50.

02. Reranking the top retrievals with more powerful multimodal models improves retrieval performance.

03. The benchmark highlights the need for more nuanced multimodal understanding.
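
The first finding is stated in terms of mAP@50. As a rough illustration of what that number measures, the sketch below computes average precision over the top k results of one query and averages it across queries. The function names and the normalization by min(k, number of relevant images) are assumptions chosen for illustration; the benchmark's official evaluation code may normalize differently.

```python
from typing import List, Sequence, Tuple


def average_precision_at_k(relevance: Sequence[int], k: int, num_relevant: int) -> float:
    """AP@k for one query.

    relevance[i] is 1 if the i-th ranked image is relevant, else 0.
    num_relevant is the total number of relevant images for the query.
    Normalizing by min(k, num_relevant) is an assumption of this sketch.
    """
    if num_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / min(k, num_relevant)


def mean_average_precision_at_k(per_query: List[Tuple[Sequence[int], int]], k: int) -> float:
    """Mean of AP@k over (relevance list, num_relevant) pairs, one per query."""
    scores = [average_precision_at_k(rel, k, n) for rel, n in per_query]
    return sum(scores) / len(scores)


# Toy example: two queries with hand-made relevance lists.
toy = [([1, 0, 1, 0], 2), ([0, 1], 1)]
print(mean_average_precision_at_k(toy, k=50))
```

With this definition, a score below 50% means that, averaged over queries, relevant images are far from consistently occupying the top of the first 50 results.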

Abstract

We introduce INQUIRE, a text-to-image retrieval benchmark designed to challenge multimodal vision-language models on expert-level queries. INQUIRE includes iNaturalist 2024 (iNat24), a new dataset of five million natural world images, along with 250 expert-level retrieval queries. These queries are paired with all relevant images comprehensively labeled within iNat24, comprising 33,000 total matches. Queries span categories such as species identification, context, behavior, and appearance, emphasizing tasks that require nuanced image understanding and domain expertise. Our benchmark evaluates two core retrieval tasks: (1) INQUIRE-Fullrank, a full dataset ranking task, and (2) INQUIRE-Rerank, a reranking task for refining top-100 retrievals. Detailed evaluation of a range of recent multimodal models demonstrates that INQUIRE poses a significant challenge, with the best models failing to…
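
The two tasks named in the abstract correspond to a common two-stage retrieval pattern: rank every image against the query, then reorder only the top candidates with a more expensive scorer. The sketch below shows that pattern on toy data. The embedding vectors, the `score_fn` callable, and all function names are hypothetical stand-ins, not the benchmark's own models or API.

```python
import math
from typing import Callable, Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def full_rank(query_vec: Sequence[float], image_vecs: Dict[str, Sequence[float]]) -> List[str]:
    """Stage 1: order every image id by similarity to the query embedding."""
    return sorted(image_vecs, key=lambda i: cosine(query_vec, image_vecs[i]), reverse=True)


def rerank(candidates: List[str], score_fn: Callable[[str], float], top_n: int = 100) -> List[str]:
    """Stage 2: reorder only the first top_n candidates with a costlier scorer."""
    head = sorted(candidates[:top_n], key=score_fn, reverse=True)
    return head + candidates[top_n:]


# Toy data: three images with 2-d embeddings and a made-up second-stage score.
images = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
first_pass = full_rank([1.0, 0.0], images)
second_pass = rerank(first_pass, {"a": 0.1, "b": 0.9, "c": 0.5}.get, top_n=2)
print(first_pass, second_pass)
```

Limiting the second stage to a fixed candidate set keeps the expensive scorer's cost independent of the dataset size, which matters when the first stage searches millions of images.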

Code & Models

Repositories

inquire-benchmark/INQUIRE
PyTorch · Official

Videos

INQUIRE: A Natural World Text-to-Image Retrieval Benchmark · SlidesLive

Taxonomy

Topics: Image Retrieval and Classification Techniques