EUFCC-CIR: a Composed Image Retrieval Dataset for GLAM Collections

Francesc Net; Lluis Gomez

arXiv:2410.01536 · cs.CV · October 4, 2024

TL;DR

This paper introduces EUFCC-CIR, a large dataset for Composed Image Retrieval in cultural heritage collections, enabling better AI-driven exploration of GLAM archives.

Contribution

It provides a novel, extensive CIR dataset tailored for Digital Humanities, filling a gap in resources for cultural heritage image retrieval.

Findings

1. EUFCC-CIR contains over 180K annotated triplets.
2. The dataset demonstrates unique qualities compared to existing CIR datasets.
3. Zero-shot CIR baselines show promising performance on EUFCC-CIR.

Abstract

The intersection of Artificial Intelligence and Digital Humanities enables researchers to explore cultural heritage collections with greater depth and scale. In this paper, we present EUFCC-CIR, a dataset designed for Composed Image Retrieval (CIR) within Galleries, Libraries, Archives, and Museums (GLAM) collections. Our dataset is built on top of the EUFCC-340K image labeling dataset and contains over 180K annotated CIR triplets. Each triplet is composed of a multi-modal query (an input image plus a short text describing the desired attribute manipulations) and a set of relevant target images. The EUFCC-CIR dataset fills an existing gap in CIR-specific resources for Digital Humanities. We demonstrate the value of the EUFCC-CIR dataset by highlighting its unique qualities in comparison to other existing CIR datasets and evaluating the performance of several zero-shot CIR baselines.
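The triplet structure described in the abstract (a reference image, a short modification text, and a set of relevant target images) can be sketched as a small data structure, together with one common way of scoring a retrieval ranking against it. This is a minimal illustration only: the class name `CIRTriplet`, its field names, the example identifiers, and the Recall@K definition used here (a hit if any relevant target appears in the top K) are assumptions made for this sketch, not the dataset's actual schema or the paper's evaluation protocol.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence


@dataclass(frozen=True)
class CIRTriplet:
    """Hypothetical container for one composed-image-retrieval triplet."""
    query_image_id: str                 # reference image of the multi-modal query
    modification_text: str              # short text describing the attribute change
    target_image_ids: FrozenSet[str]    # set of relevant target images


def recall_at_k(triplet: CIRTriplet, ranked_ids: Sequence[str], k: int) -> float:
    """Return 1.0 if any relevant target is among the top-k ranked ids, else 0.0.

    Assumed definition for illustration; the paper may score multiple
    targets differently.
    """
    top_k = ranked_ids[:k]
    return float(any(image_id in triplet.target_image_ids for image_id in top_k))


def mean_recall_at_k(
    triplets: Sequence[CIRTriplet],
    rankings: Sequence[Sequence[str]],
    k: int,
) -> float:
    """Average recall_at_k over paired triplets and rankings."""
    if len(triplets) != len(rankings):
        raise ValueError("triplets and rankings must have the same length")
    if not triplets:
        return 0.0
    scores = [recall_at_k(t, r, k) for t, r in zip(triplets, rankings)]
    return sum(scores) / len(scores)


# Made-up identifiers, purely for demonstration.
example = CIRTriplet(
    query_image_id="img_001",
    modification_text="change material to bronze",
    target_image_ids=frozenset({"img_010", "img_020"}),
)
ranking = ["img_005", "img_020", "img_007"]
print(recall_at_k(example, ranking, k=2))
```

Storing the targets as a set reflects the abstract's point that a query may have several relevant images rather than a single ground-truth target.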

Code & Models

Repositories

cesc47/eufcc-cir (official)

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Colorectal Cancer Screening and Detection

Methods: Sparse Evolutionary Training