Maybe you are looking for CroQS: Cross-modal Query Suggestion for Text-to-Image Retrieval

Giacomo Pacini, Fabio Carrara, Nicola Messina, Nicola Tonellotto, Giuseppe Amato, Fabrizio Falchi

TL;DR

This paper introduces CroQS, a new benchmark and task for query suggestion in cross-modal retrieval, focusing on minimal textual modifications to improve search results, and evaluates various methods including LLMs and captioning models.

Contribution

The paper presents CroQS, a novel dataset and evaluation framework for query suggestion in cross-modal retrieval, addressing a gap in existing research.

Findings

01

LLM-based and captioning-based methods outperform baselines.

02

Methods improve recall on cluster specificity by over 115%.

03

Methods increase representativeness mAP by more than 52%.
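The findings above are reported in terms of recall and mAP. This page does not give the paper's exact metric definitions, so the following is only a minimal sketch of the standard information-retrieval forms of these two quantities, assuming each suggested query yields a ranked list of unique image IDs that is scored against the set of images in a target cluster. The function names and the toy IDs are hypothetical, and the paper's cluster-specificity and representativeness metrics may differ in detail.

```python
from typing import Iterable, Sequence, Set, Tuple


def recall_at_k(ranked_ids: Sequence[str], cluster_ids: Set[str], k: int) -> float:
    """Fraction of the cluster's items found in the top-k ranked results."""
    if not cluster_ids:
        return 0.0
    hits = sum(1 for item in ranked_ids[:k] if item in cluster_ids)
    return hits / len(cluster_ids)


def average_precision(ranked_ids: Sequence[str], cluster_ids: Set[str]) -> float:
    """Standard average precision of one ranked list (IDs assumed unique)."""
    if not cluster_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in cluster_ids:
            hits += 1
            # Precision measured at each rank where a relevant item appears.
            precision_sum += hits / rank
    return precision_sum / len(cluster_ids)


def mean_average_precision(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of average precision over (ranked list, target cluster) pairs."""
    scores = [average_precision(ranked, cluster) for ranked, cluster in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Toy example with made-up image IDs.
ranked = ["a", "x", "b", "y", "c"]
cluster = {"a", "b", "c"}
top3_recall = recall_at_k(ranked, cluster, k=3)
ap = average_precision(ranked, cluster)
```

Under these assumptions, a suggested query scores higher when the images of its target cluster appear earlier in its result list than they did for the initial query.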

Abstract

Query suggestion, a technique widely adopted in information retrieval, enhances system interactivity and the browsing experience of document collections. In cross-modal retrieval, many works have focused on retrieving relevant items from natural language queries, while few have explored query suggestion solutions. In this work, we address query suggestion in cross-modal retrieval, introducing a novel task that focuses on suggesting minimal textual modifications needed to explore visually consistent subsets of the collection, following the premise of "Maybe you are looking for". To facilitate the evaluation and development of methods, we present a tailored benchmark named CroQS. This dataset comprises initial queries, grouped result sets, and human-defined suggested queries for each group. We establish dedicated metrics to rigorously evaluate the performance of various methods on this…

Code & Models

Datasets

Ruggero1912/CroQS-Benchmark
dataset · 74 dl

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Semantic Web and Ontologies