FiCo-ITR: bridging fine-grained and coarse-grained image-text retrieval for comparative performance analysis

Mikel Williams-Lekuona; Georgina Cosma

arXiv:2407.20114 · cs.IR · January 19, 2026

TL;DR

This paper introduces FiCo-ITR, a standardized evaluation framework for comparing fine-grained and coarse-grained image-text retrieval models, providing insights into their performance and efficiency trade-offs.

Contribution

The paper presents the FiCo-ITR library, which standardises evaluation so that Fine-Grained (FG) and Coarse-Grained (CG) image-text retrieval models can be compared directly.

Findings

01

FG models, which build on large-scale Vision-Language Pretraining for instance-level retrieval, achieve higher accuracy at the cost of increased computational complexity.

02

CG models, which typically use Cross-Modal Hashing for category-level retrieval, prioritise efficiency at the cost of retrieval performance.

03

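Empirical evaluation of representative models from both categories quantifies the performance-efficiency trade-offs that guide the choice between FG and CG models.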

Abstract

In the field of Image-Text Retrieval (ITR), recent advancements have leveraged large-scale Vision-Language Pretraining (VLP) for Fine-Grained (FG) instance-level retrieval, achieving high accuracy at the cost of increased computational complexity. For Coarse-Grained (CG) category-level retrieval, prominent approaches employ Cross-Modal Hashing (CMH) to prioritise efficiency, albeit at the cost of retrieval performance. Due to differences in methodologies, FG and CG models are rarely compared directly within evaluations in the literature, resulting in a lack of empirical data quantifying the retrieval performance-efficiency tradeoffs between the two. This paper addresses this gap by introducing the FiCo-ITR library, which standardises evaluation methodologies for both FG and CG models, facilitating direct comparisons. We conduct empirical evaluations of representative models from both…
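To make the two evaluation settings concrete, the sketch below contrasts an instance-level metric (Recall@K over a similarity matrix) with a category-level one (mean average precision over Hamming-ranked binary codes). It is an illustration only, not the FiCo-ITR API: the function names, the one-to-one query/gallery pairing, and the single-label relevance rule are all assumptions made for this example.

```python
import numpy as np

def recall_at_k(sim, k):
    """Instance-level Recall@K.

    Assumption: sim is (N, N) and query i's only correct match is gallery item i.
    """
    # Indices of the k most similar gallery items for each query.
    top_k = np.argsort(-sim, axis=1)[:, :k]
    targets = np.arange(sim.shape[0])[:, None]
    return float((top_k == targets).any(axis=1).mean())

def hamming_distances(query_codes, gallery_codes):
    """Pairwise Hamming distance between binary codes with entries in {0, 1}."""
    q = query_codes.astype(np.int32)
    g = gallery_codes.astype(np.int32)
    # Count positions where exactly one of the two codes has a 1.
    return q @ (1 - g).T + (1 - q) @ g.T

def mean_average_precision(dist, query_labels, gallery_labels):
    """Category-level mAP.

    Assumption: one label per item; a gallery item is relevant
    when its label equals the query's label.
    """
    aps = []
    for i in range(dist.shape[0]):
        order = np.argsort(dist[i], kind="stable")  # nearest first
        relevant = (gallery_labels[order] == query_labels[i]).astype(np.float64)
        if relevant.sum() == 0:
            continue  # skip queries with no relevant gallery item
        precision = np.cumsum(relevant) / (np.arange(len(relevant)) + 1)
        aps.append(float((precision * relevant).sum() / relevant.sum()))
    return float(np.mean(aps)) if aps else 0.0
```

Hamming ranking on short binary codes is cheap compared with scoring dense similarities, which is one way to picture the efficiency side of the trade-off the abstract describes; a shared harness would report both kinds of metric for both model families.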

Code & Models

Repositories

mikelwl/fico-itr
PyTorch · Official
