Evaluating Cost-Accuracy Trade-offs in Multimodal Search Relevance Judgements

Silvia Terragni, Hoang Cuong, Joachim Daiber, Pallavi Gudipati, and Pablo N. Mendes

arXiv:2410.19974 · cs.LG · October 29, 2024

TL;DR

This paper evaluates several large language models (LLMs) and multimodal language models (MLLMs) as search relevance evaluators, analyzing their cost-accuracy trade-offs and how their performance varies with context, to guide practical model selection.

Contribution

The paper assesses several LLMs and MLLMs across multiple multimodal search scenarios, measuring their alignment with human judgments and highlighting performance variability and cost considerations.

Findings

01. Model performance varies significantly depending on the context.

02. In smaller models, including a visual component may hinder performance rather than enhance it.

03. Cost-accuracy trade-offs depend on the specific use case.

Abstract

Large Language Models (LLMs) have demonstrated potential as effective search relevance evaluators. However, there is a lack of comprehensive guidance on which models consistently perform optimally across various contexts or within specific use cases. In this paper, we assess several LLMs and Multimodal Language Models (MLLMs) in terms of their alignment with human judgments across multiple multimodal search scenarios. Our analysis investigates the trade-offs between cost and accuracy, highlighting that model performance varies significantly depending on the context. Interestingly, in smaller models, the inclusion of a visual component may hinder performance rather than enhance it. These findings highlight the complexities involved in selecting the most appropriate model for practical applications.
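
The two quantities the abstract refers to, alignment with human judgments and the cost-accuracy trade-off, can be illustrated with a minimal sketch. The abstract does not state which agreement measure the authors use, so Cohen's kappa is an assumption here. The function names, model names, costs, and scores are all hypothetical placeholders for illustration, not results from the paper.

```python
from collections import Counter


def cohen_kappa(human, model):
    """Cohen's kappa between two equal-length sequences of relevance labels.

    Assumption: kappa stands in for whatever alignment measure the paper uses.
    """
    if len(human) != len(model) or not human:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(human)
    # Fraction of items on which the two raters agree.
    observed = sum(h == m for h, m in zip(human, model)) / n
    # Agreement expected by chance from each rater's label frequencies.
    ch, cm = Counter(human), Counter(model)
    expected = sum(ch[k] * cm[k] for k in ch.keys() | cm.keys()) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used the same single label throughout
    return (observed - expected) / (1 - expected)


def pareto_frontier(models):
    """Names of models not dominated on (lower cost, higher agreement).

    `models` is a list of (name, cost, agreement) tuples.
    """
    frontier = []
    for name, cost, score in models:
        dominated = any(
            c <= cost and s >= score and (c < cost or s > score)
            for _, c, s in models
        )
        if not dominated:
            frontier.append(name)
    return frontier


# Hypothetical binary relevance labels (1 = relevant, 0 = not relevant).
human_labels = [1, 0, 1, 1, 0, 0]
model_labels = [1, 0, 1, 0, 0, 0]
kappa = cohen_kappa(human_labels, model_labels)

# Hypothetical (name, cost per 1k judgments, agreement) entries.
candidates = [
    ("small-text", 1.0, 0.55),
    ("small-multimodal", 2.0, 0.50),
    ("large-multimodal", 10.0, 0.70),
]
frontier = pareto_frontier(candidates)
```

With these placeholder numbers, the small multimodal entry costs more and agrees less than the small text-only entry, so it drops off the frontier. That is the shape of the situation the abstract describes for smaller models, though the actual figures would have to come from the paper's experiments.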

Taxonomy

Topics: Advanced Text Analysis Techniques · Information Retrieval and Search Behavior · Sentiment Analysis and Opinion Mining