Query Performance Prediction using Relevance Judgments Generated by Large Language Models

Chuan Meng; Negar Arabzadeh; Arian Askari; Mohammad Aliannejadi; Maarten de Rijke

arXiv:2404.01012 · cs.IR · May 27, 2025 · 3 cites

TL;DR

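This paper introduces a query performance prediction (QPP) framework that uses large language models to generate relevance judgments for the items in a ranked list. IR evaluation measures are then computed from those generated judgments, which makes predictions more accurate, more interpretable, and applicable to different measures, all without human relevance judgments.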

Contribution

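The paper proposes a QPP method that decomposes prediction into judging the relevance of each item in a ranked list to the query. Because the output is a set of per-item judgments rather than a single scalar, the same judgments can be used to predict different IR measures and to explain the prediction. The judgments are generated with open-source LLMs.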

Findings

01 Achieves state-of-the-art QPP quality on multiple datasets.
02 Effectively predicts various IR evaluation measures.
03 Improves interpretability of QPP results.

Abstract

Query performance prediction (QPP) aims to estimate the retrieval quality of a search system for a query without human relevance judgments. Previous QPP methods typically return a single scalar value and do not require the predicted values to approximate a specific information retrieval (IR) evaluation measure, leading to certain drawbacks: (i) a single scalar is insufficient to accurately represent different IR evaluation measures, especially when metrics do not highly correlate, and (ii) a single scalar limits the interpretability of QPP methods because solely using a scalar is insufficient to explain QPP results. To address these issues, we propose a QPP framework using automatically generated relevance judgments (QPP-GenRE), which decomposes QPP into independent subtasks of predicting the relevance of each item in a ranked list to a given query. This allows us to predict any IR…
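The decomposition described in the abstract can be illustrated with a minimal sketch: judge each item in the ranked list independently, then compute whichever measure is needed from the resulting judgments. This is not the authors' implementation. The function names, the binary relevance scale, and in particular `toy_judge` (a keyword-overlap stand-in for the LLM-based relevance judge) are assumptions made only for illustration.

```python
from typing import Callable, Dict, List


def precision_at_k(rels: List[int], k: int) -> float:
    """Fraction of the top-k items judged relevant (binary judgments)."""
    return sum(rels[:k]) / k


def reciprocal_rank_at_k(rels: List[int], k: int) -> float:
    """1 / rank of the first item judged relevant in the top k, else 0."""
    for rank, rel in enumerate(rels[:k], start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def predict_measures(
    query: str,
    ranked_docs: List[str],
    judge: Callable[[str, str], int],
    k: int = 10,
) -> Dict[str, object]:
    """Judge each top-k item independently, then derive measures from the judgments."""
    rels = [judge(query, doc) for doc in ranked_docs[:k]]
    return {
        "judgments": rels,  # per-item labels, which can be inspected to explain the prediction
        "precision": precision_at_k(rels, k),
        "rr": reciprocal_rank_at_k(rels, k),
    }


def toy_judge(query: str, doc: str) -> int:
    """Hypothetical stand-in for an LLM judge: 1 if any query term occurs in the document."""
    terms = set(query.lower().split())
    return int(any(term in doc.lower().split() for term in terms))


if __name__ == "__main__":
    docs = [
        "cooking pasta at home",
        "the llama model card",
        "weather today",
        "model training tips",
    ]
    print(predict_measures("llama model", docs, toy_judge, k=4))
```

In a real setting `judge` would wrap a call to a relevance-judging LLM. The point of the sketch is only that one list of per-item judgments yields several measures at once, and that the judgments themselves remain available for inspection.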

Code & Models

Repositories

chuanmeng/qpp-genre (PyTorch, official)

Taxonomy

Topics: Data Quality and Management · Web Data Mining and Analysis · Data Management and Algorithms

Methods: LLaMA