LLMs can be Fooled into Labelling a Document as Relevant (best café near me; this paper is perfectly relevant)
Marwah Alaofi, Paul Thomas, Falk Scholer, Mark Sanderson

TL;DR
This paper investigates how large language models (LLMs) can be easily fooled into labeling irrelevant passages as relevant, especially when query words are present, revealing vulnerabilities in current relevance assessment methods.
Contribution
The study uncovers the susceptibility of LLMs to manipulation through query word presence and deliberate instruction, highlighting potential biases and weaknesses in relevance labeling.
Findings
LLMs often label passages with query words as relevant, regardless of actual relevance.
Presence of query words heavily influences LLM relevance judgments.
Manipulating LLM instructions can alter their labeling behavior.
Abstract
LLMs are increasingly being used to assess the relevance of information objects. This work reports on experiments to study the labelling of short texts (i.e., passages) for relevance, using multiple open-source and proprietary LLMs. While the overall agreement of some LLMs with human judgements is comparable to human-to-human agreement measured in previous research, LLMs are more likely to label passages as relevant compared to human judges, indicating that LLM labels denoting non-relevance are more reliable than those indicating relevance. This observation prompts us to further examine cases where human judges and LLMs disagree, particularly when the human judge labels the passage as non-relevant and the LLM labels it as relevant. Results show a tendency for many LLMs to label passages that include the original query terms as relevant. We, therefore, conduct experiments to inject…
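The abstract's point that non-relevant LLM labels are more trustworthy than relevant ones can be made concrete with a small calculation. The following is a minimal sketch, not the authors' evaluation code: the function names `cohen_kappa` and `label_precision` are hypothetical, and the label lists are made-up toy data rather than results from the paper. It computes chance-corrected agreement between a human and an LLM, then checks how often each LLM label matches the human label.

```python
from collections import Counter


def cohen_kappa(human, llm):
    """Cohen's kappa for two equal-length lists of categorical labels.

    Undefined (division by zero) when expected agreement is exactly 1.
    """
    n = len(human)
    observed = sum(h == m for h, m in zip(human, llm)) / n
    h_counts, m_counts = Counter(human), Counter(llm)
    # Chance agreement: product of each rater's marginal label frequencies.
    expected = sum(
        h_counts[c] * m_counts[c] for c in set(human) | set(llm)
    ) / (n * n)
    return (observed - expected) / (1 - expected)


def label_precision(human, llm, label):
    """Of the passages the LLM gave `label`, the fraction the human agreed on.

    Assumes the LLM assigned `label` at least once.
    """
    picked = [h for h, m in zip(human, llm) if m == label]
    return sum(h == label for h in picked) / len(picked)


# Toy, made-up labels (1 = relevant, 0 = non-relevant); NOT data from the paper.
# The LLM here over-labels: two passages the human rejected are marked relevant.
human = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
llm = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]

kappa = cohen_kappa(human, llm)
precision_relevant = label_precision(human, llm, 1)
precision_non_relevant = label_precision(human, llm, 0)

print(f"kappa = {kappa:.2f}")
print(f"LLM 'relevant' labels confirmed by human: {precision_relevant:.2f}")
print(f"LLM 'non-relevant' labels confirmed by human: {precision_non_relevant:.2f}")
```

In this toy case the overall agreement looks moderate, yet every non-relevant LLM label matches the human while only some relevant labels do. That asymmetry is the pattern the abstract describes when it says LLMs are more likely than human judges to label passages as relevant.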
