An Evaluation Framework for Attributed Information Retrieval using Large Language Models

Hanane Djeddal; Pierre Erbacher; Raouf Toukal; Laure Soulier; Karen Pinel-Sauvagnat; Sophia Katrenko; Lynda Tamine

arXiv:2409.08014·cs.IR·September 13, 2024

TL;DR

This paper introduces a reproducible evaluation framework for attributed information retrieval using large language models, addressing the challenges of open-ended queries and diverse answer attribution in information-seeking scenarios.

Contribution

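It proposes a flexible benchmarking framework for attributed information seeking that works with any backbone LLM and supports three architectural designs (Generate, Retrieve then Generate, and Generate then Retrieve), enabling systematic evaluation of answer correctness and attributability.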

Findings

01

The choice of architectural scenario (Generate, Retrieve then Generate, or Generate then Retrieve) affects answer correctness.

02

The framework can be applied with any backbone LLM.

03

Experiments highlight the impact of scenario choices on attribution quality.

Abstract

With the growing success of Large Language Models (LLMs) in information-seeking scenarios, search engines are now adopting generative approaches to provide answers along with in-line citations as attribution. While existing work focuses mainly on attributed question answering, in this paper we target information-seeking scenarios, which are often more challenging due to the open-ended nature of the queries and the size of the label space in terms of the diversity of candidate attributed answers per query. We propose a reproducible framework to evaluate and benchmark attributed information seeking, using any backbone LLM and different architectural designs: (1) Generate, (2) Retrieve then Generate, and (3) Generate then Retrieve. Experiments using HAGRID, an attributed information-seeking dataset, show the impact of different scenarios on both the correctness and attributability of…
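The three architectural designs named in the abstract differ mainly in where retrieval sits relative to generation. The following is a minimal sketch of that control flow, not the authors' implementation: the token-overlap retriever, the `llm` callable, the prompt layout, and all function names are hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, List, Tuple

Doc = Tuple[str, str]        # (doc_id, text); hypothetical document format
LLM = Callable[[str], str]   # any backbone model exposed as prompt -> text


def overlap_retrieve(query: str, corpus: List[Doc], k: int = 2) -> List[Doc]:
    """Toy retriever (assumption): rank documents by shared lowercase tokens."""
    q_tokens = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda d: len(q_tokens & set(d[1].lower().split())),
        reverse=True,
    )
    return ranked[:k]


def generate(query: str, llm: LLM) -> Tuple[str, List[str]]:
    """(1) Generate: the model answers from the query alone, no documents attached."""
    return llm(query), []


def retrieve_then_generate(
    query: str, corpus: List[Doc], llm: LLM, k: int = 2
) -> Tuple[str, List[str]]:
    """(2) Retrieve then Generate: retrieved passages are put in the prompt."""
    docs = overlap_retrieve(query, corpus, k)
    context = "\n".join(f"[{i + 1}] {text}" for i, (_, text) in enumerate(docs))
    answer = llm(f"{context}\n\nQuestion: {query}")
    return answer, [doc_id for doc_id, _ in docs]


def generate_then_retrieve(
    query: str, corpus: List[Doc], llm: LLM, k: int = 2
) -> Tuple[str, List[str]]:
    """(3) Generate then Retrieve: answer first, then look up supporting passages."""
    answer = llm(query)
    docs = overlap_retrieve(answer, corpus, k)
    return answer, [doc_id for doc_id, _ in docs]


# Tiny illustrative corpus and a stub model that always returns the same answer.
corpus: List[Doc] = [
    ("d1", "Paris is the capital of France"),
    ("d2", "Berlin is the capital of Germany"),
    ("d3", "Bananas are yellow"),
]


def stub_llm(prompt: str) -> str:
    return "Paris is the capital of France"
```

Each function returns the answer together with the identifiers of the documents offered as attribution, so the same correctness and attributability metrics could in principle be applied to the output of any of the three scenarios.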

Code & Models

Repositories

hanane-djeddal/attributed-ir
PyTorch · Official
