Retrieving Versus Understanding Extractive Evidence in Few-Shot Learning

Karl Elbakian; Samuel Carton

arXiv:2502.14095·cs.CL·February 21, 2025

Open Access

TL;DR

This paper investigates how large language models retrieve and interpret within-document evidence in few-shot learning. It finds a strong link between evidence retrieval errors and prediction errors, while retrieval errors are mostly not associated with evidence interpretation errors.

Contribution

The paper provides an empirical analysis of the relationship between evidence retrieval and evidence interpretation in large language models, highlighting areas for improving model alignment.

Findings

01. Strong correlation between retrieval errors and prediction errors

02. Retrieval errors are mostly not linked to evidence interpretation errors

03. Insights applicable to downstream tasks and model alignment

Abstract

A key aspect of alignment is the proper use of within-document evidence to construct document-level decisions. We analyze the relationship between the retrieval and interpretation of within-document evidence for large language models in a few-shot setting. Specifically, we measure the extent to which model prediction errors are associated with evidence retrieval errors with respect to gold-standard human-annotated extractive evidence for five datasets, using two popular closed proprietary models. We perform two ablation studies to investigate when both label prediction and evidence retrieval errors can be attributed to qualities of the relevant evidence. We find that there is a strong empirical relationship between model prediction and evidence retrieval error, but that evidence retrieval error is mostly not associated with evidence interpretation error--a hopeful sign for downstream…
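
To make the measurement described in the abstract concrete, the following is a minimal sketch of one way to quantify the association between prediction errors and evidence retrieval errors. It is not the authors' procedure: the abstract does not state which metric or statistic they use. The token-level F1 score, the 0.5 threshold for counting a retrieval error, the phi coefficient, and the names `evidence_f1` and `error_association` are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Set, Tuple

# One example: (predicted label, gold label,
#               predicted evidence token indices, gold evidence token indices).
# This record layout is hypothetical, chosen only for this sketch.
Example = Tuple[str, str, Set[int], Set[int]]


def evidence_f1(predicted: Set[int], gold: Set[int]) -> float:
    """Token-level F1 between predicted and gold evidence token indices."""
    if not predicted and not gold:
        return 1.0  # nothing to retrieve and nothing retrieved
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def error_association(
    examples: List[Example], f1_threshold: float = 0.5
) -> Tuple[Dict[str, int], float]:
    """Cross-tabulate prediction errors against retrieval errors.

    A retrieval error is (by assumption) an evidence F1 below f1_threshold.
    Returns the 2x2 counts and the phi coefficient of the table.
    """
    counts = {"both": 0, "pred_only": 0, "retr_only": 0, "neither": 0}
    for pred_label, gold_label, pred_ev, gold_ev in examples:
        pred_err = pred_label != gold_label
        retr_err = evidence_f1(pred_ev, gold_ev) < f1_threshold
        if pred_err and retr_err:
            counts["both"] += 1
        elif pred_err:
            counts["pred_only"] += 1
        elif retr_err:
            counts["retr_only"] += 1
        else:
            counts["neither"] += 1

    n11, n10 = counts["both"], counts["pred_only"]
    n01, n00 = counts["retr_only"], counts["neither"]
    # Product of the four marginal totals of the 2x2 table.
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    phi = 0.0 if denom == 0 else (n11 * n00 - n10 * n01) / math.sqrt(denom)
    return counts, phi


if __name__ == "__main__":
    toy = [
        ("pos", "pos", {1, 2}, {1, 2}),  # correct label, correct evidence
        ("neg", "pos", {5}, {1, 2}),     # wrong label, wrong evidence
        ("pos", "pos", {1}, {1, 2}),     # correct label, partial evidence
        ("neg", "neg", {9}, {3, 4}),     # correct label, wrong evidence
    ]
    table, phi = error_association(toy)
    print(table, round(phi, 3))
```

A phi coefficient near 1 would indicate that prediction errors and retrieval errors tend to occur on the same examples, and a value near 0 would indicate little association. The toy data is invented and carries no information about the paper's results.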

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning