Attribute or Abstain: Large Language Models as Long Document Assistants

Jan Buchmann; Xiao Liu; Iryna Gurevych

arXiv:2407.07799·cs.CL·October 24, 2024

TL;DR

This paper introduces LAB, a benchmark of 6 long document tasks for evaluating attribution approaches with LLMs, and uses it to compare how those approaches affect evidence and response quality across 5 LLMs of different sizes.

Contribution

It provides the first long document-specific evaluation of attribution methods, comparing their effectiveness across various LLM sizes and tasks.

Findings

1. Citation-based attribution performs best for large, fine-tuned models.
2. Additional retrieval benefits small, prompted models.
3. Evidence quality predicts response quality for simple responses.

Abstract

LLMs can help humans working with long documents, but are known to hallucinate. Attribution can increase trust in LLM responses: The LLM provides evidence that supports its response, which enhances verifiability. Existing approaches to attribution have only been evaluated in RAG settings, where the initial retrieval confounds LLM performance. This is crucially different from the long document setting, where retrieval is not needed, but could help. Thus, a long-document-specific evaluation of attribution is missing. To fill this gap, we present LAB, a benchmark of 6 diverse long document tasks with attribution, and experiments with different approaches to attribution on 5 LLMs of different sizes. We find that citation, i.e., response generation and evidence extraction in one step, performs best for large and fine-tuned models, while additional retrieval can help for small, prompted…
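The two ideas named in the abstract, citation (answering and pointing to evidence in a single generation step) and an optional retrieval step that shrinks the document first, can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the paragraph-level segmentation, the token-overlap retriever, the prompt wording, and all function names are hypothetical and are not taken from the paper or its repository.

```python
import re
from typing import List, Optional


def split_into_segments(document: str) -> List[str]:
    """Split a document into non-empty paragraphs (an assumed evidence granularity)."""
    return [p.strip() for p in document.split("\n\n") if p.strip()]


def overlap_score(query: str, segment: str) -> int:
    """Count distinct lowercase whitespace tokens shared by query and segment."""
    return len(set(query.lower().split()) & set(segment.lower().split()))


def reduce_by_retrieval(query: str, segments: List[str], k: int) -> List[int]:
    """Hypothetical retrieval step: keep the k best-overlapping segments, in document order."""
    ranked = sorted(
        range(len(segments)),
        key=lambda i: overlap_score(query, segments[i]),
        reverse=True,
    )
    return sorted(ranked[:k])


def build_citation_prompt(
    query: str, segments: List[str], keep: Optional[List[int]] = None
) -> str:
    """Number the segments so a model can answer and cite them in one generation step."""
    indices = list(range(len(segments))) if keep is None else keep
    numbered = "\n".join(f"[{i}] {segments[i]}" for i in indices)
    return f"{numbered}\n\nQuestion: {query}\nAnswer and cite supporting segments like [0]."


def parse_citations(response: str) -> List[int]:
    """Extract the cited segment indices from a model response containing [n] markers."""
    return sorted({int(n) for n in re.findall(r"\[(\d+)\]", response)})
```

In this sketch, calling `build_citation_prompt` on all segments corresponds to citation over the full document, while passing the output of `reduce_by_retrieval` as `keep` corresponds to adding a retrieval step before generation. The model call itself is left out, since the page does not specify it.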

Code & Models

Repositories

ukplab/arxiv2024-attribute-or-abstain (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Semantic Web and Ontologies

Methods: Attention Is All You Need · Residual Connection · Byte Pair Encoding · Layer Normalization · Linear Layer · Attention Dropout · Linear Warmup With Linear Decay · Adam · Dropout