CLAPNQ: Cohesive Long-form Answers from Passages in Natural Questions for RAG systems
Sara Rosenthal, Avirup Sil, Radu Florian, Salim Roukos

TL;DR
ClapNQ introduces a benchmark dataset for evaluating retrieval-augmented generation models on long-form, cohesive answers grounded in passages, emphasizing fluency and integration of non-contiguous information.
Contribution
The paper presents ClapNQ, a new dataset and benchmark for assessing RAG systems' ability to generate concise, cohesive, and passage-grounded long answers.
Findings
Baseline models show significant room for improvement.
Answers are about one third the length of the full passage yet remain cohesive.
The benchmark highlights challenges in passage retrieval and answer generation.
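The length finding above can be checked on any passage/answer pair with a small ratio computation. The sketch below is a minimal illustration only: the function name `length_ratio`, the whitespace tokenization, and the toy strings are assumptions for this example and are not taken from the paper or the ClapNQ release, which may measure length in a different unit.

```python
def length_ratio(passage: str, answer: str) -> float:
    """Return passage length divided by answer length, counted in whitespace tokens."""
    passage_tokens = passage.split()
    answer_tokens = answer.split()
    if not answer_tokens:
        raise ValueError("answer must contain at least one token")
    return len(passage_tokens) / len(answer_tokens)


# Hypothetical toy record, not taken from ClapNQ.
passage = " ".join(["word"] * 90)
answer = " ".join(["word"] * 30)
print(length_ratio(passage, answer))  # 3.0
```

A ratio near 3 corresponds to an answer roughly one third the size of its passage. Averaging this value over a dataset split would give one rough way to compare against the stated figure.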
Abstract
Retrieval Augmented Generation (RAG) has become a popular application for large language models. It is preferable that successful RAG systems provide accurate answers that are supported by being grounded in a passage without any hallucinations. While considerable work is required for building a full RAG pipeline, being able to benchmark performance is also necessary. We present ClapNQ, a benchmark Long-form Question Answering dataset for the full RAG pipeline. ClapNQ includes long answers with grounded gold passages from Natural Questions (NQ) and a corpus to perform either retrieval, generation, or the full RAG pipeline. The ClapNQ answers are concise, 3x smaller than the full passage, and cohesive, meaning that the answer is composed fluently, often by integrating multiple pieces of the passage that are not contiguous. RAG models must adapt to these properties to be successful at…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Semantic Web and Ontologies
Methods: Attention Is All You Need · Byte Pair Encoding · Softmax · WordPiece · Linear Layer · Dense Connections · Attention Dropout · Residual Connection · Linear Warmup With Linear Decay
