LIT-RAGBench: Benchmarking Generator Capabilities of Large Language Models in Retrieval-Augmented Generation

Koki Itai, Shunichi Hasegawa, Yuta Yamamoto, Gouki Minegishi, and Masaki Otsuki

arXiv:2603.06198 · cs.CL · April 30, 2026

TL;DR

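LIT-RAGBench is a benchmark for the Generator side of retrieval-augmented generation. It evaluates how well a large language model integrates evidence, reasons over it, applies logic, interprets tables, and abstains when evidence is missing, all under unified conditions.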

Contribution

The paper introduces a unified evaluation framework covering five Generator capabilities (Integration, Reasoning, Logic, Table, and Abstention), together with a new dataset and a scoring method for practical model assessment.

Findings

01. No model exceeds 90% accuracy across categories.

02. The benchmark enables detailed, per-category measurement of a model's strengths and weaknesses.

03. It provides a tool for selecting and improving models used as RAG Generators.

Abstract

Retrieval-Augmented Generation (RAG) is a framework in which a Generator, such as a Large Language Model (LLM), produces answers by retrieving documents from an external collection using a Retriever. In practice, Generators must integrate evidence from long contexts, perform multi-step reasoning, interpret tables, and abstain when evidence is missing. However, existing benchmarks for Generators provide limited coverage, with none enabling simultaneous evaluation of multiple capabilities under unified conditions. To bridge the gap between existing evaluations and practical use, we introduce LIT-RAGBench (the Logic, Integration, Table, Reasoning, and Abstention RAG Generator Benchmark), which defines five categories: Integration, Reasoning, Logic, Table, and Abstention, each further divided into practical evaluation aspects. LIT-RAGBench systematically covers patterns combining multiple…
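The five categories named in the abstract suggest a per-category accuracy report. The following is a minimal sketch of such an aggregation, not the paper's scoring method: the record layout (`category` and `correct` fields), the function name `per_category_accuracy`, and the example records are all hypothetical and are not taken from the LIT-RAGBench repository or dataset.

```python
from collections import defaultdict

# Category names come from the abstract; everything else here is an assumption.
CATEGORIES = ("Integration", "Reasoning", "Logic", "Table", "Abstention")


def per_category_accuracy(records):
    """Return {category: fraction correct} for categories that have records.

    Each record is assumed to be a dict with a "category" string and a
    "correct" flag (hypothetical layout, for illustration only).
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for rec in records:
        cat = rec["category"]
        if cat not in CATEGORIES:
            raise ValueError(f"unknown category: {cat!r}")
        totals[cat] += 1
        correct[cat] += int(bool(rec["correct"]))
    # Skip categories with no records to avoid dividing by zero.
    return {cat: correct[cat] / totals[cat] for cat in CATEGORIES if totals[cat]}


# Made-up records for illustration only; these are not results from the paper.
example_records = [
    {"category": "Table", "correct": True},
    {"category": "Table", "correct": False},
    {"category": "Abstention", "correct": True},
]
example_scores = per_category_accuracy(example_records)
```

Reporting one score per category, rather than a single pooled number, is what lets a reader see where a given Generator is weak, for example strong on Table questions but poor at Abstention.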

Code & Models

Repositories

Koki-Itai/LIT-RAGBench
github

Datasets

neoai-inc/LIT-RAGBench
dataset · 132 dl
