UDA: A Benchmark Suite for Retrieval Augmented Generation in Real-world Document Analysis
Yulong Hui, Yao Lu, Huanchen Zhang

TL;DR
This paper introduces UDA, a comprehensive benchmark suite with real-world documents and questions, to evaluate and improve retrieval-augmented generation methods for complex document analysis tasks.
Contribution
The paper presents UDA, a new benchmark suite with extensive real-world data for evaluating RAG-based solutions in document analysis.
Findings
Data parsing and retrieval are crucial for performance.
Evaluation reveals strengths and weaknesses of current RAG methods.
Benchmark facilitates future research in real-world document analysis.
Abstract
The use of Retrieval-Augmented Generation (RAG) has improved Large Language Models (LLMs) in collaborating with external data, yet significant challenges exist in real-world scenarios. In areas such as academic literature and finance question answering, data are often found in raw text and tables in HTML or PDF formats, which can be lengthy and highly unstructured. In this paper, we introduce a benchmark suite, namely Unstructured Document Analysis (UDA), that involves 2,965 real-world documents and 29,590 expert-annotated Q&A pairs. We revisit popular LLM- and RAG-based solutions for document analysis and evaluate the design choices and answer qualities across multiple document domains and diverse query types. Our evaluation yields interesting findings and highlights the importance of data parsing and retrieval. We hope our benchmark can shed light and better serve real-world document…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Web Data Mining and Analysis
