UDA: A Benchmark Suite for Retrieval Augmented Generation in Real-world Document Analysis

Yulong Hui; Yao Lu; Huanchen Zhang

arXiv:2406.15187 · cs.AI · November 1, 2024 · 1 citation

TL;DR

This paper introduces UDA, a comprehensive benchmark suite with real-world documents and questions, to evaluate and improve retrieval-augmented generation methods for complex document analysis tasks.

Contribution

The paper presents UDA, a new benchmark suite with extensive real-world data for evaluating RAG-based solutions in document analysis.

Findings

01. Data parsing and retrieval are crucial for performance.

02. Evaluation reveals strengths and weaknesses of current RAG methods.

03. Benchmark facilitates future research in real-world document analysis.

Abstract

The use of Retrieval-Augmented Generation (RAG) has improved Large Language Models (LLMs) in collaborating with external data, yet significant challenges exist in real-world scenarios. In areas such as academic literature and finance question answering, data are often found in raw text and tables in HTML or PDF formats, which can be lengthy and highly unstructured. In this paper, we introduce a benchmark suite, namely Unstructured Document Analysis (UDA), that involves 2,965 real-world documents and 29,590 expert-annotated Q&A pairs. We revisit popular LLM- and RAG-based solutions for document analysis and evaluate the design choices and answer qualities across multiple document domains and diverse query types. Our evaluation yields interesting findings and highlights the importance of data parsing and retrieval. We hope our benchmark can shed light and better serve real-world document…

Code & Models

Repositories

qinchuanhui/uda-benchmark (Official)

Videos

UDA: A Benchmark Suite for Retrieval Augmented Generation in Real-World Document Analysis · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Web Data Mining and Analysis