Retrieval Augmented Structured Generation: Business Document Information Extraction As Tool Use
Franz Louis Cesista, Rui Aguiar, Jason Kim, Paolo Acilo

TL;DR
This paper introduces Retrieval Augmented Structured Generation (RASG), a novel framework for Business Document Information Extraction that leverages tool use modeling, achieving state-of-the-art results with large language models.
Contribution
The paper presents RASG as a new approach for BDIE, demonstrates its effectiveness with LLMs surpassing multimodal models, and introduces a practical metric and heuristic for line item recognition.
Findings
RASG achieves SOTA results on BDIE benchmarks.
LLMs with RASG outperform current multimodal models.
Proposed GLIRM metric aligns better with real-world BDIE use cases.
Abstract
Business Document Information Extraction (BDIE) is the problem of transforming a blob of unstructured information (raw text, scanned documents, etc.) into a structured format that downstream systems can parse and use. It has two main tasks: Key-Information Extraction (KIE) and Line Items Recognition (LIR). In this paper, we argue that BDIE is best modeled as a Tool Use problem, where the tools are these downstream systems. We then present Retrieval Augmented Structured Generation (RASG), a novel general framework for BDIE that achieves state-of-the-art (SOTA) results on both KIE and LIR tasks on BDIE benchmarks. The contributions of this paper are threefold: (1) We show, with ablation benchmarks, that Large Language Models (LLMs) with RASG are already competitive with or surpass current SOTA Large Multimodal Models (LMMs) without RASG on BDIE benchmarks. (2) We propose a new metric…
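To make the Tool Use framing concrete, here is a minimal sketch of what "structured output that a downstream system can parse" might look like. It is not the authors' implementation: the `InvoiceToolCall` and `LineItem` schemas, their field names, and the sample JSON are all hypothetical, chosen only to illustrate KIE fields (document-level keys) alongside LIR output (a list of line items).

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class LineItem:
    # Hypothetical LIR record: one row of an invoice table.
    description: str
    quantity: float
    unit_price: float


@dataclass
class InvoiceToolCall:
    # Hypothetical signature of a downstream "tool" (e.g. an accounting
    # system). The first three fields stand in for KIE output.
    vendor_name: str
    invoice_number: str
    total: float
    line_items: List[LineItem] = field(default_factory=list)


def parse_tool_call(raw_json: str) -> InvoiceToolCall:
    """Parse a model's JSON output into the tool's argument schema.

    Raises json.JSONDecodeError on malformed JSON and TypeError when
    keys are missing or unexpected, so bad generations fail loudly.
    """
    payload = json.loads(raw_json)
    items = [LineItem(**item) for item in payload.pop("line_items", [])]
    return InvoiceToolCall(line_items=items, **payload)


# Illustrative model output (made-up values, not from the paper).
model_output = json.dumps({
    "vendor_name": "Acme Corp",
    "invoice_number": "INV-001",
    "total": 30.0,
    "line_items": [
        {"description": "Widget", "quantity": 3, "unit_price": 10.0}
    ],
})

call = parse_tool_call(model_output)
print(call.vendor_name, len(call.line_items))
```

Treating the schema as the tool's signature means any generation that does not fit it is rejected before it reaches the downstream system, which is the practical appeal of modeling extraction as a tool call.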
Taxonomy
Topics: Semantic Web and Ontologies · Data Quality and Management · Web Data Mining and Analysis
