Information Extraction from Visually Rich Documents using LLM-based Organization of Documents into Independent Textual Segments

Aniket Bhattacharyya; Anurag Tripathi; Ujjal Das; Archan Karmakar; Amit Pathak; Maneesh Gupta

arXiv:2505.13535·cs.IR·May 21, 2025

TL;DR

This paper introduces BLOCKIE, an LLM-based method that segments visually rich documents into semantic blocks for improved information extraction, demonstrating better accuracy and generalization over existing approaches.

Contribution

The paper presents a novel approach that organizes VRDs into independent semantic blocks, enhancing reasoning and generalization in information extraction tasks.

Findings

01 Outperforms the state of the art on VRD benchmarks by 1-3% in F1 score

02 Resilient to unseen document formats

03 Capable of extracting implicit information
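
For context on the F1 figure above, here is a minimal sketch of how a micro-averaged F1 score can be computed for field extraction. It assumes exact-match scoring over (document, field, value) triples; the benchmarks the paper evaluates on may use a different matching rule. The function name `micro_f1` and the sample data are illustrative and not taken from the paper.

```python
def micro_f1(predicted, gold):
    """Micro-averaged F1 over sets of (document_id, field, value) triples.

    Assumption: a prediction is correct only if the whole triple matches exactly.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        # Covers empty inputs and the no-overlap case without dividing by zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold annotations and model output for two documents.
gold = {
    ("doc1", "vendor", "Acme Corp"),
    ("doc1", "total", "42.00"),
    ("doc1", "date", "2025-05-21"),
    ("doc2", "vendor", "Globex"),
}
predicted = {
    ("doc1", "vendor", "Acme Corp"),
    ("doc1", "total", "42.00"),
    ("doc2", "vendor", "Globex Inc"),  # wrong value: a false positive and a missed gold triple
}

score = micro_f1(predicted, gold)
print(f"micro F1 = {score:.4f}")
```

With two correct triples out of three predicted and four expected, precision is 2/3 and recall is 1/2, so a gain of 1-3 F1 points corresponds to a modest but measurable shift in one or both of those ratios.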

Abstract

Information extraction (IE) from Visually Rich Documents (VRDs) containing layout features along with text is a critical and well-studied task. Specialized non-LLM NLP-based solutions typically involve training models using both textual and geometric information to label sequences/tokens as named entities or answers to specific questions. However, these approaches lack reasoning, cannot infer values not explicitly present in documents, and do not generalize well to new formats. Recently proposed generative LLM-based approaches are capable of reasoning, but struggle to comprehend clues from document layout, especially in previously unseen document formats, and do not show competitive performance on heterogeneous VRD benchmark datasets. In this paper, we propose BLOCKIE, a novel LLM-based approach that organizes VRDs into localized, reusable semantic textual segments called…

Taxonomy

Topics: Web Data Mining and Analysis · Semantic Web and Ontologies · Advanced Computational Techniques and Applications