ExStrucTiny: A Benchmark for Schema-Variable Structured Information Extraction from Document Images
Mathieu Sibue, Andres Muñoz Garza, Samuel Mensah, Pranav Shetty, Zhiqiang Ma, Xiaomo Liu, Manuela Veloso

TL;DR
ExStrucTiny is a new benchmark dataset designed to evaluate and improve the ability of vision language models to perform flexible, fine-grained structured information extraction from diverse document images.
Contribution
We introduce ExStrucTiny, a comprehensive benchmark dataset that unifies key entity extraction, relation extraction, and visual question answering for document understanding.
Findings
Vision language models (VLMs) face challenges with schema adaptation and answer localization.
Open and closed VLMs show limitations on the benchmark.
The dataset covers diverse document types and extraction scenarios.
Abstract
Enterprise documents, such as forms and reports, embed critical information for downstream applications like data archiving, automated workflows, and analytics. Although generalist Vision Language Models (VLMs) perform well on established document understanding benchmarks, their ability to conduct holistic, fine-grained structured extraction across diverse document types and flexible schemas is not well studied. Existing Key Entity Extraction (KEE), Relation Extraction (RE), and Visual Question Answering (VQA) datasets are limited by narrow entity ontologies, simple queries, or homogeneous document types, often overlooking the need for adaptable and structured extraction. To address these gaps, we introduce ExStrucTiny, a new benchmark dataset for structured Information Extraction (IE) from document images, unifying aspects of KEE, RE, and VQA. Built through a novel pipeline combining…
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Handwritten Text Recognition Techniques
