SAIL: Sample-Centric In-Context Learning for Document Information Extraction
Jinyu Zhang, Zhiyuan You, Jize Wang, Xinyi Le

TL;DR
SAIL introduces a sample-centric in-context learning approach that enhances document information extraction from visually rich documents by leveraging fine-grained textual and layout similarities, outperforming existing training-free methods.
Contribution
The paper proposes SAIL, a method that combines entity-level textual similarity and layout similarity with a unified prompt template to improve training-free document information extraction.
Findings
Outperforms training-free baselines on multiple benchmarks.
Achieves results close to full-training methods.
Demonstrates strong generalization across datasets.
Abstract
Document Information Extraction (DIE) aims to extract structured information from Visually Rich Documents (VRDs). Previous full-training approaches have demonstrated strong performance but may struggle with generalization to unseen data. In contrast, training-free methods leverage powerful pre-trained models like Large Language Models (LLMs) to address various downstream tasks with only a few examples. Nonetheless, training-free methods for DIE encounter two primary challenges: (1) understanding the complex relationship between layout and textual elements in VRDs, and (2) providing accurate guidance to pre-trained models. To address these challenges, we propose Sample-centric In-context Learning (SAIL) for DIE. SAIL introduces a fine-grained entity-level textual similarity to facilitate in-depth text analysis by LLMs and incorporates layout similarity to enhance the analysis of layouts…
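The abstract names two selection signals, entity-level textual similarity and layout similarity, but the excerpt above does not say how either is computed. The following is a minimal sketch of how in-context examples could be chosen with such signals, not the authors' implementation. Everything in it is an assumption made for illustration: the document structure (a list of entities with normalized bounding boxes), the use of `difflib.SequenceMatcher` as a stand-in for a learned text-similarity score, the occupancy-grid Jaccard overlap as a layout measure, and the hypothetical function names.

```python
from difflib import SequenceMatcher


def text_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; an assumed stand-in for an embedding-based score."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def layout_grid(boxes, size=8):
    """Mark the cells of a size x size grid covered by normalized boxes (x0, y0, x1, y1 in [0, 1])."""
    cells = set()
    for x0, y0, x1, y1 in boxes:
        for gx in range(int(x0 * size), min(size - 1, int(x1 * size)) + 1):
            for gy in range(int(y0 * size), min(size - 1, int(y1 * size)) + 1):
                cells.add((gx, gy))
    return cells


def layout_similarity(boxes_a, boxes_b, size=8):
    """Jaccard overlap between the two occupancy grids (assumed layout measure)."""
    a, b = layout_grid(boxes_a, size), layout_grid(boxes_b, size)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def select_entity_examples(test_doc, pool, k=1):
    """For each test entity, return the k most textually similar labeled entities from the pool."""
    labeled = [(text, label) for doc in pool for text, _box, label in doc["entities"]]
    picks = {}
    for text, _box in test_doc["entities"]:
        ranked = sorted(labeled, key=lambda e: text_similarity(text, e[0]), reverse=True)
        picks[text] = ranked[:k]
    return picks


def select_layout_example(test_doc, pool):
    """Return the pool document whose entity boxes overlap most with the test document's boxes."""
    test_boxes = [box for _text, box in test_doc["entities"]]
    return max(
        pool,
        key=lambda d: layout_similarity(test_boxes, [b for _t, b, _l in d["entities"]]),
    )


# Hypothetical labeled pool and unlabeled test document, made up for this sketch.
pool = [
    {"id": "receipt", "entities": [
        ("Total: 12.50", (0.1, 0.8, 0.5, 0.9), "total"),
        ("ACME Store", (0.1, 0.0, 0.6, 0.1), "company"),
    ]},
    {"id": "form", "entities": [
        ("Name: Jane", (0.6, 0.4, 0.9, 0.5), "name"),
    ]},
]
test_doc = {"entities": [
    ("Total: 9.99", (0.1, 0.8, 0.5, 0.9)),
    ("ACME Shop", (0.1, 0.0, 0.6, 0.1)),
]}

entity_examples = select_entity_examples(test_doc, pool)
layout_example = select_layout_example(test_doc, pool)
```

In this sketch the two selections are made independently, so a prompt could carry both a per-entity textual demonstration and a whole-document layout demonstration. How SAIL actually combines them in its prompt template is not specified in the excerpt above.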
Taxonomy
Topics: Text and Document Classification Technologies · Topic Modeling · Handwritten Text Recognition Techniques
Methods: Balanced Selection
