"What is the value of {templates}?" Rethinking Document Information Extraction Datasets for LLMs

Ran Zmigrod; Pranav Shetty; Mathieu Sibue; Zhiqiang Ma; Armineh Nourbakhsh; Xiaomo Liu; Manuela Veloso

arXiv:2410.15484·cs.CL·December 2, 2024

Open Access

TL;DR

This paper introduces K2Q, a diverse dataset with complex templates for key information extraction, demonstrating that varied question formats improve the robustness of large language models in document understanding tasks.

Contribution

The work presents K2Q, a novel dataset with diverse, intricate templates for KIE, and empirically shows that question diversity enhances model performance and robustness.

Findings

01. Diverse templates improve model robustness

02. Training on complex templates outperforms simple ones

03. Models benefit from varied question formats

Abstract

The rise of large language models (LLMs) for visually rich document understanding (VRDU) has kindled a need for prompt-response, document-based datasets. As annotating new datasets from scratch is labor-intensive, the existing literature has generated prompt-response datasets from available resources using simple templates. For the case of key information extraction (KIE), one of the most common VRDU tasks, past work has typically employed the template "What is the value for the {key}?". However, given the variety of questions encountered in the wild, simple and uniform templates are insufficient for creating robust models in research and industrial contexts. In this work, we present K2Q, a diverse collection of five datasets converted from KIE to a prompt-response format using a plethora of bespoke templates. The questions in K2Q can span multiple entities and be extractive or boolean.…
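
The template-based conversion described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: only the baseline template string is quoted from the abstract, while the varied templates, the function names `kie_to_prompts` and `boolean_prompt`, and the sample annotations are hypothetical and chosen purely for demonstration.

```python
import random

# Baseline template quoted in the abstract.
SIMPLE_TEMPLATE = "What is the value for the {key}?"

# Hypothetical varied templates, for illustration only; these are NOT taken from K2Q.
VARIED_TEMPLATES = [
    "What is the value for the {key}?",
    "Can you find the {key} in this document?",
    "Please extract the {key}.",
]


def kie_to_prompts(annotations, templates, seed=0):
    """Turn {key: value} KIE annotations into extractive prompt-response pairs."""
    rng = random.Random(seed)  # seeded so the template choice is reproducible
    pairs = []
    for key, value in annotations.items():
        template = rng.choice(templates)
        pairs.append({"prompt": template.format(key=key), "response": value})
    return pairs


def boolean_prompt(key, candidate, annotations):
    """Build a yes/no question asking whether `candidate` is the value of `key`."""
    question = f"Is the {key} equal to {candidate}?"
    answer = "Yes" if annotations.get(key) == candidate else "No"
    return {"prompt": question, "response": answer}


# Made-up annotations for a single document.
example = {"invoice number": "INV-001", "total": "42.00"}

simple_pairs = kie_to_prompts(example, [SIMPLE_TEMPLATE])
varied_pairs = kie_to_prompts(example, VARIED_TEMPLATES)
yes_pair = boolean_prompt("total", "42.00", example)
no_pair = boolean_prompt("total", "13.37", example)
```

With a single template every question has the same surface form, whereas sampling from a template pool varies the phrasing while the response stays tied to the same annotation. The boolean helper shows one assumed way a yes/no question could be derived from the same key-value pairs.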

Taxonomy

Topics: Natural Language Processing Techniques · Mathematics, Computing, and Information Processing · Library Science and Information Systems