An Augmentation Strategy for Visually Rich Documents

Jing Xie; James B. Wendt; Yichao Zhou; Seth Ebner; Sandeep Tata

arXiv:2212.10047 · cs.CL · December 23, 2022

TL;DR

This paper introduces FieldSwap, a data augmentation method for form-like documents that replaces the key phrases of one field with those of another, producing synthetic training examples that improve information extraction when training data is limited.

Contribution

The paper presents a novel augmentation technique, FieldSwap, that enhances extraction accuracy in scarce data scenarios by generating synthetic training examples.

Findings

01

Improves extraction performance by 1-7 F1 points

02

Effective when training on only 10-250 documents

03

Applicable to various form-like document types

Abstract

Many business workflows require extracting important fields from form-like documents (e.g., bank statements, bills of lading, and purchase orders). Recent techniques for automating this task work well only when trained with large datasets. In this work we propose a novel data augmentation technique to improve performance when training data is scarce, e.g., 10-250 documents. Our technique, which we call FieldSwap, works by swapping out the key phrases of a source field with the key phrases of a target field to generate new synthetic examples of the target field for use in training. We demonstrate that this approach can yield 1-7 F1 point improvements in extraction performance.
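The swap described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Example` record, the `field_swap` function, the plain-string representation of a document snippet, and the sample key phrases are all assumptions made for illustration. The abstract does not say how key phrases are obtained or how documents are represented, so here they are simply supplied as a dictionary.

```python
import dataclasses
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Example:
    """Hypothetical training example: a text snippet around one labeled field."""
    text: str   # snippet containing the key phrase and the field value
    value: str  # the field value to extract
    field: str  # the field label


def field_swap(
    example: Example,
    key_phrases: Dict[str, List[str]],
    target_field: str,
) -> List[Example]:
    """Create synthetic examples of `target_field` from a source-field example.

    Each source key phrase found in the snippet is replaced with each key
    phrase of the target field, and the example is relabeled as the target.
    """
    synthetic = []
    for source_phrase in key_phrases.get(example.field, []):
        if source_phrase not in example.text:
            continue  # this key phrase does not occur in the snippet
        for target_phrase in key_phrases.get(target_field, []):
            new_text = example.text.replace(source_phrase, target_phrase)
            synthetic.append(
                dataclasses.replace(example, text=new_text, field=target_field)
            )
    return synthetic


# Assumed key phrases for two illustrative fields.
key_phrases = {
    "amount_due": ["Amount Due", "Balance Due"],
    "total": ["Total", "Total Amount"],
}
source = Example(text="Amount Due: $120.00", value="$120.00", field="amount_due")
for ex in field_swap(source, key_phrases, "total"):
    print(ex.field, "|", ex.text)
# total | Total: $120.00
# total | Total Amount: $120.00
```

The field value is left untouched and only the surrounding key phrase and the label change, which is the sense in which a source-field example becomes a new example of the target field. A real system would also need to decide which source and target field pairs are sensible to swap; that choice is outside this sketch.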

Taxonomy

Topics: Advanced Text Analysis Techniques · Web Data Mining and Analysis · Topic Modeling