Improving Document Image Understanding with Reinforcement Finetuning

Bao-Sinh Nguyen; Dung Tien Le; Hieu M. Vu; Tuan Anh D. Nguyen; Minh-Tien Nguyen; Hung Le

arXiv:2209.12561 · cs.IR · September 27, 2022

TL;DR

This paper introduces a reinforcement learning-based finetuning method for document image understanding that is particularly effective when training data is limited. It treats the information extraction model as a policy network and optimizes it with reward functions.

Contribution

The paper presents a novel reinforcement learning finetuning approach for document image understanding, improving performance in low-data scenarios.

Findings

01. Consistent performance improvements on four datasets.

02. Enhanced extraction accuracy with limited labeled data.

03. Effective use of expert feedback for model improvement.

Abstract

Successful Artificial Intelligence systems often require large amounts of labeled data to extract information from document images. In this paper, we investigate the problem of improving the performance of Artificial Intelligence systems in understanding document images, especially in cases where training data is limited. We address the problem by proposing a novel finetuning method using reinforcement learning. Our approach treats the Information Extraction model as a policy network and uses policy gradient training to update the model to maximize combined reward functions that complement the traditional cross-entropy losses. Our experiments on four datasets using labels and expert feedback demonstrate that our finetuning mechanism consistently improves the performance of a state-of-the-art information extractor, especially in the small training data regime.
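The idea of combining a cross-entropy loss with a policy-gradient term can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the reward functions, their weighting, or the extractor architecture. The sketch assumes a REINFORCE-style estimator with a mean-reward baseline, a hypothetical token-accuracy `reward`, a hypothetical weight `lam`, and a "model" that is just a table of per-token logits standing in for a real information extractor.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over one token's label logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reward(sampled, gold):
    """Hypothetical reward: fraction of tokens whose sampled label matches gold."""
    return sum(s == g for s, g in zip(sampled, gold)) / len(gold)


def combined_gradient(logits, gold, rng, lam=0.5, num_samples=8):
    """Gradient of (cross-entropy + lam * REINFORCE loss) w.r.t. the logits."""
    probs = [softmax(row) for row in logits]

    # Cross-entropy part: d(-log p_gold) / d logit_k = p_k - onehot(gold)_k.
    grad = [
        [p - (1.0 if k == g else 0.0) for k, p in enumerate(row)]
        for row, g in zip(probs, gold)
    ]

    # Policy part: sample label sequences from the current policy and score them.
    samples = []
    for _ in range(num_samples):
        labels = [rng.choices(range(len(row)), weights=row)[0] for row in probs]
        samples.append((labels, reward(labels, gold)))
    baseline = sum(r for _, r in samples) / num_samples  # variance reduction

    for labels, r in samples:
        adv = (r - baseline) / num_samples
        for t, (row, a) in enumerate(zip(probs, labels)):
            for k, p in enumerate(row):
                # d(-adv * log pi(a)) / d logit_k = -adv * (onehot(a)_k - p_k)
                grad[t][k] += lam * (-adv) * ((1.0 if k == a else 0.0) - p)
    return grad


def finetune(logits, gold, steps=200, lr=0.5, seed=0):
    """Plain gradient descent on the combined objective (toy setting)."""
    rng = random.Random(seed)
    logits = [row[:] for row in logits]
    for _ in range(steps):
        grad = combined_gradient(logits, gold, rng)
        for t, row in enumerate(grad):
            for k, g in enumerate(row):
                logits[t][k] -= lr * g
    return logits


if __name__ == "__main__":
    gold_labels = [0, 2, 1, 1]                    # 4 tokens, 3 possible labels
    start = [[0.0, 0.0, 0.0] for _ in gold_labels]
    tuned = finetune(start, gold_labels)
    print([row.index(max(row)) for row in tuned])
```

In a real system the logits would come from the extractor's forward pass and the gradient would flow into its parameters. The reward could also be a sequence-level metric or a score derived from expert feedback, which is where a policy-gradient term adds signal that token-level cross-entropy cannot express.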

Taxonomy

Topics: Topic Modeling · Handwritten Text Recognition Techniques · Explainable Artificial Intelligence (XAI)