PPN: Parallel Pointer-based Network for Key Information Extraction with Complex Layouts
Kaiwen Wei, Jie Yao, Jingyuan Zhang, Yangyang Kang, Fubang Zhao, Yating Zhang, Changlong Sun, Xin Jin, Xin Zhang

TL;DR
This paper introduces PPN, a parallel pointer-based network, and a large-scale dataset CLEX for improved key information extraction from complex, visually rich documents, addressing dataset limitations and error propagation issues.
Contribution
The paper presents a new large-scale dataset CLEX and a novel end-to-end PPN model capable of zero-shot and few-shot extraction, enhancing efficiency and accuracy.
Findings
PPN outperforms existing methods on CLEX.
PPN achieves faster inference speed.
The CLEX dataset contains 5,860 images with 1,162 semantic entity categories.
Abstract
Key Information Extraction (KIE) is a challenging multimodal task that aims to extract structured value semantic entities from visually rich documents. Although significant progress has been made, there are still two major challenges that need to be addressed. Firstly, the layout of existing datasets is relatively fixed and limited in the number of semantic entity categories, creating a significant gap between these datasets and the complex real-world scenarios. Secondly, existing methods follow a two-stage pipeline strategy, which may lead to the error propagation problem. Additionally, they are difficult to apply in situations where unseen semantic entity categories emerge. To address the first challenge, we propose a new large-scale human-annotated dataset named Complex Layout form for key information EXtraction (CLEX), which consists of 5,860 images with 1,162 semantic entity…
Taxonomy
Topics: Handwritten Text Recognition Techniques · Text and Document Classification Technologies · Image Retrieval and Classification Techniques
