ROPE: Reading Order Equivariant Positional Encoding for Graph-based Document Information Extraction

Chen-Yu Lee; Chun-Liang Li; Chu Wang; Renshen Wang; Yasuhisa Fujii; Siyang Qin; Ashok Popat; Tomas Pfister

arXiv:2106.10786·cs.CL·June 22, 2021

Open Access

TL;DR

This paper introduces ROPE, a positional encoding method that helps graph convolutional networks capture the reading order of words in document images, improving entity extraction by up to 8.4% F1-score.

Contribution

ROPE provides a new reading order-aware positional encoding for GCNs, improving document information extraction by capturing sequential word presentation.

Findings

01

ROPE improves GCN performance by up to 8.4% F1-score.

02

ROPE effectively captures reading order in document graphs.

03

ROPE improves entity extraction (word labeling and word grouping) on the public FUNSD dataset and a large-scale payment dataset.

Abstract

Natural reading orders of words are crucial for information extraction from form-like documents. Despite recent advances in Graph Convolutional Networks (GCNs) on modeling spatial layout patterns of documents, they have limited ability to capture reading orders of given word-level node representations in a graph. We propose Reading Order Equivariant Positional Encoding (ROPE), a new positional encoding technique designed to apprehend the sequential presentation of words in documents. ROPE generates unique reading order codes for neighboring words relative to the target word given a word-level graph connectivity. We study two fundamental document entity extraction tasks including word labeling and word grouping on the public FUNSD dataset and a large-scale payment dataset. We show that ROPE consistently improves existing GCNs with a margin up to 8.4% F1-score.
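The idea of assigning reading order codes to neighbors relative to a target word can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `reading_order_codes`, the use of integer word ids as reading-order positions, and the signed-rank coding scheme are all assumptions made for illustration. The sketch sorts each target word together with its graph neighbors by reading-order position and gives every word its signed rank offset from the target.

```python
from typing import Dict, List


def reading_order_codes(
    neighbors: Dict[int, List[int]],
) -> Dict[int, Dict[int, int]]:
    """Assign each neighbor a code relative to its target word.

    Assumption (not from the paper): word ids are reading-order positions,
    e.g. the order in which an OCR engine emitted the words, and the code
    is the signed rank offset from the target in the local sorted order.
    """
    codes: Dict[int, Dict[int, int]] = {}
    for target, nbrs in neighbors.items():
        # Sort the target together with its neighbors by reading order.
        ordered = sorted(set(nbrs) | {target})
        target_rank = ordered.index(target)
        # Words read before the target get negative codes, later ones positive.
        codes[target] = {
            word: rank - target_rank for rank, word in enumerate(ordered)
        }
    return codes


# Hypothetical word-level graph: word id -> ids of connected words.
graph = {0: [1, 3], 3: [0, 1, 7], 7: [3]}
print(reading_order_codes(graph)[3])
```

Under these assumptions the codes depend only on the relative order of the words in each neighborhood, so shifting every reading-order position by the same constant leaves them unchanged. In a GCN, such codes would typically be embedded and combined with the neighbor features before aggregation; that step is omitted here.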

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks

Methods: Graph Convolutional Networks