ROPE: Reading Order Equivariant Positional Encoding for Graph-based Document Information Extraction
Chen-Yu Lee, Chun-Liang Li, Chu Wang, Renshen Wang, Yasuhisa Fujii, Siyang Qin, Ashok Popat, Tomas Pfister

TL;DR
This paper introduces ROPE, a novel positional encoding method that enhances graph neural networks' ability to understand reading order in document images, significantly improving entity extraction accuracy.
Contribution
ROPE provides a new reading order-aware positional encoding for GCNs, improving document information extraction by capturing the order in which words are presented in a document.
Findings
ROPE improves GCN performance by up to 8.4% F1-score.
ROPE effectively captures reading order in document graphs.
ROPE improves entity extraction (word labeling and word grouping) on the public FUNSD dataset and a large-scale payment dataset.
Abstract
Natural reading orders of words are crucial for information extraction from form-like documents. Despite recent advances in Graph Convolutional Networks (GCNs) on modeling spatial layout patterns of documents, they have limited ability to capture reading orders of given word-level node representations in a graph. We propose Reading Order Equivariant Positional Encoding (ROPE), a new positional encoding technique designed to apprehend the sequential presentation of words in documents. ROPE generates unique reading order codes for neighboring words relative to the target word given a word-level graph connectivity. We study two fundamental document entity extraction tasks including word labeling and word grouping on the public FUNSD dataset and a large-scale payment dataset. We show that ROPE consistently improves existing GCNs with a margin up to 8.4% F1-score.
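The idea of assigning reading order codes to a target word's neighbors can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `reading_order_codes`, the use of box centers, the top-to-bottom then left-to-right sort as a stand-in for reading order, and the choice of signed integer offsets as codes are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) center of a word box; y grows downward


def reading_order_codes(
    centers: Dict[int, Point],
    neighbors: Dict[int, List[int]],
    target: int,
) -> Dict[int, int]:
    """Assign each neighbor of `target` an integer code relative to the target.

    Assumption: reading order is approximated by sorting on (y, x), i.e.
    top-to-bottom, then left-to-right. The target word has implicit code 0;
    neighbors read before it get negative codes, those read after it get
    positive codes.
    """
    group = [target] + list(neighbors[target])
    ordered = sorted(group, key=lambda i: (centers[i][1], centers[i][0]))
    target_rank = ordered.index(target)
    return {
        node: rank - target_rank
        for rank, node in enumerate(ordered)
        if node != target
    }


# Hypothetical four-word layout: words 1, 0, 2 on one line, word 3 below.
centers = {0: (10.0, 0.0), 1: (0.0, 0.0), 2: (20.0, 0.0), 3: (10.0, 10.0)}
neighbors = {0: [1, 2, 3]}
codes = reading_order_codes(centers, neighbors, target=0)
```

Because the codes depend only on the relative ordering of the words, translating every box by the same offset leaves them unchanged in this sketch. In a GCN, such codes would typically be embedded and combined with the neighbor features before aggregation; how that is done is not specified in the abstract.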
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks
Methods: Graph Convolutional Networks
