BERTgrid: Contextualized Embedding for 2D Document Representation and Understanding
Timo I. Denk, Christian Reisswig

TL;DR
BERTgrid is a novel method that encodes document layout and semantics into a grid of contextualized embeddings from BERT, improving document understanding tasks like invoice field extraction.
Contribution
It introduces BERTgrid, a new grid-based embedding representation that captures spatial and semantic information for document analysis.
Findings
Effective in extracting invoice fields
Outperforms the Chargrid baseline on invoice field extraction, framed as semantic instance segmentation
Demonstrates the importance of spatial context in document understanding
Abstract
For understanding generic documents, information like font sizes, column layout, and generally the positioning of words may carry semantic information that is crucial for solving a downstream document intelligence task. Our novel BERTgrid, which is based on Chargrid by Katti et al. (2018), represents a document as a grid of contextualized word piece embedding vectors, thereby making its spatial structure and semantics accessible to the processing neural network. The contextualized embedding vectors are retrieved from a BERT language model. We use BERTgrid in combination with a fully convolutional network on a semantic instance segmentation task for extracting fields from invoices. We demonstrate its performance on tabulated line item and document header field extraction.
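The grid representation described in the abstract can be illustrated with a minimal sketch. It assumes each word piece comes with a pixel-aligned bounding box and an embedding vector. The function name `build_grid`, the tuple layout, and the toy 4-dimensional vectors are hypothetical stand-ins chosen for illustration; in practice the vectors would come from a BERT model and would be much larger. Cells not covered by any word piece are left as zeros here, which is an assumption about how background is handled.

```python
import numpy as np


def build_grid(tokens, height, width, dim):
    """Rasterize per-token embedding vectors into an H x W x d grid.

    tokens: iterable of (x0, y0, x1, y1, vector) in pixel coordinates,
            with x1 and y1 exclusive; each vector has length `dim`.
    Cells not covered by any token stay zero (treated as background).
    """
    grid = np.zeros((height, width, dim), dtype=np.float32)
    for x0, y0, x1, y1, vec in tokens:
        # Fill every pixel inside the token's bounding box with its vector.
        grid[y0:y1, x0:x1, :] = np.asarray(vec, dtype=np.float32)
    return grid


# Toy stand-ins for contextualized word piece embeddings (not real BERT output).
tokens = [
    (0, 0, 3, 1, [1.0, 0.0, 0.0, 0.0]),
    (4, 0, 6, 1, [0.0, 1.0, 0.0, 0.0]),
]
grid = build_grid(tokens, height=2, width=8, dim=4)
```

The resulting array has one channel per embedding dimension, so it can be passed to a convolutional network in the same way as a multi-channel image.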
Taxonomy
Topics: Handwritten Text Recognition Techniques · Video Analysis and Summarization · Natural Language Processing Techniques
Methods: Linear Layer · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dense Connections · Adam · WordPiece · Softmax
