TableFormer: Table Structure Understanding with Transformers

Ahmed Nassar; Nikolaos Livathinos; Maksym Lysak; Peter Staar

arXiv:2203.01017 · cs.CV · March 14, 2022

TL;DR

This paper introduces a transformer-based model for table structure understanding that improves accuracy and handles diverse table formats without relying on OCR, advancing the state-of-the-art in table recognition.

Contribution

The paper proposes a novel table-structure identification model with a new object detection decoder and transformer decoders, outperforming previous models on complex table datasets.

Findings

01. Improved the TEDS (Tree-Edit-Distance-based Similarity) score from 91% to 98.5% on simple tables.

02. Improved the TEDS score on complex tables from 88.7% to 95%.

03. Achieved better accuracy without OCR for non-English tables.
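
For readers unfamiliar with the metric behind these numbers: TEDS, as defined in the PubTabNet work, compares the predicted and ground-truth HTML trees as 1 - EditDist(Ta, Tb) / max(|Ta|, |Tb|), where |T| is the number of nodes in a tree. The sketch below is not that metric. It is a simplified stand-in that applies the same normalization to flat lists of HTML structure tokens, using token-level Levenshtein distance instead of tree edit distance. The function names and the example token lists are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Token-level edit distance; insert, delete, and substitute each cost 1."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def sequence_similarity(pred: Sequence[str], truth: Sequence[str]) -> float:
    """1 - dist / max(len). ASSUMPTION: a flat-sequence simplification,
    not the tree edit distance used by the real TEDS metric."""
    longest = max(len(pred), len(truth))
    if longest == 0:
        return 1.0  # two empty sequences are treated as identical
    return 1.0 - levenshtein(pred, truth) / longest


# Hypothetical example: the prediction misses one cell of a two-cell row.
truth = ["<tr>", "<td>", "</td>", "<td>", "</td>", "</tr>"]
pred = ["<tr>", "<td>", "</td>", "</tr>"]
score = sequence_similarity(pred, truth)
print(f"similarity = {score:.3f}")
```

Because a flat sequence ignores nesting, this stand-in can rate two structurally different tables as closer than the tree-based metric would. It is meant only to show how the normalization turns an edit distance into a score between 0 and 1.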

Abstract

Tables organize valuable content in a concise and compact representation. This content is extremely valuable for systems such as search engines, Knowledge Graphs, etc., since it enhances their predictive capabilities. Unfortunately, tables come in a large variety of shapes and sizes. Furthermore, they can have complex column/row-header configurations, multiline rows, a variety of separation lines, missing entries, etc. As such, the correct identification of the table structure from an image is a non-trivial task. In this paper, we present a new table-structure identification model. It improves on the latest end-to-end deep learning model (i.e., the encoder-dual-decoder from PubTabNet) in two significant ways. First, we introduce a new object detection decoder for table cells. In this way, we can obtain the content of the table cells from programmatic PDFs directly from the PDF…

Taxonomy

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory