Image-based table recognition: data, model, and evaluation
Xu Zhong, Elaheh ShafieiBavani, Antonio Jimeno Yepes

TL;DR
This paper introduces PubTabNet, a large dataset for image-based table recognition, and proposes a novel attention-based model that significantly improves recognition accuracy of complex tables from images.
Contribution
The paper presents the largest publicly available dataset for table recognition and a new attention-based encoder-dual-decoder model with a novel TEDS metric for better evaluation.
Findings
The EDD model outperforms previous methods by 9.7% in TEDS score.
PubTabNet contains 568k table images with HTML annotations.
The TEDS metric better captures multi-hop cell misalignments and OCR errors.
Abstract
Important information that relates to a specific topic in a document is often organized in tabular format to assist readers with information retrieval and comparison, which may be difficult to provide in natural language. However, tabular data in unstructured digital documents, e.g., Portable Document Format (PDF) and images, are difficult to parse into structured machine-readable format, due to complexity and diversity in their structure and style. To facilitate image-based table recognition with deep learning, we develop the largest publicly available table recognition dataset PubTabNet (https://github.com/ibm-aur-nlp/PubTabNet), containing 568k table images with corresponding structured HTML representation. PubTabNet is automatically generated by matching the XML and PDF representations of the scientific articles in PubMed Central Open Access Subset (PMCOA). We also propose a novel…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
- 🤗deepdoctection/d2_casc_rcnn_X_32xd4_50_FPN_GN_2FC_pubtabnet_c_inference_onlymodel· ♡ 2♡ 2
- 🤗deepdoctection/d2_casc_rcnn_X_32xd4_50_FPN_GN_2FC_pubtabnet_rc_inference_onlymodel· ♡ 1♡ 1
- 🤗deepdoctection/tp_casc_rcnn_X_32xd4_50_FPN_GN_2FC_pubtabnet_cmodel
- 🤗deepdoctection/tp_casc_rcnn_X_32xd4_50_FPN_GN_2FC_pubtabnet_c_inference_onlymodel
- 🤗deepdoctection/tp_casc_rcnn_X_32xd4_50_FPN_GN_2FC_pubtabnet_rcmodel
- 🤗deepdoctection/tp_casc_rcnn_X_32xd4_50_FPN_GN_2FC_pubtabnet_rc_inference_onlymodel
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsHandwritten Text Recognition Techniques · Image Retrieval and Classification Techniques · Currency Recognition and Detection
