Financial Table Extraction in Image Documents
William Watson, Bo Liu

TL;DR
This paper introduces an end-to-end deep learning pipeline that identifies, extracts, and transcribes financial tables from image documents while preserving their original spatial relations, a longstanding challenge in financial document analysis.
Contribution
The paper presents an integrated approach that combines image segmentation, OCR, and sequence modeling to extract financial tables from images with high fidelity.
Findings
High accuracy in table detection and extraction
Preserves original spatial relations effectively
Applicable to various financial document formats
Abstract
Table extraction has long been a pervasive problem in financial services. It is more challenging in the image domain, where content is locked behind a cumbersome pixel format. Luckily, advances in deep learning for image segmentation, OCR, and sequence modeling provide the necessary heavy lifting to achieve impressive results. This paper presents an end-to-end pipeline for identifying, extracting, and transcribing tabular content in image documents while retaining the original spatial relations with high fidelity.
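To make "retaining the original spatial relations" concrete, here is a minimal Python sketch of one way to lay OCR output back out as a text grid. It is not the authors' method: the abstract gives no implementation details, so the `Word` record, the row-grouping tolerance `tol`, and the `px_per_char` scale are all hypothetical choices made for illustration. The sketch assumes an OCR step has already produced word-level bounding boxes inside a detected table region.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical OCR result: text plus a pixel bounding box."""
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def center_y(word):
    return word.y + word.h / 2


def group_rows(words, tol=0.5):
    """Group words into rows by vertical centre.

    A word joins the current row when its centre lies within
    tol * height of the row's mean centre (an assumed heuristic).
    """
    rows = []
    for word in sorted(words, key=center_y):
        if rows:
            current = rows[-1]
            mean_cy = sum(center_y(other) for other in current) / len(current)
            if abs(center_y(word) - mean_cy) <= tol * word.h:
                current.append(word)
                continue
        rows.append([word])
    # Within each row, read left to right.
    return [sorted(row, key=lambda item: item.x) for row in rows]


def render(rows, px_per_char=10.0):
    """Place each word at a character column proportional to its x position."""
    lines = []
    for row in rows:
        line = ""
        for word in row:
            col = int(round(word.x / px_per_char))
            # Keep at least one space between neighbouring words.
            if line and col <= len(line):
                col = len(line) + 1
            line = line.ljust(col) + word.text
        lines.append(line)
    return "\n".join(lines)


# Made-up sample boxes for a two-row table.
sample = [
    Word("Revenue", 0, 0, 70, 12), Word("2019", 150, 1, 40, 12),
    Word("2020", 250, 0, 40, 12), Word("Sales", 0, 20, 50, 12),
    Word("100", 150, 21, 30, 12), Word("120", 250, 20, 30, 12),
]
print(render(group_rows(sample)))
```

Mapping pixel x-coordinates to character columns keeps values under their headers even when OCR returns the words in arbitrary order. A real pipeline would need to handle skew, multi-line cells, and variable font sizes, which this sketch ignores.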
