TableSense: Spreadsheet Table Detection with Convolutional Neural Networks
Haoyu Dong, Shijie Liu, Shi Han, Zhouyu Fu, Dongmei Zhang

TL;DR
TableSense is an innovative CNN-based framework that accurately detects spreadsheet tables by leveraging cell features, domain-specific boundary detection, and active learning, significantly outperforming existing methods.
Contribution
The paper introduces a novel end-to-end CNN framework with specialized cell features and active learning for efficient, accurate spreadsheet table detection.
Findings
Achieved 91.3% recall and 86.5% precision in table detection
Built a training dataset with 22,176 tables from 10,220 sheets
Significantly outperforms existing detection algorithms and CNN models
Abstract
Spreadsheet table detection is the task of detecting all tables on a given sheet and locating their respective ranges. Automatic table detection is a key enabling technique and an initial step in spreadsheet data intelligence. However, the detection task is challenged by the diversity of table structures and table layouts on the spreadsheet. Considering the analogy between a cell matrix as spreadsheet and a pixel matrix as image, and encouraged by the successful application of Convolutional Neural Networks (CNN) in computer vision, we have developed TableSense, a novel end-to-end framework for spreadsheet table detection. First, we devise an effective cell featurization scheme to better leverage the rich information in each cell; second, we develop an enhanced convolutional neural network model for table detection to meet the domain-specific requirement on precise table boundary…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
