TL;DR
DenTab is a new dataset with 2,000 real-world dental estimate tables designed to evaluate table recognition and visual question answering, highlighting current models' limitations and proposing a deterministic arithmetic reasoning pipeline.
Contribution
Introduces DenTab, a challenging dataset for table recognition and visual QA in noisy real-world settings, and proposes the Table Router Pipeline for reliable arithmetic reasoning.
Findings
Strong structure recovery does not ensure accurate multi-step reasoning.
Models struggle with arithmetic and consistency questions even with ground-truth tables.
The Table Router Pipeline improves arithmetic reliability without additional training.
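A minimal sketch of the kind of deterministic arithmetic check these findings point to: the amounts are summed in ordinary code rather than by a model. It is not the paper's Table Router Pipeline, whose internals are not described in this summary. The table layout, the "Total" row label, and the names `TableRows`, `parse_amount`, and `check_total` are assumptions made for illustration, and merged cells are not handled.

```python
from decimal import Decimal
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, row by row (no span handling)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


def parse_amount(text):
    """Parse a money string such as '1,250.00' into an exact Decimal."""
    return Decimal(text.replace(",", "").replace("$", "").strip())


def check_total(html, total_label="Total"):
    """Return (sum of line items, stated total, whether they match)."""
    parser = TableRows()
    parser.feed(html)
    body = parser.rows[1:]  # assumption: the first row is the header
    items = [parse_amount(r[-1]) for r in body if r[0] != total_label]
    stated = next(parse_amount(r[-1]) for r in body if r[0] == total_label)
    computed = sum(items, Decimal("0"))
    return computed, stated, computed == stated


# Hypothetical estimate table; the values are made up for illustration.
EXAMPLE_HTML = """
<table>
  <tr><th>Procedure</th><th>Fee</th></tr>
  <tr><td>Exam</td><td>85.00</td></tr>
  <tr><td>Crown</td><td>1,250.00</td></tr>
  <tr><td>Total</td><td>1,335.00</td></tr>
</table>
"""

computed, stated, ok = check_total(EXAMPLE_HTML)
```

Using `Decimal` instead of floats keeps currency sums exact, so a mismatch between the computed and stated totals reflects the table contents rather than rounding.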
Abstract
Tables condense key transactional and administrative information into compact layouts, but practical extraction requires more than text recognition: systems must also recover structure (rows, columns, merged cells, headers) and interpret roles such as line items, subtotals, and totals under common capture artifacts. Many existing resources for table structure recognition and TableVQA are built from clean digital-born sources or rendered tables, and therefore only partially reflect noisy administrative conditions. We introduce DenTab, a dataset of 2,000 cropped table images from dental estimates with high-quality HTML annotations, enabling evaluation of table recognition (TR) and table visual question answering (TableVQA) on the same inputs. DenTab includes 2,208 questions across eleven categories spanning retrieval, aggregation, and logic/consistency checks. We benchmark 16…
