Metric-Type Identification for Multi-Level Header Numerical Tables in Scientific Papers
Lya Hulliyyatus Suadaa, Hidetaka Kamigaito, Manabu Okumura, Hiroya Takamura

TL;DR
This paper introduces a new task of metric-type identification in multi-level header numerical tables from scientific papers, along with a dataset and neural models that improve understanding of table metrics.
Contribution
It presents the first dataset and neural models for metric-type identification in multi-level header tables, combining classification and generation techniques.
Findings
Joint models effectively identify metric-types within and outside headers.
BERT-based models outperform pointer-generator models.
The approach enhances automatic table understanding in scientific literature.
Abstract
Numerical tables are widely used to present experimental results in scientific papers. For table understanding, a metric-type is essential to discriminate numbers in the tables. We introduce a new information extraction task, metric-type identification from multi-level header numerical tables, and provide a dataset extracted from scientific papers consisting of header tables, captions, and metric-types. We then propose two joint-learning neural classification and generation schemes featuring pointer-generator-based and BERT-based models. Our results show that the joint models can handle both in-header and out-of-header metric-type identification problems.
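To make the in-header versus out-of-header distinction concrete, here is a minimal sketch of how one example might be represented and categorized. It assumes that "in-header" means the metric-type string appears verbatim in some header cell, and "out-of-header" means it does not and must be produced another way (for instance, generated from the caption). The `TableExample` class, the `locate_metric` function, and the toy tables are hypothetical illustrations, not the paper's dataset format or models.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TableExample:
    """Hypothetical container for one table: caption, headers, gold metric-type."""
    caption: str
    # One entry per column; each entry lists that column's header cells, top level first.
    column_headers: List[List[str]]
    metric_type: str


def locate_metric(example: TableExample) -> Tuple[str, Optional[Tuple[int, int]]]:
    """Return ("in-header", (column, level)) if the metric-type matches a header
    cell exactly (case-insensitive), otherwise ("out-of-header", None)."""
    target = example.metric_type.lower()
    for col, levels in enumerate(example.column_headers):
        for level, cell in enumerate(levels):
            if cell.lower() == target:
                return "in-header", (col, level)
    return "out-of-header", None


# Toy example: the metric name sits in the second header level.
in_header = TableExample(
    caption="Results on the test set.",
    column_headers=[["Model"], ["Dev", "F1"], ["Test", "F1"]],
    metric_type="F1",
)

# Toy example: the metric name appears only in the caption.
out_of_header = TableExample(
    caption="BLEU scores for each language pair.",
    column_headers=[["Model"], ["En-De", "dev"], ["En-De", "test"]],
    metric_type="BLEU",
)

print(locate_metric(in_header))
print(locate_metric(out_of_header))
```

Under this framing, an in-header case can be treated as selecting a header cell (classification), while an out-of-header case requires producing a string that is absent from the headers (generation), which is one way to read the abstract's motivation for joint classification and generation models.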
Taxonomy
Topics: Handwritten Text Recognition Techniques · Advanced Text Analysis Techniques · Data Quality and Management
