Named Entity Recognition in Industrial Tables using Tabular Language Models
Aneta Koleva, Martin Ringsquandl, Mark Buckley, Rakebul Hasan, and Volker Tresp

TL;DR
This paper applies transformer-based table models to Named Entity Recognition on industrial spreadsheet data. It addresses the scarcity of labeled data with an augmentation strategy based on domain-specific knowledge graphs, and shows that tabular structure matters for model performance.
Contribution
It introduces a domain-specific data augmentation method and highlights the significance of tabular inductive bias in transformer models for industrial NER tasks.
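The summary does not spell out how the augmentation works, so the following is only a minimal sketch of one plausible reading: labeled entity cells are swapped for other entities of the same type drawn from a knowledge graph, while the labels stay fixed. The dictionary `KG_ENTITIES`, the function `augment_table`, and the entity types are hypothetical and not taken from the paper.

```python
import random

# Hypothetical stand-in for a domain knowledge graph:
# entity type -> known surface forms of that type.
KG_ENTITIES = {
    "MOTOR": ["Motor A100", "Motor B200", "Motor C300"],
    "SENSOR": ["Temp Sensor T1", "Pressure Sensor P9"],
}


def augment_table(table, labels, kg, rng):
    """Return a new table in which every labeled cell is replaced by a
    randomly chosen entity of the same type from `kg`.

    `table` and `labels` are lists of rows with identical shape; a label of
    None (or any type missing from `kg`) leaves the cell unchanged.
    """
    new_table = []
    for row, row_labels in zip(table, labels):
        new_row = []
        for cell, label in zip(row, row_labels):
            if label in kg:
                new_row.append(rng.choice(kg[label]))  # same type, new mention
            else:
                new_row.append(cell)  # non-entity cell is kept verbatim
        new_table.append(new_row)
    return new_table


if __name__ == "__main__":
    table = [["Item", "Type"], ["Motor A100", "drive"]]
    labels = [[None, None], ["MOTOR", None]]
    print(augment_table(table, labels, KG_ENTITIES, random.Random(0)))
```

Because the labels are reused unchanged, each augmented table is a new labeled training example at no annotation cost, which is the property a low-resource setting needs.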
Findings
Table transformers outperform baselines in industrial NER.
Data augmentation considerably improves low-resource performance.
Tabular structure is crucial for model convergence.
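To make the contrast between a linearized table and a structure-aware input concrete, here is a small sketch. It is not the paper's model input format: `linearize` and `with_positions` are hypothetical helpers, and whitespace tokenization is an assumption chosen for brevity.

```python
def linearize(table):
    """Flatten a table row by row into a plain token sequence.

    Row and column membership is lost: the model sees only token order.
    """
    return [tok for row in table for cell in row for tok in cell.split()]


def with_positions(table):
    """Keep (token, row_index, column_index) triples.

    The indices could feed row and column embeddings, which is one common
    way of giving a transformer a tabular inductive bias.
    """
    return [
        (tok, r, c)
        for r, row in enumerate(table)
        for c, cell in enumerate(row)
        for tok in cell.split()
    ]


if __name__ == "__main__":
    table = [["Part", "Qty"], ["Motor A100", "2"]]
    print(linearize(table))
    print(with_positions(table))
```

In the linearized form, nothing marks "Motor" and "A100" as belonging to the same cell, or "2" as sitting under the "Qty" header. The positional form keeps both facts available to the model.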
Abstract
Specialized transformer-based models for encoding tabular data have gained interest in academia. Although tabular data is omnipresent in industry, applications of table transformers are still missing. In this paper, we study how these models can be applied to an industrial Named Entity Recognition (NER) problem where the entities are mentioned in tabular-structured spreadsheets. The highly technical nature of spreadsheets as well as the lack of labeled data present major challenges for fine-tuning transformer-based models. Therefore, we develop a dedicated table data augmentation strategy based on available domain-specific knowledge graphs. We show that this boosts performance in our low-resource scenario considerably. Further, we investigate the benefits of tabular structure as inductive bias compared to tables as linearized sequences. Our experiments confirm that a table transformer…
Taxonomy
Topics: Topic Modeling · Data Quality and Management · Information Retrieval and Search Behavior
