SALT: Sales Autocompletion Linked Business Tables Dataset
Tassilo Klein, Clemens Biehl, Margarida Costa, Andre Sres, Jonas Kolk, Johannes Hoffart

TL;DR
This paper introduces SALT, a dataset of linked enterprise tables from an ERP system, aimed at advancing research in table representation learning for real-world business applications.
Contribution
It provides the first curated dataset of extensive linked enterprise tables to facilitate research in structured data modeling and business intelligence.
Findings
Dataset enables new research in linked table modeling.
Supports development of models for enterprise data analysis.
Enhances applicability of AI in business contexts.
Abstract
Foundation models, particularly those that incorporate Transformer architectures, have demonstrated exceptional performance in domains such as natural language processing and image processing. Adapting these models to structured data, such as tables, however, introduces significant challenges. These difficulties are even more pronounced for multi-table data linked via foreign keys, which is prevalent in the enterprise realm and crucial for empowering business use cases. Despite its substantial impact, research on such linked business tables within enterprise settings remains an important yet underexplored domain. To address this, we introduce a curated dataset sourced from an Enterprise Resource Planning (ERP) system, featuring extensive linked tables. This dataset is specifically designed to support research endeavors in table representation learning. By…
Taxonomy
Topics: Semantic Web and Ontologies · Data Mining Algorithms and Applications · Advanced Database Systems and Queries
Methods: Attention Is All You Need · Absolute Position Encodings · Softmax · Linear Layer · Adam · Residual Connection · Dropout · Multi-Head Attention · Position-Wise Feed-Forward Layer · Label Smoothing
