CARTE: Pretraining and Transfer for Tabular Learning
Myung Jun Kim, Léo Grinsztajn, and Gaël Varoquaux

TL;DR
CARTE is a neural architecture for tabular data that can be pretrained and transferred without matched entries or schemas. It outperforms tree-based models in benchmarks and supports joint learning across unmatched tables.
Contribution
The paper presents CARTE, a novel graph-based neural model that handles unmatched tabular data for pretraining and transfer learning, overcoming key challenges in schema and entity matching.
Findings
CARTE outperforms traditional tree-based models in benchmarks.
It enables joint learning across unmatched tables.
Pretraining improves learning efficiency on tabular data.
Abstract
Pretrained deep-learning models are the go-to solution for images or text. However, for tabular data the standard is still to train tree-based models. Indeed, transfer learning on tables hits the challenge of data integration: finding correspondences in the entries (entity matching), where different words may denote the same entity, and correspondences across columns (schema matching), which may come in different orders, names... We propose a neural architecture that does not need such correspondences. As a result, we can pretrain it on background data that has not been matched. The architecture -- CARTE for Context Aware Representation of Table Entries -- uses a graph representation of tabular (or relational) data to process tables with different columns, string embedding of entries and column names to model an open vocabulary, and a graph-attentional network to…
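The following is a minimal sketch of how a graph representation with string embeddings might let tables with different columns share one input format. It is not the authors' implementation. It assumes each row becomes a small star-shaped graph with cell values as nodes and column names as edge labels, and it uses hashed character n-grams as a toy stand-in for a real string-embedding model. The names `embed_string` and `row_to_graph` are hypothetical, and numerical entries are simply stringified for brevity.

```python
import hashlib
import math


def embed_string(text, dim=16, n=3):
    """Toy string embedding (assumption): hashed character n-grams, L2-normalised.

    A real system would use a pretrained string or language model instead.
    """
    vec = [0.0] * dim
    padded = f"<{text.lower()}>"
    for i in range(max(1, len(padded) - n + 1)):
        gram = padded[i:i + n]
        digest = hashlib.md5(gram.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0  # bucket the n-gram by its hash
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def row_to_graph(row, dim=16):
    """Turn one table row (dict: column name -> value) into a star graph.

    Node 0 is a placeholder center node for the row itself. Every non-missing
    cell becomes a node linked to the center by an edge that carries the
    embedded column name, so no fixed schema is required.
    """
    nodes = [{"label": "<row>", "feature": [0.0] * dim}]
    edges = []
    for column, value in row.items():
        if value is None:
            continue  # a missing cell simply produces no node
        nodes.append({"label": str(value), "feature": embed_string(str(value), dim)})
        edges.append({
            "source": 0,
            "target": len(nodes) - 1,
            "label": column,
            "feature": embed_string(column, dim),
        })
    return {"nodes": nodes, "edges": edges}


# Two rows from tables with different, unmatched schemas.
graph_a = row_to_graph({"name": "Paris", "country": "France"})
graph_b = row_to_graph({"city": "Lyon", "population": 500000, "mayor": None})
```

Both graphs use the same node and edge feature size even though the source tables share no column names, which is the property that allows a single graph network to consume rows from unmatched tables.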
Taxonomy
Topics: Data Quality and Management · Topic Modeling · Advanced Graph Neural Networks
Methods: Sparse Evolutionary Training · Attentive Walk-Aggregating Graph Neural Network
