Relational In-Context Learning via Synthetic Pre-training with Structural Prior
Yanbo Wang, Jiaxuan You, Chuan Shi, Muhan Zhang

TL;DR
This paper introduces RDB-PFN, a relational foundation model pre-trained entirely on synthetic data. It adapts to real-world relational tasks through in-context learning, sidestepping the scarcity of public relational databases.
Contribution
The authors develop RDB-PFN, the first relational foundation model trained solely on synthetic data, enabling instant adaptation to new databases through in-context learning.
Findings
RDB-PFN achieves strong few-shot performance on 19 real-world tasks.
It outperforms graph-based and single-table baselines with the same input format.
The model uses a lightweight architecture and offers fast inference.
Abstract
Relational Databases (RDBs) are the backbone of modern business, yet they lack foundation models comparable to those in text or vision. A key obstacle is that high-quality RDBs are private, scarce and structurally heterogeneous, making internet-scale pre-training infeasible. To overcome this data scarcity, we introduce RDB-PFN, the first relational foundation model trained purely via synthetic data. Inspired by Prior-Data Fitted Networks (PFNs), where synthetic data generated from Structural Causal Models (SCMs) enables reasoning on single tables, we design a structural prior to create an infinite stream of diverse RDBs from scratch. Pre-training on synthetic single-table and relational tasks, RDB-PFN learns to adapt to any new database instantly via genuine in-context learning. Experiments verify RDB-PFN…
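To make the idea of SCM-generated relational data concrete, here is a minimal illustrative sketch in Python. It is not the authors' generator: the two-table schema, the functional forms, and every name (`sample_parent_table`, `sample_child_table`, `label_parents`, `sample_task`) are assumptions chosen for illustration. It samples a parent table from a tiny causal chain, a child table linked to it by a foreign key, and a per-parent label that depends on the linked child rows.

```python
import math
import random


def sample_parent_table(n_rows, rng):
    """Sample a parent table from a tiny SCM with the causal chain x1 -> x2."""
    rows = []
    for pk in range(n_rows):
        x1 = rng.gauss(0.0, 1.0)                         # exogenous root variable
        x2 = math.tanh(1.5 * x1) + rng.gauss(0.0, 0.1)   # function of x1 plus noise
        rows.append({"id": pk, "x1": x1, "x2": x2})
    return rows


def sample_child_table(parent_rows, n_rows, rng):
    """Sample child rows that each reference one parent row via a foreign key."""
    rows = []
    for pk in range(n_rows):
        parent = rng.choice(parent_rows)
        # The child feature depends causally on a feature of its parent row.
        y = 0.8 * parent["x2"] + rng.gauss(0.0, 0.1)
        rows.append({"id": pk, "parent_id": parent["id"], "y": y})
    return rows


def label_parents(parent_rows, child_rows):
    """Binary label per parent: is the mean of its linked child values positive?"""
    sums, counts = {}, {}
    for child in child_rows:
        key = child["parent_id"]
        sums[key] = sums.get(key, 0.0) + child["y"]
        counts[key] = counts.get(key, 0) + 1
    labels = {}
    for parent in parent_rows:
        n = counts.get(parent["id"], 0)
        mean = sums[parent["id"]] / n if n else 0.0      # parents without children get 0.0
        labels[parent["id"]] = int(mean > 0.0)
    return labels


def sample_task(seed, n_parents=20, n_children=60):
    """Draw one synthetic relational task; a new seed yields a new database."""
    rng = random.Random(seed)
    parents = sample_parent_table(n_parents, rng)
    children = sample_child_table(parents, n_children, rng)
    labels = label_parents(parents, children)
    return parents, children, labels


if __name__ == "__main__":
    parents, children, labels = sample_task(seed=0)
    print(len(parents), len(children), sum(labels.values()))
```

In a PFN-style setup, many such tasks would be drawn with different seeds, and for each one a model would see some labeled parent rows as context and predict the labels of the rest. The schema and mechanisms here are fixed for brevity, whereas a realistic prior would also randomize them.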
