RelBench v2: A Large-Scale Benchmark and Repository for Relational Data
Justin Gu, Rishabh Ranjan, Charilaos Kanatsoulis, Haiming Tang, Martin Jurkovic, Valter Hudovernik, Mark Znidar, Pranshu Chaturvedi, Parth Shroff, Fengyu Li, Jure Leskovec

TL;DR
RelBench v2 introduces an extensive, scalable benchmark with new datasets and tasks for relational deep learning, enabling systematic evaluation of models on large, complex relational data.
Contribution
The paper expands the RelBench benchmark with four large-scale datasets, a new class of autocomplete tasks, and integration of external evaluation frameworks for broader RDL assessment.
Findings
RDL models outperform single-table baselines on various tasks.
Four new datasets bring the benchmark to 11 datasets, comprising over 22 million rows across 29 tables.
Autocomplete tasks challenge models to infer missing relational data.
Abstract
Relational deep learning (RDL) has emerged as a powerful paradigm for learning directly on relational databases by modeling entities and their relationships across multiple interconnected tables. As this paradigm evolves toward larger models and relational foundation models, scalable and realistic benchmarks are essential for enabling systematic evaluation and progress. In this paper, we introduce RelBench v2, a major expansion of the RelBench benchmark for RDL. RelBench v2 adds four large-scale relational datasets spanning scholarly publications, enterprise resource planning, consumer platforms, and clinical records, increasing the benchmark to 11 datasets comprising over 22 million rows across 29 tables. We further introduce autocomplete tasks, a new class of predictive objectives that require models to infer missing attribute values directly within relational tables while respecting…
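To make the idea of an autocomplete task concrete, here is a minimal sketch in plain Python. It is not the RelBench v2 API or the paper's method: the `orders` table, its column names, and the `autocomplete` helper are all hypothetical, and the predictor is a simple majority-vote baseline over rows that share a foreign key. It only illustrates the shape of the objective, which is filling in a missing attribute value inside a table by using related rows.

```python
from collections import Counter

# Hypothetical toy table. `customer_id` is assumed to be a foreign key
# into a customers table that is not shown here.
orders = [
    {"order_id": 10, "customer_id": 1, "channel": "web"},
    {"order_id": 11, "customer_id": 1, "channel": "web"},
    {"order_id": 12, "customer_id": 2, "channel": "store"},
    {"order_id": 13, "customer_id": 1, "channel": None},  # value to infer
    {"order_id": 14, "customer_id": 2, "channel": None},  # value to infer
]


def autocomplete(rows, target, key, id_col):
    """Predict missing `target` values with a majority-vote baseline.

    For each row where `target` is None, return the most common observed
    value among rows sharing the same `key`, falling back to the most
    common value over all observed rows. Assumes at least one row has an
    observed `target` value.
    """
    observed = [r for r in rows if r[target] is not None]
    global_mode = Counter(r[target] for r in observed).most_common(1)[0][0]

    # Count observed target values per foreign-key group.
    per_key = {}
    for r in observed:
        per_key.setdefault(r[key], Counter())[r[target]] += 1

    predictions = {}
    for r in rows:
        if r[target] is None:
            counts = per_key.get(r[key])
            predictions[r[id_col]] = (
                counts.most_common(1)[0][0] if counts else global_mode
            )
    return predictions


predictions = autocomplete(orders, "channel", "customer_id", "order_id")
print(predictions)
```

A learned RDL model would replace the majority vote with a predictor that draws on all linked tables, but the input (a row with a hidden attribute) and the output (the inferred value) keep the same form.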
