4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on Relational DBs
Minjie Wang, Quan Gan, David Wipf, Zhenkun Cai, Ning Li, Jianheng Tang, Yanlin Zhang, Zizhao Zhang, Zunyao Mao, Yakun Song, Yanbo Wang, Jiahang Li, Han Zhang, Guang Yang, Xiao Qin, Chuan Lei, Muhan Zhang, Weinan Zhang, Christos Faloutsos, Zheng Zhang

TL;DR
This paper introduces 4DBInfer, a comprehensive benchmarking toolbox for graph-centric predictive modeling on relational databases, addressing the lack of public RDB benchmarks and exploring strategies for converting multi-table data into graph representations.
Contribution
It presents a new open-source toolbox, 4DBInfer, for benchmarking graph-based models on RDBs, and provides a diverse collection of datasets and tasks for evaluation.
Findings
The experiments highlight the importance of each exploration dimension in RDB modeling.
Naive approaches like simple table joins are limited in effectiveness.
Converting multi-table data into graphs with careful strategies improves predictive performance.
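As a rough illustration of what converting multi-table data into a graph can mean, the sketch below turns each row into a node and each foreign-key reference into an edge. It is only a minimal sketch under stated assumptions: the two-table schema, the `rows_to_graph` helper, and its arguments are hypothetical and are not taken from the 4DBInfer toolbox, which may use different and more careful extraction strategies.

```python
from collections import defaultdict

# Toy two-table RDB (hypothetical schema, not one of the benchmark datasets).
users = [
    {"user_id": 1, "country": "US"},
    {"user_id": 2, "country": "DE"},
]
purchases = [
    {"purchase_id": 10, "user_id": 1, "amount": 5.0},
    {"purchase_id": 11, "user_id": 1, "amount": 7.5},
    {"purchase_id": 12, "user_id": 2, "amount": 3.0},
]


def rows_to_graph(tables, primary_keys, foreign_keys):
    """Build a heterogeneous graph: one node per row, one edge per FK reference.

    tables:       {table_name: list of row dicts}
    primary_keys: {table_name: primary-key column}
    foreign_keys: list of (child_table, fk_column, parent_table)
    """
    # Node id is (table, primary-key value); features are the remaining columns.
    nodes = {}
    for name, rows in tables.items():
        pk = primary_keys[name]
        for row in rows:
            nodes[(name, row[pk])] = {k: v for k, v in row.items() if k != pk}

    # Edge type is (child_table, fk_column, parent_table).
    edges = defaultdict(list)
    for child, fk_col, parent in foreign_keys:
        pk = primary_keys[child]
        for row in tables[child]:
            src = (child, row[pk])
            dst = (parent, row[fk_col])
            if dst in nodes:  # skip dangling references
                edges[(child, fk_col, parent)].append((src, dst))
    return nodes, dict(edges)


nodes, edges = rows_to_graph(
    tables={"users": users, "purchases": purchases},
    primary_keys={"users": "user_id", "purchases": "purchase_id"},
    foreign_keys=[("purchases", "user_id", "users")],
)
```

Unlike a plain join, which would copy each user's columns onto every purchase row, this representation keeps one node per row and leaves the one-to-many structure explicit in the edge list, so a downstream model can choose how to aggregate over neighbors.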
Abstract
Although RDBs store vast amounts of rich, informative data spread across interconnected tables, the progress of predictive machine learning models as applied to such tasks arguably falls well behind advances in other domains such as computer vision or natural language processing. This deficit stems, at least in part, from the lack of established/public RDB benchmarks as needed for training and evaluation purposes. As a result, related model development thus far often defaults to tabular approaches trained on ubiquitous single-table benchmarks, or on the relational side, graph-based alternatives such as GNNs applied to a completely different set of graph datasets devoid of tabular characteristics. To more precisely target RDBs lying at the nexus of these two complementary regimes, we explore a broad class of baseline models predicated on: (i) converting multi-table datasets into graphs…
Taxonomy
Topics: Advanced Graph Neural Networks · Bioinformatics and Genomic Networks · Graph Theory and Algorithms
Methods: Sparse Evolutionary Training
