Tackling prediction tasks in relational databases with LLMs
Marek Wydmuch, Łukasz Borchmann, Filip Graliński

TL;DR
This paper applies large language models (LLMs) to predictive tasks in relational databases and shows, on the RelBench benchmark, that even a straightforward application performs competitively, establishing LLMs as a promising new baseline for ML in this domain.
Contribution
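The paper demonstrates that LLMs can handle predictive tasks on relational databases, challenging the assumption that interconnected tables, complex relationships, and heterogeneous data types put such tasks out of their reach. It also provides a new baseline for future research.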
Findings
Even a straightforward application of LLMs achieves competitive performance on RelBench tasks.
Interconnected tables, complex relationships, and heterogeneous data types do not prevent LLMs from reaching that performance.
LLMs are a promising new baseline for machine learning on relational databases.
Abstract
Though large language models (LLMs) have demonstrated exceptional performance across numerous problems, their application to predictive tasks in relational databases remains largely unexplored. In this work, we address the notion that LLMs cannot yield satisfactory results on relational databases due to their interconnected tables, complex relationships, and heterogeneous data types. Using the recently introduced RelBench benchmark, we demonstrate that even a straightforward application of LLMs achieves competitive performance on these tasks. These findings establish LLMs as a promising new baseline for ML on relational databases and encourage further research in this direction.
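The abstract does not spell out what a "straightforward application" looks like. One plausible reading, offered here only as a minimal sketch and not as the authors' method, is to flatten a target row together with its linked rows into text and ask the model a question about it. The toy tables, the `serialize_row` and `build_prompt` helpers, and the prompt wording below are all hypothetical, and the call to an actual LLM is left out.

```python
from typing import Any

# Hypothetical toy database: two tables linked by the foreign key customer_id.
customers = [
    {"customer_id": 1, "name": "Ada", "signup_date": "2023-01-15"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 25.0, "date": "2023-02-01"},
    {"order_id": 11, "customer_id": 1, "amount": 40.5, "date": "2023-03-12"},
    {"order_id": 12, "customer_id": 2, "amount": 9.99, "date": "2023-03-20"},
]


def serialize_row(row: dict[str, Any]) -> str:
    """Render one row as 'column=value' pairs, whatever the value types are."""
    return ", ".join(f"{key}={value}" for key, value in row.items())


def build_prompt(
    entity: dict[str, Any],
    linked_rows: list[dict[str, Any]],
    fk: str,
    question: str,
    max_linked: int = 5,
) -> str:
    """Build a text prompt from an entity row and the rows that reference it."""
    # Follow the foreign key: keep only rows that point at this entity.
    related = [row for row in linked_rows if row[fk] == entity[fk]]
    # Most recent first (ISO dates sort correctly as strings),
    # truncated so the prompt stays short.
    related.sort(key=lambda row: row["date"], reverse=True)
    lines = ["Customer: " + serialize_row(entity), "Recent orders:"]
    lines += ["- " + serialize_row(row) for row in related[:max_linked]]
    lines.append("Question: " + question)
    return "\n".join(lines)


prompt = build_prompt(
    customers[0],
    orders,
    fk="customer_id",
    question="Will this customer place an order in the next 30 days? Answer yes or no.",
)
print(prompt)
```

The resulting string would be sent to an LLM of choice and the answer parsed into a prediction. Which linked tables to include, how many rows to keep, and how to phrase the question are design choices that this sketch fixes arbitrarily.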
Taxonomy
Topics: Semantic Web and Ontologies · Advanced Database Systems and Queries · Data Mining Algorithms and Applications
