From Schema to Signal: Retrieval-Augmented Modeling for Relational Data Analytics
Lingze Zeng, Shaofeng Cai, Changshuo Liu, Zhongle Xie, Yuncheng Wu, Beng Chin Ooi

TL;DR
This paper introduces Retrieval-Augmented Modeling (RAM), a novel framework that combines semantic attribute signals and graph structure to improve relational data analytics with deep learning.
Contribution
RAM is the first framework to integrate attribute semantics with graph structure using retrieval techniques for relational data modeling.
Findings
RAM outperforms existing methods on five real-world datasets.
The retrieval-based augmentations improve semantic relevance estimation.
The proposed architecture achieves state-of-the-art results in relational data tasks.
Abstract
Relational data stored in relational database management systems (RDBMSs) is foundational to many real-world applications across domains such as e-commerce, finance, and social networks. While deep neural networks (DNNs) have achieved strong performance on single-table tabular data, extending these models to relational databases is challenging because of the normalized multi-table structure and complex inter-table relationships. Existing approaches often rely strictly on schema-defined graphs, which overlook implicit semantic signals embedded in tuple attributes and suffer from rigid connectivity. In this work, we propose Retrieval-Augmented Modeling (RAM), a novel framework that combines graph structure with attribute semantics for relational data analytics. RAM treats tuple attributes as tokens and uses random walks to construct contextual documents, enabling the use of information retrieval techniques to estimate semantic…
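The abstract is cut off before it describes how relevance is actually scored, so the following is only a rough sketch of the general idea it names: attributes as tokens, random walks over key links to build per-tuple "documents", and a retrieval-style score between those documents. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy tuples and links, the `column=value` token format, the helper names, the walk parameters, and the choice of TF-IDF with cosine similarity as the retrieval score.

```python
import math
import random
from collections import Counter

# Hypothetical toy database: tuple id -> attribute tokens ("column=value").
TUPLES = {
    "user:1": ["city=Paris", "tier=gold"],
    "user:2": ["city=Paris", "tier=silver"],
    "order:10": ["status=shipped", "channel=web"],
    "order:11": ["status=shipped", "channel=app"],
    "item:7": ["category=books"],
    "item:8": ["category=toys"],
}

# Hypothetical foreign-key links, treated as undirected edges.
EDGES = [
    ("user:1", "order:10"),
    ("user:2", "order:11"),
    ("order:10", "item:7"),
    ("order:11", "item:8"),
]


def build_adjacency(edges):
    """Turn an edge list into an undirected adjacency map."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj


def walk_document(start, adj, tuples, walk_length, num_walks, rng):
    """Collect attribute tokens of tuples visited by random walks from `start`."""
    tokens = []
    for _ in range(num_walks):
        node = start
        tokens.extend(tuples[node])
        for _ in range(walk_length):
            neighbors = adj.get(node, [])
            if not neighbors:
                break
            node = rng.choice(neighbors)
            tokens.extend(tuples[node])
    return tokens


def build_documents(seed=0, walk_length=2, num_walks=4):
    """Build one contextual document (a token list) per tuple."""
    rng = random.Random(seed)
    adj = build_adjacency(EDGES)
    return {
        t: walk_document(t, adj, TUPLES, walk_length, num_walks, rng)
        for t in TUPLES
    }


def tfidf_vectors(docs):
    """Weight tokens by term frequency times a smoothed inverse document frequency."""
    n = len(docs)
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))
    vectors = {}
    for key, tokens in docs.items():
        tf = Counter(tokens)
        vectors[key] = {
            t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_similar(query, vectors):
    """Rank all other tuples by similarity to `query`, most similar first."""
    scores = [
        (other, cosine(vectors[query], vectors[other]))
        for other in vectors
        if other != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    vectors = tfidf_vectors(build_documents())
    for other, score in rank_similar("user:1", vectors):
        print(f"{other}\t{score:.3f}")
```

In this toy setup, `user:1` and `user:2` are never joined by a key, yet their documents share tokens such as `city=Paris`, so they receive a nonzero similarity. That is the kind of connection a purely schema-defined graph would not expose, which is the limitation the abstract points to.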
