Text2SQL is Not Enough: Unifying AI and Databases with TAG
Asim Biswal, Liana Patel, Siddarth Jha, Amog Kamsetty, Shu Liu, Joseph E. Gonzalez, Carlos Guestrin, Matei Zaharia

TL;DR
This paper introduces TAG, a unified approach to answer natural language questions over databases by combining language models and data systems, revealing current methods' limitations and proposing new research directions.
Contribution
The paper proposes Table-Augmented Generation (TAG), a novel paradigm unifying AI and databases, and develops benchmarks to evaluate its effectiveness.
Findings
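Standard methods such as Text2SQL and RAG answer no more than 20% of the benchmark queries correctly.
Existing approaches each cover a narrow class of questions: Text2SQL handles only those expressible in relational algebra, and RAG handles only point lookups over one or a few records.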
The paper provides a new benchmark for future research.
Abstract
AI systems that serve natural language questions over databases promise to unlock tremendous value. Such systems would allow users to leverage the powerful reasoning and knowledge capabilities of language models (LMs) alongside the scalable computational power of data management systems. These combined capabilities would empower users to ask arbitrary natural language questions over custom data sources. However, existing methods and benchmarks insufficiently explore this setting. Text2SQL methods focus solely on natural language questions that can be expressed in relational algebra, representing a small subset of the questions real users wish to ask. Likewise, Retrieval-Augmented Generation (RAG) considers the limited subset of queries that can be answered with point lookups to one or a few data records within the database. We propose Table-Augmented Generation (TAG), a unified and…
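To make the gap described in the abstract concrete, here is a minimal Python sketch, not the paper's implementation. It assumes a toy `movies` table with made-up revenue figures and a hypothetical `toy_lm_is_classic` function standing in for a real language model call. The question "Which romance movies, ordered by revenue, are considered classics?" has a relational part that SQL handles well (filter and sort) and a knowledge part ("classic") that is not stored in any column and so cannot be written in relational algebra alone.

```python
import sqlite3


def build_db():
    """Create an in-memory database with toy, made-up data (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE movies (title TEXT, genre TEXT, revenue REAL)")
    conn.executemany(
        "INSERT INTO movies VALUES (?, ?, ?)",
        [
            ("Titanic", "Romance", 2200.0),
            ("Casablanca", "Romance", 10.0),
            ("Generic Rom-Com", "Romance", 50.0),
            ("Action Flick", "Action", 900.0),
        ],
    )
    return conn


def run_sql(conn):
    """Relational part of the question: filter by genre, sort by revenue."""
    rows = conn.execute(
        "SELECT title FROM movies WHERE genre = ? ORDER BY revenue DESC",
        ("Romance",),
    ).fetchall()
    return [row[0] for row in rows]


def toy_lm_is_classic(title):
    """Hypothetical stand-in for a language model judgement.

    A real system would call an LM here; this hard-coded set only
    illustrates that the knowledge lives outside the table.
    """
    return title in {"Titanic", "Casablanca"}


def answer(conn):
    """Combine database execution with the LM-style judgement."""
    candidates = run_sql(conn)
    return [title for title in candidates if toy_lm_is_classic(title)]


if __name__ == "__main__":
    print(answer(build_db()))
```

The database does the scalable filtering and sorting, and the model-like step only sees the rows that survive it. A pure Text2SQL system would stop at `run_sql`, and a point-lookup retrieval system would not perform the table-wide sort.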
Taxonomy
Topics: Semantic Web and Ontologies
Methods: Focus
