Arming Data Agents with Tribal Knowledge
Shubham Agarwal, Asim Biswal, Sepanta Zeighami, Alvin Cheung, Joseph Gonzalez, Aditya G. Parameswaran

TL;DR
This paper introduces Tk-Boost, a framework that enhances NL2SQL agents by providing tribal knowledge to correct misconceptions, significantly improving their accuracy on real-world databases.
Contribution
Tk-Boost is a novel bolt-on framework that identifies and addresses NL2SQL agent misconceptions through experience-based tribal knowledge augmentation.
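To make the "bolt-on" idea concrete, here is a minimal sketch of how an existing agent could be wrapped so that relevant knowledge notes are attached to each question. It is an illustration only, not the paper's implementation: the names `KnowledgeNote`, `retrieve_notes`, and `boost`, the keyword-overlap retrieval, and the prompt layout are all assumptions introduced for this example, and the agent is modeled as any callable that maps a question string to an output string.

```python
import re
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class KnowledgeNote:
    """Hypothetical tribal-knowledge entry: trigger keywords plus a corrective hint."""
    keywords: FrozenSet[str]
    text: str


def retrieve_notes(question: str, notes: List[KnowledgeNote], top_k: int = 2) -> List[KnowledgeNote]:
    """Return up to top_k notes sharing at least one keyword with the question.

    Keyword overlap is a stand-in for whatever retrieval a real system uses.
    """
    words = set(re.findall(r"[a-z0-9_]+", question.lower()))
    scored = [(len(note.keywords & words), idx, note) for idx, note in enumerate(notes)]
    relevant = [item for item in scored if item[0] > 0]
    # Highest overlap first; ties keep the original note order.
    relevant.sort(key=lambda item: (-item[0], item[1]))
    return [note for _, _, note in relevant[:top_k]]


def boost(agent: Callable[[str], str], notes: List[KnowledgeNote], top_k: int = 2) -> Callable[[str], str]:
    """Wrap an existing agent without modifying it (the assumed 'bolt-on' shape)."""
    def wrapped(question: str) -> str:
        hints = retrieve_notes(question, notes, top_k)
        if not hints:
            return agent(question)  # nothing relevant: pass the question through unchanged
        hint_block = "\n".join(f"- {note.text}" for note in hints)
        return agent(f"{question}\n\nKnown pitfalls for this database:\n{hint_block}")
    return wrapped


# Made-up example notes; the column names and meanings are invented for illustration.
EXAMPLE_NOTES = [
    KnowledgeNote(frozenset({"revenue", "sales"}), "Column amt stores cents, not dollars."),
    KnowledgeNote(frozenset({"active", "users"}), "status = 1 marks an active user."),
]
```

Because the wrapper only touches the input string, the same notes could in principle be reused with any agent that accepts a text question, which is the property the word "bolt-on" suggests.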
Findings
Improves NL2SQL accuracy by up to 16.9% on Spider 2.0
Enhances performance by up to 13.7% on BIRD
Effective across various NL2SQL agents
Abstract
Natural language to SQL (NL2SQL) translation enables non-expert users to query relational databases through natural language. Recently, NL2SQL agents, powered by the reasoning capabilities of Large Language Models (LLMs), have significantly advanced NL2SQL translation. Nonetheless, NL2SQL agents still make mistakes when faced with large-scale real-world databases because they lack knowledge of how to correctly leverage the underlying data (e.g., knowledge about the intent of each column) and form misconceptions about the data when querying it, leading to errors. Prior work has studied generating facts about the database to provide more context to NL2SQL agents, but such approaches simply restate database contents without addressing the agent's misconceptions. In this paper, we propose Tk-Boost, a bolt-on framework for augmenting any NL2SQL agent with tribal knowledge: knowledge that…
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Topic Modeling · Semantic Web and Ontologies
