Arming Data Agents with Tribal Knowledge

Shubham Agarwal; Asim Biswal; Sepanta Zeighami; Alvin Cheung; Joseph Gonzalez; Aditya G. Parameswaran

arXiv:2602.13521 · cs.DB · February 19, 2026

Open Access

TL;DR

This paper introduces Tk-Boost, a framework that enhances NL2SQL agents by providing tribal knowledge to correct misconceptions, significantly improving their accuracy on real-world databases.

Contribution

Tk-Boost is a novel bolt-on framework that identifies and addresses NL2SQL agent misconceptions through experience-based tribal knowledge augmentation.

Findings

01. Improves NL2SQL accuracy by up to 16.9% on Spider 2.0

02. Enhances performance by up to 13.7% on BIRD

03. Effective across various NL2SQL agents

Abstract

Natural language to SQL (NL2SQL) translation enables non-expert users to query relational databases through natural language. Recently, NL2SQL agents, powered by the reasoning capabilities of Large Language Models (LLMs), have significantly advanced NL2SQL translation. Nonetheless, NL2SQL agents still make mistakes when faced with large-scale real-world databases because they lack knowledge of how to correctly leverage the underlying data (e.g., knowledge about the intent of each column) and form misconceptions about the data when querying it, leading to errors. Prior work has studied generating facts about the database to provide more context to NL2SQL agents, but such approaches simply restate database contents without addressing the agent's misconceptions. In this paper, we propose Tk-Boost, a bolt-on framework for augmenting any NL2SQL agent with tribal knowledge: knowledge that…
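
To make the "bolt-on" idea concrete, here is a minimal sketch of wrapping an existing agent so that corrective notes reach it as extra context. The abstract is truncated above and does not describe the mechanism, so this is an illustration under stated assumptions and not the paper's method: the agent interface, the `Note` structure, the keyword-matching retrieval, and the `orders`/`net_revenue` schema are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical agent interface: (question, context) -> SQL string.
Agent = Callable[[str, str], str]


@dataclass
class Note:
    """One corrective note, tied to keywords that trigger it (assumed design)."""
    keywords: List[str]
    text: str


def retrieve_notes(question: str, notes: List[Note]) -> List[Note]:
    """Return the notes whose keywords occur in the question (naive matching)."""
    q = question.lower()
    return [n for n in notes if any(k.lower() in q for k in n.keywords)]


def boost(agent: Agent, notes: List[Note]) -> Agent:
    """Wrap any agent so that applicable notes are appended to its context."""
    def wrapped(question: str, context: str = "") -> str:
        hints = "\n".join(f"- {n.text}" for n in retrieve_notes(question, notes))
        full_context = f"{context}\nKnown pitfalls:\n{hints}" if hints else context
        return agent(question, full_context)
    return wrapped


def toy_agent(question: str, context: str) -> str:
    """Stand-in for an LLM agent: uses the hinted column only if it sees the hint."""
    column = "net_revenue" if "net_revenue" in context else "revenue"
    return f"SELECT SUM({column}) FROM orders"


# Hypothetical note about a made-up schema.
notes = [Note(["revenue"], "Use net_revenue; the revenue column includes refunds.")]
boosted = boost(toy_agent, notes)
```

Because the wrapper only touches the context string, the underlying agent is unchanged, which is the sense in which such a layer can sit on top of any agent. A real system would need a more robust way to decide which notes apply than substring matching.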

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Topic Modeling · Semantic Web and Ontologies