You Only Read Once (YORO): Learning to Internalize Database Knowledge for Text-to-SQL
Hideo Kobayashi, Wuwei Lan, Peng Shi, Shuaichen Chang, Jiang Guo, Henghui Zhu, Zhiguo Wang, Patrick Ng

TL;DR
YORO introduces a method that internalizes database knowledge into a text-to-SQL model during training, greatly reducing inference costs and maintaining competitive performance, especially on large databases and complex queries.
Contribution
YORO is presented as a novel paradigm that internalizes database knowledge into the model's parameters during training. This eliminates repeated schema encoding during inference, which improves efficiency while keeping performance competitive.
Findings
Reduces input token length by 66%-98%
Achieves competitive results on three benchmarks
Outperforms traditional methods on large databases
Abstract
While significant progress has been made on the text-to-SQL task, recent solutions repeatedly encode the same database schema for every question, resulting in unnecessarily high inference cost and often overlooking crucial database knowledge. To address these issues, we propose You Only Read Once (YORO), a novel paradigm that directly internalizes database knowledge into the parametric knowledge of a text-to-SQL model during training and eliminates the need for schema encoding during inference. YORO significantly reduces the input token length by 66%-98%. Despite its shorter inputs, our empirical results demonstrate that YORO performs competitively with traditional systems on three benchmarks and significantly outperforms them on large databases. Furthermore, YORO excels in handling questions with challenging value retrievals, such as abbreviations.
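To make the input-length claim concrete, the sketch below compares a conventional prompt that serializes the schema with every question against a question-only prompt. It is a toy illustration, not the authors' implementation: the serialization format, the function names, the example schema, and the whitespace-based token count (a stand-in for a real tokenizer) are all assumptions made here.

```python
from typing import Dict, List


def build_schema_prompt(schema: Dict[str, List[str]], question: str) -> str:
    """Conventional input: the full schema is re-serialized for every question.

    The "table: col, col | ..." format is a hypothetical choice for illustration.
    """
    tables = " | ".join(
        f"{table}: {', '.join(columns)}" for table, columns in schema.items()
    )
    return f"schema: {tables} question: {question}"


def build_question_only_prompt(question: str) -> str:
    """Schema-free input: the model is assumed to already hold the schema
    in its parameters, so only the question is passed at inference time."""
    return f"question: {question}"


def count_tokens(text: str) -> int:
    """Whitespace token count; a crude stand-in for a real subword tokenizer."""
    return len(text.split())


def token_reduction(full_prompt: str, short_prompt: str) -> float:
    """Fraction of input tokens saved by the shorter prompt."""
    full = count_tokens(full_prompt)
    if full == 0:
        return 0.0
    return 1.0 - count_tokens(short_prompt) / full


if __name__ == "__main__":
    # Hypothetical two-table schema; real databases are often far larger,
    # which is where the savings grow.
    schema = {
        "singer": ["singer_id", "name", "country", "age"],
        "concert": ["concert_id", "concert_name", "stadium_id", "year"],
    }
    question = "How many singers are from France?"
    full = build_schema_prompt(schema, question)
    short = build_question_only_prompt(question)
    print(f"schema prompt tokens:   {count_tokens(full)}")
    print(f"question-only tokens:   {count_tokens(short)}")
    print(f"fractional reduction:   {token_reduction(full, short):.2f}")
```

Because the schema portion of the prompt grows with the number of tables and columns while the question does not, the saving in this sketch grows with database size, which is consistent with the direction of the reported 66%-98% range.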
Taxonomy
Topics: Advanced Database Systems and Queries · Semantic Web and Ontologies · Scientific Computing and Data Management
