CBR-to-SQL: Rethinking Retrieval-based Text-to-SQL using Case-based Reasoning in the Healthcare Domain
Hung Nguyen, Hans Moen, Pekka Marttinen

TL;DR
This paper introduces CBR-to-SQL, a novel framework for translating natural language questions into SQL in healthcare, improving robustness and sample efficiency over standard retrieval-augmented methods.
Contribution
The paper rethinks retrieval in Text-to-SQL by decomposing the single retrieval step into two stages inspired by Case-based Reasoning, with the aim of improving performance in the medical domain.
Findings
Achieves competitive accuracy on clinical benchmarks.
Demonstrates higher robustness under data scarcity.
Shows improved sample efficiency compared to standard RAG.
Abstract
Extracting insights from Electronic Health Record (EHR) databases often requires SQL expertise, creating a barrier for clinical decision-making and research. A promising approach is to use Large Language Models (LLMs) to translate natural language questions into SQL through Retrieval-Augmented Generation (RAG), where relevant question-SQL examples are retrieved to generate new queries via few-shot learning. However, adapting this method to the medical domain is non-trivial, as effective retrieval requires examples that align with both the logical structure of the question and its referenced entities (e.g., drug names, procedure titles). Standard single-step RAG struggles to optimize both aspects simultaneously and often relies on near-exact matches to generalize effectively. This issue is especially severe in healthcare, as questions often contain noisy and inconsistent medical jargon.…
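To make the baseline concrete, here is a minimal sketch of the standard single-step retrieval that the abstract describes: retrieve the most similar question-SQL examples, then assemble them into a few-shot prompt. It is not the paper's CBR-to-SQL method. The case base, table names, and column names are invented for illustration, and bag-of-words cosine similarity stands in for whatever retriever a real system would use.

```python
import math
import re
from collections import Counter

# Hypothetical toy case base of (question, SQL) pairs.
# Table and column names are made up for this sketch.
CASES = [
    ("How many patients were prescribed aspirin?",
     "SELECT COUNT(DISTINCT subject_id) FROM prescriptions WHERE drug = 'aspirin'"),
    ("List the procedures performed on patient 42.",
     "SELECT title FROM procedures WHERE subject_id = 42"),
    ("What is the average age of patients admitted in 2020?",
     "SELECT AVG(age) FROM admissions WHERE admit_year = 2020"),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, cases, k=2):
    """Single-step retrieval: rank stored cases by similarity to the question."""
    query = Counter(tokenize(question))
    ranked = sorted(
        cases,
        key=lambda case: cosine(query, Counter(tokenize(case[0]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, examples):
    """Format retrieved examples as few-shot demonstrations for an LLM."""
    parts = [f"Question: {q}\nSQL: {sql}" for q, sql in examples]
    parts.append(f"Question: {question}\nSQL:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    new_question = "How many patients were prescribed warfarin?"
    print(build_prompt(new_question, retrieve(new_question, CASES, k=2)))
```

In this toy setup, the similarity score mixes the question's structure and its entity mentions into one number, so swapping the drug name lowers the score of a structurally identical example. That coupling is the limitation of single-step retrieval that the abstract points to.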
