# A robust natural language text-to-SQL generation framework with dynamic strategies based on LLMs

**Authors:** Xiaodong Su, Yang Gu, Peng Wang, Wei Gu, Lincheng Qi, Jingwei He

PMC · DOI: 10.1038/s41598-026-39128-9 · Scientific Reports · 2026-02-09

## TL;DR

This paper introduces TriSQL, a new framework that improves the accuracy of converting natural language questions into SQL queries by adapting to question complexity.

## Contribution

TriSQL is a three-stage framework made up of a schema selector, a SQL generator, and a SQL refiner. The refiner adjusts its strategy dynamically to the complexity of the question and of the initial SQL.

## Key findings

- TriSQL achieves state-of-the-art execution accuracy on the Spider benchmark and its variants, surpassing existing LLM-based methods.
- The authors report both high efficiency and strong robustness in SQL generation.
- Dynamic strategy adjustment improves accuracy for complex questions.
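
The abstract reports these results as execution accuracy: a predicted query counts as correct when running it returns the same result as the gold query. The sketch below shows one simple way to compute that metric with Python's built-in `sqlite3` module. It is not the paper's evaluation code. The helper names (`run_query`, `execution_accuracy`, `demo`) and the toy `singer` table are assumptions made for illustration, and comparing results as multisets ignores row order, which is a simplification.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Return the result rows as a multiset, or None if the query fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return Counter(rows)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) SQL pairs whose results match."""
    hits = 0
    for predicted, gold in pairs:
        gold_rows = run_query(conn, gold)
        pred_rows = run_query(conn, predicted)
        # A prediction that fails to execute never counts as a hit.
        if gold_rows is not None and pred_rows == gold_rows:
            hits += 1
    return hits / len(pairs) if pairs else 0.0


def demo():
    # Hypothetical toy database, used only to exercise the metric.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
        INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bo', 41), (3, 'Cy', 25);
        """
    )
    pairs = [
        # Different SQL text, same result: counted as correct.
        ("SELECT count(*) FROM singer", "SELECT COUNT(id) FROM singer"),
        # Executable, but returns a different row set.
        ("SELECT name FROM singer WHERE age > 30",
         "SELECT name FROM singer WHERE age >= 30"),
        # Misspelled column: execution fails.
        ("SELECT nam FROM singer", "SELECT name FROM singer"),
    ]
    return execution_accuracy(conn, pairs)
```

Comparing results rather than SQL strings means that syntactically different but equivalent queries are both accepted, which is why the first pair in `demo` counts as a hit.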

## Abstract

Natural language text-to-SQL generation (Text2SQL) aims to translate natural language questions into executable SQL queries. Although the emergence of large language models (LLMs) has led to significant advances in this field, their performance degrades sharply as question complexity increases. A key limitation of current LLM-based methods lies in their uniform generation strategies, which fail to adapt dynamically to varying question complexity. To address this issue, we propose TriSQL, a novel three-stage framework designed to analyze question complexity and generate accurate, executable SQL. First, a Question-Guided Schema Selector uses cross attention to select the schema elements most relevant to the question. Second, a Structure-Aware SQL Generator takes both the question and the selected schema as input and employs hierarchical decoding to generate a syntactically valid initial SQL query. Finally, a Complexity-Aware SQL Refiner uses an LLM to dynamically adjust its strategy according to the complexity of the question and the initial SQL, ensuring that the final SQL is both accurate and executable. Experimental results on the Spider benchmark and its variants show that TriSQL achieves state-of-the-art execution accuracy, surpasses existing LLM-based methods, and provides both high efficiency and strong robustness.
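
The control flow of the three stages can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. Token overlap stands in for the cross-attention selector, a plain callable stands in for the hierarchical decoder, and the complexity heuristic, the three levels (`easy`, `medium`, `hard`), and the refiner strategies are all hypothetical, since this summary does not specify them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Schema = Dict[str, List[str]]  # table name -> column names


def select_schema(question: str, schema: Schema) -> Schema:
    """Stage 1 stand-in: keep tables sharing a word with the question.

    The paper uses cross attention; token overlap is only a placeholder.
    """
    words = set(question.lower().replace("?", "").split())
    selected = {}
    for table, columns in schema.items():
        names = {table.lower(), *(c.lower() for c in columns)}
        if words & names:
            selected[table] = columns
    return selected or schema  # fall back to the full schema


def estimate_complexity(question: str, sql: str) -> str:
    """Hypothetical heuristic: count constructs that tend to be hard."""
    text = sql.upper()
    keywords = ("JOIN", "GROUP BY", "UNION", "INTERSECT", "EXCEPT")
    score = sum(text.count(k) for k in keywords)
    score += max(0, text.count("SELECT") - 1)  # nested SELECTs
    score += len(question.split()) > 20        # long questions add one point
    if score == 0:
        return "easy"
    return "medium" if score == 1 else "hard"


@dataclass
class Pipeline:
    generate: Callable[[str, Schema], str]                  # stage 2 stand-in
    refiners: Dict[str, Callable[[str, str, Schema], str]]  # stage 3 strategies

    def run(self, question: str, schema: Schema) -> str:
        sub_schema = select_schema(question, schema)
        draft = self.generate(question, sub_schema)
        level = estimate_complexity(question, draft)
        # Dispatch to a refinement strategy chosen by complexity level.
        return self.refiners[level](question, draft, sub_schema)


def demo() -> str:
    schema = {"singer": ["id", "name", "age"], "concert": ["id", "venue"]}
    pipeline = Pipeline(
        generate=lambda q, s: f"SELECT count(*) FROM {next(iter(s))}",
        refiners={
            # Placeholder strategies; a real refiner would call an LLM.
            "easy": lambda q, sql, s: sql,
            "medium": lambda q, sql, s: sql + " /* one review pass */",
            "hard": lambda q, sql, s: sql + " /* decompose, then review */",
        },
    )
    return pipeline.run("How many singer rows are there?", schema)
```

Keeping the refiners in a dictionary keyed by complexity level makes the dispatch explicit: simple drafts can pass through unchanged, while harder ones are routed to a more expensive strategy.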

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12953869/full.md

## Figures

15 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12953869/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/PMC12953869/full.md

---
Source: https://tomesphere.com/paper/PMC12953869