TL;DR
The paper introduces SQL Query Engine, an open-source system that translates natural language to PostgreSQL queries using a self-healing, two-stage LLM pipeline with error diagnosis and recovery.
Contribution
The paper presents a two-stage LLM pipeline for natural-language-to-SQL translation. It combines a multi-strategy response parser, a self-healing loop that diagnoses failed queries from PostgreSQL error output, and per-session schema caching.
Findings
Self-healing loop improves accuracy by up to 9.3 percentage points.
Pipeline achieves 49.0% execution accuracy on real-world benchmarks.
The system supports multiple LLM backends, and the self-healing loop introduces zero regressions.
Abstract
We present SQL Query Engine, an open-source, self-hosted service that translates natural language questions into validated PostgreSQL queries through a two-stage LLM pipeline. The first stage performs automatic schema introspection and SQL generation; a multi-strategy response parser extracts SQL from any LLM output format (JSON, code blocks, or raw text) without requiring structured output APIs. The second stage executes the query against PostgreSQL and, upon failure or empty results, enters an iterative self-healing loop in which the LLM diagnoses the error using full SQLSTATE codes and PostgreSQL diagnostic messages. Two mechanisms prevent regressions: early-accept returns successful queries immediately without LLM re-evaluation, and best-result tracking preserves the best partial result across retries. Schema context is cached per session in Redis, progress events stream via Redis…
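The two mechanisms described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names `extract_sql`, `self_heal`, `Attempt`, `execute`, and `repair`, the `"sql"` JSON key, and the rule used to rank partial results are all assumptions made for illustration. The sketch shows a parser that tries JSON, then a fenced code block, then raw text, and a retry loop that accepts the first successful non-empty result without another LLM call while remembering the best earlier attempt.

```python
import json
import re
from dataclasses import dataclass
from typing import Callable, Optional

FENCE = "`" * 3  # Markdown code-fence delimiter, built this way to keep the example readable


def extract_sql(llm_output: str) -> Optional[str]:
    """Pull a SQL string out of an LLM reply: JSON first, then a fenced block, then raw text."""
    text = llm_output.strip()
    # Strategy 1: a JSON object carrying the query (the "sql" key is an assumed convention).
    try:
        payload = json.loads(text)
        if isinstance(payload, dict) and isinstance(payload.get("sql"), str):
            return payload["sql"].strip()
    except json.JSONDecodeError:
        pass
    # Strategy 2: the first fenced code block, with or without a "sql" language tag.
    pattern = re.escape(FENCE) + r"(?:sql)?\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, re.DOTALL | re.IGNORECASE)
    if match:
        return match.group(1).strip()
    # Strategy 3: raw text, starting at the first SELECT or WITH keyword.
    match = re.search(r"\b(SELECT|WITH)\b.*", text, re.DOTALL | re.IGNORECASE)
    return match.group(0).strip() if match else None


@dataclass
class Attempt:
    sql: str
    rows: list
    error: Optional[str] = None  # e.g. SQLSTATE code plus PostgreSQL message


def self_heal(
    sql: str,
    execute: Callable[[str], Attempt],   # hypothetical: runs the query against PostgreSQL
    repair: Callable[[str, str], str],   # hypothetical: asks the LLM for a corrected query
    max_retries: int = 3,
) -> Attempt:
    """Retry a failing or empty query, never discarding a better earlier attempt."""
    best: Optional[Attempt] = None
    for attempt_no in range(max_retries + 1):
        attempt = execute(sql)
        # Early-accept: a query that ran and returned rows is returned immediately.
        if attempt.error is None and attempt.rows:
            return attempt
        # Best-result tracking (assumed ranking): a query that ran but returned
        # no rows beats one that raised an error.
        if best is None or (best.error is not None and attempt.error is None):
            best = attempt
        if attempt_no < max_retries:
            sql = repair(sql, attempt.error or "query returned no rows")
    return best
```

In use, `execute` would wrap a database driver call and `repair` would wrap a prompt that includes the failed query and its diagnostic text. Keeping both as injected callables lets the loop be exercised with stubs and no database or model.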
