LatentRefusal: Latent-Signal Refusal for Unanswerable Text-to-SQL Queries
Xuancheng Ren, Shijing Hu, Zhihui Lu, Jiangqi Huang, Qiang Duan

TL;DR
LatentRefusal is a latent-signal refusal mechanism for text-to-SQL systems. It identifies unanswerable queries from a model's intermediate activations, improving safety with minimal overhead.
Contribution
The paper formalizes safe refusal in text-to-SQL systems as an answerability-gating problem and proposes the Tri-Residual Gated Encoder, a lightweight probing architecture that detects unanswerable queries from hidden model states.
Findings
Achieves an average F1 of 88.5% across four benchmarks.
Adds only about 2 milliseconds of probe overhead.
Detects unanswerable queries from hidden activations rather than from output-level instruction following or output-uncertainty estimation.
Abstract
In LLM-based text-to-SQL systems, unanswerable and underspecified user queries may generate not only incorrect text but also executable programs that yield misleading results or violate safety constraints, posing a major barrier to safe deployment. Existing refusal strategies for such queries either rely on output-level instruction following, which is brittle due to model hallucinations, or estimate output uncertainty, which adds complexity and overhead. To address this challenge, we formalize safe refusal in text-to-SQL systems as an answerability-gating problem and propose LatentRefusal, a latent-signal refusal mechanism that predicts query answerability from intermediate hidden activations of a large language model. We introduce the Tri-Residual Gated Encoder, a lightweight probing architecture, to suppress schema noise and amplify sparse, localized cues of question-schema mismatch…
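The answerability-gating idea in the abstract can be illustrated with a minimal sketch. This is not the paper's Tri-Residual Gated Encoder: it swaps in a plain mean-pooled logistic probe, and every name here (`mean_pool`, `probe_score`, `gate`, the toy weights, the 0.5 threshold) is a hypothetical stand-in chosen for illustration. It assumes per-token hidden activations have already been extracted from the language model.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mean_pool(hidden_states: Sequence[Vector]) -> List[float]:
    """Average per-token activation vectors into a single vector."""
    n_tokens = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(tok[i] for tok in hidden_states) / n_tokens for i in range(dim)]


def probe_score(pooled: Vector, weights: Vector, bias: float) -> float:
    """Logistic probe: estimated probability that the query is answerable."""
    z = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def gate(hidden_states: Sequence[Vector], weights: Vector, bias: float,
         threshold: float = 0.5) -> str:
    """Decide whether to generate SQL or refuse, based on the probe score."""
    p_answerable = probe_score(mean_pool(hidden_states), weights, bias)
    return "GENERATE_SQL" if p_answerable >= threshold else "REFUSE"


# Toy, hand-picked probe parameters (assumption: a real probe would be trained).
weights = [1.0, -1.0]
bias = 0.0

# Toy activations for two tokens each, not taken from any real model.
answerable_like = [[2.0, 0.0], [1.0, 0.0]]
unanswerable_like = [[0.0, 2.0], [0.0, 1.0]]

decision_a = gate(answerable_like, weights, bias)
decision_b = gate(unanswerable_like, weights, bias)
```

In this sketch the gate runs before any SQL is emitted, so a refusal never depends on the model's generated text. The threshold trades off false refusals against unsafe generations and would need tuning on held-out data.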
