SAFENLIDB: A Privacy-Preserving Safety Alignment Framework for LLM-based Natural Language Database Interfaces
Ruiheng Liu, XiaoBing Chen, Jinyu Zhang, Qiongwen Zhang, Yu Zhang, Bailong Yang

TL;DR
This paper introduces SafeNlidb, a framework that enhances privacy and security in LLM-based natural language database interfaces by enabling LLMs to generate secure SQL queries through automated reasoning and preference optimization.
Contribution
It proposes a novel privacy-security alignment framework with an automated data pipeline and optimization techniques, improving security without sacrificing utility in LLM-based NLIDB systems.
Findings
Outperforms larger LLMs and baselines in security measures
Achieves significant privacy improvements while maintaining high utility
Utilizes reasoning warm-up and preference optimization for secure SQL generation
Abstract
The rapid advancement of Large Language Models (LLMs) has driven significant progress in Natural Language Interface to Database (NLIDB). However, the widespread adoption of LLMs has raised critical privacy and security concerns. During interactions, LLMs may unintentionally expose confidential database contents or be manipulated by attackers to exfiltrate data through seemingly benign queries. While current efforts typically rely on rule-based heuristics or LLM agents to mitigate this leakage risk, these methods still struggle with complex inference-based attacks, suffer from high false positive rates, and often compromise the reliability of SQL queries. To address these challenges, we propose SafeNlidb, a novel privacy-security alignment framework for LLM-based NLIDB. The framework features an automated pipeline that generates hybrid chain-of-thought interaction data from…
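
The weaknesses the abstract attributes to rule-based heuristics can be illustrated with a toy filter. This is a minimal sketch and not the paper's method or any baseline it evaluates: the column names `salary`, `ssn`, and `pay_grade`, the `employees` table, and the function `rule_based_is_safe` are all hypothetical, chosen only to show how a keyword rule can both over-block and under-block.

```python
import re

# Hypothetical set of columns an administrator marked as sensitive.
SENSITIVE_COLUMNS = {"salary", "ssn"}


def rule_based_is_safe(sql: str) -> bool:
    """Return False if the query text mentions any sensitive column name."""
    tokens = set(re.findall(r"[a-z_]+", sql.lower()))
    return tokens.isdisjoint(SENSITIVE_COLUMNS)


# Direct access to a sensitive column: correctly blocked.
direct = "SELECT salary FROM employees WHERE name = 'Alice'"

# Company-wide aggregate that arguably reveals no individual's salary:
# blocked anyway, which is a false positive for this toy rule.
aggregate = "SELECT AVG(salary) FROM employees"

# Assumed scenario: pay_grade is not on the list but determines salary,
# so this query lets an attacker infer the protected value. The rule misses it.
inference = "SELECT pay_grade FROM employees WHERE name = 'Alice'"

for query in (direct, aggregate, inference):
    print(rule_based_is_safe(query), query)
```

A filter like this inspects only the surface text of the SQL, which is why it cannot tell a harmless aggregate from a leak or notice that an unlisted column acts as a proxy for a protected one.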
Taxonomy
Topics: Data Quality and Management · Privacy-Preserving Technologies in Data · Web Application Security Vulnerabilities
