SAFENLIDB: A Privacy-Preserving Safety Alignment Framework for LLM-based Natural Language Database Interfaces

Ruiheng Liu; XiaoBing Chen; Jinyu Zhang; Qiongwen Zhang; Yu Zhang; Bailong Yang

arXiv:2511.06778·cs.CL·November 12, 2025

TL;DR

This paper introduces SafeNlidb, a framework that enhances privacy and security in LLM-based natural language database interfaces by enabling LLMs to generate secure SQL queries through automated reasoning and preference optimization.

Contribution

The paper proposes a privacy-security alignment framework for LLM-based NLIDB systems, combining an automated data pipeline with preference optimization to improve security without sacrificing utility.

Findings

01 Outperforms larger LLMs and baselines in security measures

02 Achieves significant privacy improvements while maintaining high utility

03 Utilizes reasoning warm-up and preference optimization for secure SQL generation

Abstract

The rapid advancement of Large Language Models (LLMs) has driven significant progress in Natural Language Interface to Database (NLIDB). However, the widespread adoption of LLMs has raised critical privacy and security concerns. During interactions, LLMs may unintentionally expose confidential database contents or be manipulated by attackers to exfiltrate data through seemingly benign queries. While current efforts typically rely on rule-based heuristics or LLM agents to mitigate this leakage risk, these methods still struggle with complex inference-based attacks, suffer from high false positive rates, and often compromise the reliability of SQL queries. To address these challenges, we propose SafeNlidb, a novel privacy-security alignment framework for LLM-based NLIDB. The framework features an automated pipeline that generates hybrid chain-of-thought interaction data from…
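To make the abstract's criticism of rule-based heuristics concrete, the following is a minimal sketch of a keyword-blocklist guard of the kind such systems might use. It is not taken from the paper: the column names, the example queries, and the `rule_based_guard` function are all hypothetical, chosen only to illustrate how a blocklist can both over-block and under-block.

```python
import re

# Hypothetical set of sensitive column names (an assumption for illustration,
# not a schema from the paper).
SENSITIVE_COLUMNS = {"salary", "ssn"}


def rule_based_guard(sql: str) -> bool:
    """Return True if the query passes a naive keyword blocklist."""
    # Split the lowercased SQL into identifier-like tokens.
    tokens = set(re.findall(r"[a-z_]+", sql.lower()))
    # Allow the query only if it mentions no sensitive column by name.
    return tokens.isdisjoint(SENSITIVE_COLUMNS)


# A direct request for a sensitive value: blocked, as intended.
direct = "SELECT salary FROM employees WHERE name = 'Alice'"

# An aggregate that a policy might well permit: also blocked,
# which is the kind of false positive that hurts utility.
aggregate = "SELECT AVG(salary) FROM employees"

# A query that never names a sensitive column but could still help an
# attacker narrow down one person's pay (hypothetical pay_band column):
# allowed, because the blocklist cannot reason about inference.
inference = (
    "SELECT COUNT(*) FROM employees "
    "WHERE name = 'Alice' AND pay_band = 9"
)

results = {
    "direct": rule_based_guard(direct),
    "aggregate": rule_based_guard(aggregate),
    "inference": rule_based_guard(inference),
}
```

Under these assumptions the guard rejects both the direct and the aggregate query but lets the inference-style query through, which mirrors the two failure modes the abstract attributes to heuristic defenses.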

Taxonomy

Topics: Data Quality and Management · Privacy-Preserving Technologies in Data · Web Application Security Vulnerabilities