$\delta$-STEAL: LLM Stealing Attack with Local Differential Privacy

Kieu Dang, Phung Lai, NhatHai Phan, Yelong Shen, Ruoming Jin, and Abdallah Khreishah

arXiv:2510.21946·cs.CR·October 28, 2025

TL;DR

This paper introduces $\delta$-STEAL, a novel LLM stealing attack that uses local differential privacy (LDP) to bypass watermark detectors while maintaining a high attack success rate, threatening intellectual property protections.

Contribution

The paper presents $\delta$-STEAL, a new attack method that uses LDP to obfuscate watermarks in LLM outputs, bypassing watermark detectors while preserving the utility of the adversary's model.

Findings

01 Achieves up to 96.95% attack success rate

02 Effectively bypasses watermark detection methods

03 Balances attack success and model utility via noise scale

Abstract

Large language models (LLMs) demonstrate remarkable capabilities across various tasks. However, their deployment introduces significant risks related to intellectual property. In this context, we focus on model stealing attacks, where adversaries replicate the behaviors of these models to steal services. These attacks are highly relevant to proprietary LLMs and pose serious threats to revenue and financial stability. To mitigate these risks, the watermarking solution embeds imperceptible patterns in LLM outputs, enabling model traceability and intellectual property verification. In this paper, we study the vulnerability of LLM service providers by introducing $\delta$-STEAL, a novel model stealing attack that bypasses the service provider's watermark detectors while preserving the adversary's model utility. $\delta$-STEAL injects noise into the token embeddings of the adversary's model…
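The abstract is cut off before it specifies the noise mechanism, so the following is only a minimal sketch of what injecting LDP noise into a token embedding could look like. It assumes a standard Laplace mechanism with L1 clipping; the function names, the `epsilon` privacy budget, and the `bound` clipping parameter are hypothetical and are not taken from the paper.

```python
import random


def clip_l1(vec, bound):
    """Rescale vec so that its L1 norm is at most bound."""
    norm = sum(abs(x) for x in vec)
    if norm <= bound or norm == 0.0:
        return list(vec)
    return [x * bound / norm for x in vec]


def laplace_sample(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent Exp(1) variables is Laplace(0, 1).
    """
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def perturb_embedding(vec, epsilon, bound=1.0, rng=None):
    """Return a noisy copy of one token embedding (assumed mechanism).

    After clipping, any two embeddings differ by at most 2 * bound in
    L1 norm, so Laplace noise with scale 2 * bound / epsilon satisfies
    epsilon-LDP for this vector under the standard Laplace mechanism.
    """
    rng = rng or random.Random()
    clipped = clip_l1(vec, bound)
    scale = 2.0 * bound / epsilon  # smaller epsilon means larger noise
    return [x + laplace_sample(scale, rng) for x in clipped]


if __name__ == "__main__":
    rng = random.Random(0)
    embedding = [0.4, -0.3, 0.2, 0.1]  # toy 4-dimensional embedding
    for eps in (0.5, 2.0, 8.0):
        print(eps, perturb_embedding(embedding, eps, rng=rng))
```

In this sketch the noise scale is the only knob: a smaller `epsilon` adds more noise, which would be expected to blur any watermark signal more strongly at a greater cost to utility. That mirrors the trade-off the findings attribute to the noise scale, though the paper's actual parameterization may differ.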
