PROMPTMINER: Black-Box Prompt Stealing against Text-to-Image Generative Models via Reinforcement Learning and Fuzz Optimization

Mingzhe Li; Renhao Zhang; Zhiyang Wen; Siqi Pan; Bruno Castro da Silva; Juan Zhai; Shiqing Ma

arXiv:2511.22119 · cs.CV · December 1, 2025

TL;DR

PROMPTMINER is a black-box framework that recovers textual prompts from generated images using reinforcement learning and fuzz optimization, with applications to security analysis and intellectual-property protection.

Contribution

The paper introduces a two-phase prompt-stealing method that requires neither white-box access nor large datasets, making it more practical and robust than existing approaches.

Findings

01 Achieves CLIP similarity of up to 0.958, surpassing baselines.

02 Outperforms the strongest baseline by 7.5% in CLIP similarity on in-the-wild images.

03 Maintains strong performance under defensive perturbations.
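
The findings above are reported in terms of CLIP similarity, which this summary does not define. A common reading, assumed here, is the cosine similarity between two CLIP embedding vectors (for example, one for the target image and one for an image regenerated from the recovered prompt). The sketch below shows only that cosine computation; the function name and the toy vectors are illustrative stand-ins, not the paper's evaluation code or real CLIP outputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors.

    Assumption: "CLIP similarity" is taken to mean this quantity computed
    on CLIP embeddings. The summary itself does not spell this out.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors standing in for real embeddings.
target_embedding = [0.2, 0.7, 0.1]
recovered_embedding = [0.25, 0.65, 0.05]
score = cosine_similarity(target_embedding, recovered_embedding)
```

Under this reading, a score near 1 means the two embeddings point in nearly the same direction, so a reported value such as 0.958 would indicate a close match.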

Abstract

Text-to-image (T2I) generative models such as Stable Diffusion and FLUX can synthesize realistic, high-quality images directly from textual prompts. The resulting image quality depends critically on well-crafted prompts that specify both subjects and stylistic modifiers, which have become valuable digital assets. However, the rising value and ubiquity of high-quality prompts expose them to security and intellectual-property risks. One key threat is the prompt stealing attack, i.e., the task of recovering the textual prompt that generated a given image. Prompt stealing enables unauthorized extraction and reuse of carefully engineered prompts, yet it can also support beneficial applications such as data attribution, model provenance analysis, and watermarking validation. Existing approaches often assume white-box gradient access, require large-scale labeled datasets for supervised…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Cell Image Analysis Techniques