Just Ask: Curious Code Agents Reveal System Prompts in Frontier LLMs
Xiang Zheng, Yutao Wu, Hanxun Huang, Yige Li, Xingjun Ma, Bo Li, Yu-Gang Jiang, Cong Wang

TL;DR
This paper introduces JustAsk, a framework that autonomously uncovers hidden system prompts in large language model-based code agents, revealing significant security vulnerabilities without prior knowledge or privileged access.
Contribution
It presents a novel, self-evolving exploration method for prompt extraction that requires no handcrafted prompts or supervision, exposing critical security flaws in commercial LLM agents.
Findings
JustAsk achieves near-complete prompt recovery across 41 models.
System prompt extraction is a widespread vulnerability intrinsic to code agents.
The attack exploits generalization gaps and safety tensions in model design.
Abstract
Autonomous code agents built on large language models are reshaping software and AI development through tool use, long-horizon reasoning, and self-directed interaction. However, this autonomy introduces a previously unrecognized security risk: agentic interaction fundamentally expands the LLM attack surface, enabling systematic probing and recovery of hidden system prompts that guide model behavior. We identify system prompt extraction as an emergent vulnerability intrinsic to code agents and present JustAsk, a self-evolving framework that autonomously discovers effective extraction strategies through interaction alone. Unlike prior prompt-engineering or dataset-based attacks, JustAsk requires no handcrafted prompts, labeled supervision, or privileged access beyond standard user interaction. It formulates extraction as an online exploration problem, using…
Taxonomy
Topics: Software Engineering Research · Advanced Malware Detection Techniques · Information and Cyber Security
