Beyond Jailbreak: Unveiling Risks in LLM Applications Arising from Blurred Capability Boundaries

Yunyi Zhang; Shibo Cui; Baojun Liu; Jingkai Yu; Min Zhang; Fan Shi; Han Zheng

arXiv:2511.17874 · cs.CR · December 17, 2025


TL;DR

This paper explores security risks in LLM applications caused by ambiguous capability boundaries, introducing a framework to evaluate these risks and highlighting the importance of prompt design for robustness.

Contribution

It defines the LLM app capability space, uncovers new risks like capability downgrade and upgrade, and provides an evaluation framework based on cross-platform analysis.

Findings

01. 89.45% of analyzed applications are potentially affected by risks.
02. 17 applications executed malicious tasks without adversarial rewriting.
03. Prompt quality positively correlates with application robustness.

Abstract

LLM applications (i.e., LLM apps) leverage the powerful capabilities of LLMs to provide users with customized services, revolutionizing traditional application development. While the increasing prevalence of LLM-powered applications provides users with unprecedented convenience, it also brings new security challenges. The security community still lacks sufficient understanding of this emerging ecosystem, especially regarding the capability boundaries of the applications themselves. In this paper, we systematically analyzed the new development paradigm and defined the concept of the LLM app capability space. We also uncovered potential new risks beyond jailbreak that arise from ambiguous capability boundaries in real-world scenarios, namely capability downgrade and capability upgrade. To evaluate the impact of these risks, we designed and implemented an LLM…


Code & Models

Models

AbdulElahGwaith/AI-Infra-Guard (model)


Taxonomy

Topics: Advanced Malware Detection Techniques · Web Application Security Vulnerabilities · Information and Cyber Security