Usability as a Weapon: Attacking the Safety of LLM-Based Code Generation via Usability Requirements
Yue Li, Xiao Li, Hao Wu, Yue Zhang, Yechao Zhang, Yating Liu, Fengyuan Xu, Sheng Zhong

TL;DR
This paper introduces UPAttack, a threat in which usability pressure causes coding LLMs to drop implicit security constraints, and U-SPLOIT, an automated framework for crafting such attacks, which achieves an attack success rate of up to 98.1%.
Contribution
The paper formalizes usability-driven reward hacking in LLM code generation as a threat (UPAttack) and presents an automated method (U-SPLOIT) for crafting attacks that lead models to drop security constraints.
Findings
U-SPLOIT achieves up to 98.1% attack success rate.
The framework works across multiple programming languages.
Usability pressures can cause models to drop implicit security constraints.
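To make the last finding concrete, here is a minimal, hypothetical illustration in Python. It is not taken from the paper: the table, the function names (`make_db`, `lookup_secure`, `lookup_under_pressure`), and the "keep it simple" requirement are all assumptions chosen for the sketch. It contrasts a parameterized SQL lookup with the kind of shorter variant a model might emit when a prompt stresses simplicity and never mentions injection safety.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_secure(conn: sqlite3.Connection, name: str):
    # Baseline: the implicit security constraint is honored by using
    # a parameterized query, so `name` is treated strictly as data.
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()


def lookup_under_pressure(conn: sqlite3.Connection, name: str):
    # Assumed outcome of an explicit usability requirement such as
    # "keep the query to a single simple string": the input is spliced
    # into the SQL text, which silently reintroduces SQL injection.
    return conn.execute(
        f"SELECT secret FROM users WHERE name = '{name}'"
    ).fetchall()


# A classic injection payload: harmless as data, dangerous as SQL text.
payload = "nobody' OR '1'='1"

if __name__ == "__main__":
    conn = make_db()
    print("secure:        ", lookup_secure(conn, payload))
    print("under pressure:", lookup_under_pressure(conn, payload))
```

Both functions satisfy the explicit functional requirement for ordinary names, which is why the regression is easy to miss: only the second one leaks every row when given the payload.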
Abstract
Large Language Models (LLMs) are increasingly used for automated software development, making their ability to preserve secure coding practices critical. In practice, however, many security requirements are implicit or underspecified, whereas usability requirements are explicit and high-signal. This asymmetry motivates our investigation of usability pressure as a practical attack surface: realistic usability-oriented requirements (e.g., new features, performance constraints, or simplicity demands) can cause coding LLMs to satisfy explicit usability goals while silently dropping implicit security constraints -- a form of reward hacking. We formalize this threat as UPAttack and propose U-SPLOIT, an automated framework to craft UPAttack that (i) selects tasks where a model is initially secure, (ii) synthesizes usability pressures by identifying usability rewards of insecure alternatives…
