Overlooked Safety Vulnerability in LLMs: Malicious Intelligent Optimization Algorithm Request and its Jailbreak
Haoran Gu, Handing Wang, Yi Mei, Mengjie Zhang, Yaochu Jin

TL;DR
This paper shows that large language models are highly susceptible to malicious requests for intelligent optimization algorithm design, and that existing defenses offer only limited protection, underscoring the need for stronger safety measures.
Contribution
The paper introduces MalOptBench, a benchmark of 60 malicious optimization algorithm requests, and MOBjailbreak, a jailbreak method tailored to this scenario, and uses them to evaluate 13 mainstream LLMs.
Findings
Most of the evaluated models are highly susceptible, with an average attack success rate of 83.59%.
Existing defenses are only marginally effective against MOBjailbreak.
Under MOBjailbreak, the models' safeguards fail almost completely.
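For readers unfamiliar with the metric, here is a minimal sketch of how an average attack success rate could be computed. It assumes the common definition (the fraction of requests for which a judge labels the model's response as a successful attack) and an unweighted mean across models; the paper's exact scoring procedure is not given in this summary. The function names and the toy labels are hypothetical and are not data from the paper.

```python
from statistics import mean


def attack_success_rate(judgements):
    """Fraction of requests judged as successful attacks.

    `judgements` is a list of booleans, one per request, where True means
    the judge decided the model complied with the request. This labelling
    scheme is an assumption made for illustration.
    """
    if not judgements:
        raise ValueError("need at least one judgement")
    return sum(judgements) / len(judgements)


def average_asr(per_model):
    """Unweighted mean of the per-model attack success rates."""
    return mean(attack_success_rate(j) for j in per_model.values())


# Toy, made-up labels for two hypothetical models (not results from the paper).
toy = {
    "model_a": [True, True, False, True],   # ASR = 0.75
    "model_b": [True, False, False, True],  # ASR = 0.50
}

print(f"{average_asr(toy):.2%}")  # prints 62.50%
```

An unweighted mean treats every model equally. If models were evaluated on different numbers of requests, pooling all judgements before dividing would give a different figure, so which of the two is reported should be stated.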
Abstract
The widespread deployment of large language models (LLMs) has raised growing concerns about their misuse risks and associated safety issues. While prior studies have examined the safety of LLMs in general usage, code generation, and agent-based applications, their vulnerabilities in automated algorithm design remain underexplored. To fill this gap, this study investigates this overlooked safety vulnerability, with a particular focus on intelligent optimization algorithm design, given its prevalent use in complex decision-making scenarios. We introduce MalOptBench, a benchmark consisting of 60 malicious optimization algorithm requests, and propose MOBjailbreak, a jailbreak method tailored for this scenario. Through extensive evaluation of 13 mainstream LLMs including the latest GPT-5 and DeepSeek-V3.1, we reveal that most models remain highly susceptible to such attacks, with an average…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Artificial Intelligence in Healthcare and Education · Security and Verification in Computing
