Overlooked Safety Vulnerability in LLMs: Malicious Intelligent Optimization Algorithm Request and its Jailbreak

Haoran Gu; Handing Wang; Yi Mei; Mengjie Zhang; Yaochu Jin

arXiv:2601.00213 · cs.CR · January 5, 2026

TL;DR

This paper identifies a safety vulnerability in large language models: they readily comply with malicious requests to design intelligent optimization algorithms. Most tested models are highly susceptible and existing defenses offer little protection, which underscores the need for stronger safety measures.

Contribution

The paper introduces MalOptBench, a benchmark of 60 malicious optimization algorithm requests, and MOBjailbreak, a jailbreak method tailored to this scenario, and uses them to expose vulnerabilities in 13 mainstream LLMs.

Findings

01. Most models are highly susceptible, with an average attack success rate of 83.59%.

02. Existing defenses are only marginally effective against MOBjailbreak.

03. Models show near-complete failure under the proposed jailbreak method.

Abstract

The widespread deployment of large language models (LLMs) has raised growing concerns about their misuse risks and associated safety issues. While prior studies have examined the safety of LLMs in general usage, code generation, and agent-based applications, their vulnerabilities in automated algorithm design remain underexplored. To fill this gap, this study investigates this overlooked safety vulnerability, with a particular focus on intelligent optimization algorithm design, given its prevalent use in complex decision-making scenarios. We introduce MalOptBench, a benchmark consisting of 60 malicious optimization algorithm requests, and propose MOBjailbreak, a jailbreak method tailored for this scenario. Through extensive evaluation of 13 mainstream LLMs including the latest GPT-5 and DeepSeek-V3.1, we reveal that most models remain highly susceptible to such attacks, with an average…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Artificial Intelligence in Healthcare and Education · Security and Verification in Computing