LLMs Caught in the Crossfire: Malware Requests and Jailbreak Challenges

Haoyang Li; Huan Gao; Zhiyuan Zhao; Zhiyu Lin; Junyu Gao; Xuelong Li

arXiv:2506.10022·cs.CR·June 13, 2025

TL;DR

This paper introduces MalwareBench, a benchmark dataset to evaluate LLMs' vulnerability to jailbreak attacks in malicious code generation, revealing significant security challenges in current models.

Contribution

We created MalwareBench, a comprehensive benchmark with 3,520 prompts covering multiple jailbreak methods, to systematically assess LLM security against malicious code generation.

Findings

01. Mainstream LLMs have limited ability to reject malicious code requests.
02. Combining multiple jailbreak methods significantly reduces model security.
03. Average rejection rate drops to 39.92% with combined jailbreak attacks.

Abstract

The widespread adoption of Large Language Models (LLMs) has heightened concerns about their security, particularly their vulnerability to jailbreak attacks that leverage crafted prompts to generate malicious outputs. While prior research has been conducted on general security capabilities of LLMs, their specific susceptibility to jailbreak attacks in code generation remains largely unexplored. To fill this gap, we propose MalwareBench, a benchmark dataset containing 3,520 jailbreaking prompts for malicious code-generation, designed to evaluate LLM robustness against such threats. MalwareBench is based on 320 manually crafted malicious code generation requirements, covering 11 jailbreak methods and 29 code functionality categories. Experiments show that mainstream LLMs exhibit limited ability to reject malicious code-generation requirements, and the combination of multiple jailbreak…
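
The prompt count is consistent with pairing each of the 320 base requirements with each of the 11 jailbreak methods (320 × 11 = 3,520). The Python sketch below checks that arithmetic and shows one way a rejection rate, such as the 39.92% average listed in the findings, could be computed from per-response labels. The full cross-product construction, the function names `expected_prompt_count` and `rejection_rate`, and the boolean label format are assumptions made for illustration; they are not taken from the paper or its repository.

```python
# Counts stated in the abstract.
NUM_REQUIREMENTS = 320
NUM_JAILBREAK_METHODS = 11


def expected_prompt_count(requirements: int, methods: int) -> int:
    """Prompt count if every requirement is paired with every method (assumption)."""
    return requirements * methods


def rejection_rate(rejected: list[bool]) -> float:
    """Percentage of responses labelled as refusals (hypothetical label format)."""
    if not rejected:
        raise ValueError("no responses to score")
    return 100.0 * sum(rejected) / len(rejected)


if __name__ == "__main__":
    # Matches the 3,520 prompts reported in the abstract.
    print(expected_prompt_count(NUM_REQUIREMENTS, NUM_JAILBREAK_METHODS))  # 3520

    # Toy labels, not real benchmark data: True means the model refused.
    toy_labels = [True, False, False, True, False]
    print(f"{rejection_rate(toy_labels):.2f}%")  # 40.00%
```

A lower rejection rate means the model complied with more malicious requests, so under this definition a drop in the rate indicates weaker defenses.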

Code & Models

Repositories

MAIL-Tele-AI/MalwareBench (Official)

Videos

LLMs Caught in the Crossfire: Malware Requests and Jailbreak Challenges · Underline

Taxonomy

Topics: Advanced Malware Detection Techniques · Adversarial Robustness in Machine Learning · Spam and Phishing Detection