EquaCode: A Multi-Strategy Jailbreak Approach for Large Language Models via Equation Solving and Code Completion
Zhen Liang, Hai Huang, Zhengkui Chen

TL;DR
EquaCode introduces a multi-strategy jailbreak method combining equation solving and code completion to effectively bypass large language model safety measures, revealing vulnerabilities across multiple models with high success rates.
Contribution
The paper presents a novel multi-strategy jailbreak approach that outperforms single-strategy methods, making tests of how well LLM safety constraints hold up significantly more effective.
Findings
Achieves over 91% success rate on GPT series models
Reaches 98.65% success across three state-of-the-art LLMs
Demonstrates the synergistic effect of combining equation solving and code completion
Abstract
Large language models (LLMs), such as ChatGPT, have achieved remarkable success across a wide range of fields. However, their trustworthiness remains a significant concern, as they are still susceptible to jailbreak attacks aimed at eliciting inappropriate or harmful responses. Moreover, existing jailbreak attacks mainly operate at the natural language level and rely on a single attack strategy, limiting their effectiveness in comprehensively assessing LLM robustness. In this paper, we propose EquaCode, a novel multi-strategy jailbreak approach for large language models via equation solving and code completion. This approach transforms malicious intent into a mathematical problem and then requires the LLM to solve it using code, leveraging the complexity of cross-domain tasks to divert the model's focus toward task completion rather than safety constraints. Experimental results show that…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Adversarial Robustness in Machine Learning · Scientific Computing and Data Management
