DeepSeek Robustness Against Semantic-Character Dual-Space Mutated Prompt Injection
Junyu Ren, Xingjian Pan, Wensheng Gan, Philip S. Yu

TL;DR
This paper introduces PromptFuzz-SC, a dual-space mutation framework combining semantic and character-level perturbations to evaluate and improve the robustness of large language models like DeepSeek against prompt injection attacks.
Contribution
It presents a novel unified mutation framework and evaluation protocol that effectively uncovers vulnerabilities in LLMs under complex, realistic prompt injection scenarios.
Findings
Dual-space mutation achieves the strongest attack performance.
Dual-space mutation improves the mean misuse success rate by 12.5% over semantic-only methods.
The approach balances attack effectiveness with imperceptibility.
Abstract
Prompt injection has emerged as a critical security threat to large language models (LLMs), yet existing studies predominantly focus on single-dimensional attack strategies, such as semantic rewriting or character-level obfuscation, which fail to capture the combined effects of multi-space perturbations in realistic scenarios. In addition, systematic black-box robustness evaluations of recent Chinese LLMs, such as DeepSeek, remain limited. To address these gaps, we propose PromptFuzz-SC, a semantic-character dual-space mutation framework for evaluating LLM robustness against prompt injection. The framework integrates semantic transformations (e.g., paraphrasing and word-order perturbation) with character-level obfuscation (e.g., zero-width insertion and encoding-based mutation), forming a unified and extensible mutation operator library. A hybrid search strategy combining epsilon-greedy…
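To make the dual-space idea concrete, here is a minimal Python sketch that chains one semantic-space operator with one character-space operator. It is an illustration under stated assumptions, not the authors' implementation: the function names (`swap_adjacent_words`, `insert_zero_width`, `dual_space_mutate`), the `rate` parameter, and the use of an adjacent-word swap as a stand-in for word-order perturbation are all hypothetical. The search strategy mentioned in the abstract is not modeled here.

```python
import random

ZERO_WIDTH_SPACE = "\u200b"  # invisible in most renderers


def swap_adjacent_words(text, rng):
    """Semantic-space stand-in (assumed): swap one random pair of adjacent words."""
    words = text.split()
    if len(words) < 2:
        return text
    i = rng.randrange(len(words) - 1)
    words[i], words[i + 1] = words[i + 1], words[i]
    return " ".join(words)


def insert_zero_width(text, rng, rate=0.3):
    """Character-space operator: insert U+200B after each character with probability `rate`."""
    out = []
    for ch in text:
        out.append(ch)
        if rng.random() < rate:
            out.append(ZERO_WIDTH_SPACE)
    return "".join(out)


def dual_space_mutate(text, seed=0):
    """Apply the semantic operator first, then the character operator (hypothetical ordering)."""
    rng = random.Random(seed)
    return insert_zero_width(swap_adjacent_words(text, rng), rng)


if __name__ == "__main__":
    prompt = "summarize the following document carefully"
    mutated = dual_space_mutate(prompt, seed=1)
    # repr() makes the otherwise invisible zero-width characters visible
    print(repr(mutated))
```

Stripping the zero-width characters from the output recovers a prompt with the same words as the original in a slightly different order, which is one simple way to see how the two spaces compose while the visible text stays close to the seed prompt.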
