Beyond Explicit Refusals: Soft-Failure Attacks on Retrieval-Augmented Generation

Wentao Zhang; Yan Zhuang; ZhuHang Zheng; Mingfei Zhang; Jiawen Deng; Fuji Ren

arXiv:2604.18663·cs.CR·April 22, 2026

TL;DR

This paper introduces DEJA, a black-box attack that degrades retrieval-augmented generation (RAG) systems by inducing fluent but non-informative responses, a stealthier threat than the explicit refusals caused by existing jamming attacks.

Contribution

The paper formalizes soft failure in RAG systems and proposes DEJA (Deceptive Evolutionary Jamming Attack), an evolutionary attack framework that induces low-utility responses while remaining stealthy.

Findings

01. DEJA achieves over 79% success in inducing soft failures.

02. It keeps hard-failure rates below 15%.

03. Adversarial documents evade detection and transfer across models.

Abstract

Existing jamming attacks on Retrieval-Augmented Generation (RAG) systems typically induce explicit refusals or denial-of-service behaviors, which are conspicuous and easy to detect. In this work, we formalize a subtler availability threat, termed soft failure, which degrades system utility by inducing fluent and coherent yet non-informative responses rather than overt failures. We propose Deceptive Evolutionary Jamming Attack (DEJA), an automated black-box attack framework that generates adversarial documents to trigger such soft failures by exploiting safety-aligned behaviors of large language models. DEJA employs an evolutionary optimization process guided by a fine-grained Answer Utility Score (AUS), computed via an LLM-based evaluator, to systematically degrade the certainty of answers while maintaining high retrieval success. Extensive experiments across multiple RAG configurations…
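
To make the score-guided evolutionary loop described in the abstract more concrete, here is a minimal Python sketch of a search that repeatedly mutates a candidate document and keeps the variants with the lowest utility score. It is not the authors' implementation. `toy_utility_score` is a keyword-counting stand-in for the LLM-based AUS evaluator, `HEDGES` and `mutate` are invented for illustration, and the retrieval-success constraint mentioned in the abstract is left out entirely.

```python
import random
from typing import Callable, List

# Hypothetical placeholder phrases; not taken from the paper.
HEDGES: List[str] = [
    "it is difficult to say",
    "sources disagree",
    "this may vary",
    "no definitive answer exists",
]


def toy_utility_score(document: str) -> float:
    """Stand-in for an LLM-based utility evaluator (assumption).

    Returns a value in [0, 1]; lower means a less informative answer.
    Each distinct hedge phrase found in the document lowers the score.
    """
    hits = sum(1 for phrase in HEDGES if phrase in document)
    return max(0.0, 1.0 - hits / len(HEDGES))


def mutate(document: str, rng: random.Random) -> str:
    """Hypothetical mutation operator: append one random hedge phrase."""
    return document + " " + rng.choice(HEDGES)


def evolve(
    seed: str,
    score: Callable[[str], float],
    generations: int = 20,
    population_size: int = 8,
    survivors: int = 2,
    rng_seed: int = 0,
) -> str:
    """Elitist evolutionary search that minimizes `score`."""
    rng = random.Random(rng_seed)
    population = [seed]
    for _ in range(generations):
        # Create children by mutating randomly chosen parents.
        children = [
            mutate(rng.choice(population), rng) for _ in range(population_size)
        ]
        # Keep the lowest-scoring candidates; parents compete with
        # children, so the best score never gets worse.
        population = sorted(population + children, key=score)[:survivors]
    return population[0]


seed_document = "The capital of France is Paris."
best = evolve(seed_document, toy_utility_score)
print(toy_utility_score(seed_document), toy_utility_score(best))
```

In a real setting the scoring function would query the target RAG pipeline and an evaluator model, which makes each evaluation expensive. That cost is one reason a small population with elitist selection is a reasonable starting point for a sketch like this.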
