Typos that Broke the RAG's Back: Genetic Attack on RAG Pipeline by Simulating Documents in the Wild via Low-level Perturbations
Sukmin Cho, Soyeong Jeong, Jeongyeon Seo, Taeho Hwang, Jong C. Park

TL;DR
This paper introduces GARAG, a genetic attack method that tests the robustness of Retrieval-Augmented Generation systems against low-level textual perturbations, revealing significant vulnerabilities in real-world scenarios.
Contribution
We propose GARAG, a novel genetic attack approach that exposes the fragility of RAG systems to minor textual errors and provides a holistic robustness evaluation.
Findings
GARAG achieves high attack success rates on RAG systems.
Minor textual errors can significantly degrade RAG performance.
RAG components are highly vulnerable to low-level perturbations.
Abstract
The robustness of recent Large Language Models (LLMs) has become increasingly crucial as their applicability expands across various domains and real-world applications. Retrieval-Augmented Generation (RAG) is a promising solution for addressing the limitations of LLMs, yet existing studies on the robustness of RAG often overlook the interconnected relationships between RAG components or the potential threats prevalent in real-world databases, such as minor textual errors. In this work, we investigate two underexplored aspects when assessing the robustness of RAG: 1) vulnerability to noisy documents through low-level perturbations and 2) a holistic evaluation of RAG robustness. Furthermore, we introduce a novel attack method, the Genetic Attack on RAG (GARAG), which targets these aspects. Specifically, GARAG is designed to reveal vulnerabilities within each component and test…
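To make the idea of a genetic search over low-level perturbations concrete, here is a minimal Python sketch. It is not GARAG's actual algorithm, whose details are not given on this page. Every name in it (`typo`, `mutate`, `crossover`, `overlap_score`, `genetic_typo_attack`) is hypothetical, and the word-overlap score is only a toy stand-in for a real retriever or reader. The sketch assumes a non-empty query and document.

```python
import random
import string


def typo(word, rng):
    """Apply one character-level typo inside a word: swap, delete, or insert."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    op = rng.choice(("swap", "delete", "insert"))
    if op == "swap":
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    if op == "delete":
        return word[:i] + word[i + 1:]
    return word[:i] + rng.choice(string.ascii_lowercase) + word[i:]


def mutate(words, rng):
    """Return a copy of the word list with a typo in one randomly chosen word."""
    out = list(words)
    j = rng.randrange(len(out))
    out[j] = typo(out[j], rng)
    return out


def crossover(a, b, rng):
    """Uniform crossover: each position takes the word from one parent."""
    return [rng.choice(pair) for pair in zip(a, b)]


def overlap_score(query, words):
    """Toy relevance score (assumption): fraction of query words found in the document."""
    q = set(query.lower().split())
    d = {w.lower() for w in words}
    return len(q & d) / len(q)


def genetic_typo_attack(query, document, pop_size=20, generations=30, seed=0):
    """Search for a typo-perturbed document that minimizes the toy score."""
    rng = random.Random(seed)
    original = document.split()
    # Keep the unperturbed document in the population so the result is never worse.
    population = [original] + [mutate(original, rng) for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=lambda w: overlap_score(query, w))
        parents = population[: pop_size // 2]  # elitism: lowest-scoring half survives
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    best = min(population, key=lambda w: overlap_score(query, w))
    return " ".join(best)
```

Calling `genetic_typo_attack("who wrote hamlet", "Hamlet was written by William Shakespeare around 1600")` returns a perturbed document with the same number of words whose toy score is no higher than the original's. In a real evaluation the toy score would be replaced by signals from the actual retriever and reader, which this sketch does not attempt to model.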
Taxonomy
Topics: Advanced Malware Detection Techniques · Particle accelerators and beam dynamics · Digital Media Forensic Detection
Methods: Attention Is All You Need · Weight Decay · Byte Pair Encoding · Dense Connections · Residual Connection · Softmax · Adam · Linear Warmup With Linear Decay · Layer Normalization
