Corpus Poisoning via Approximate Greedy Gradient Descent
Jinyan Su, Preslav Nakov, Claire Cardie

TL;DR
This paper introduces AGGD, a new structured adversarial attack on dense retrieval systems that achieves higher attack success rates than HotFlip and generalizes across datasets and domains, raising concerns about the robustness of these systems.
Contribution
We propose AGGD, a novel structured attack method that outperforms HotFlip in generating adversarial passages for dense retrieval models, and demonstrate its effectiveness across multiple datasets and applications.
Findings
AGGD achieves higher attack success rates than HotFlip on several datasets.
The method generalizes well to unseen queries and new domains.
AGGD effectively attacks the ANCE retrieval model, surpassing previous methods.
Abstract
Dense retrievers are widely used in information retrieval and have also been successfully extended to other knowledge-intensive areas such as language models, e.g., Retrieval-Augmented Generation (RAG) systems. Unfortunately, they have recently been shown to be vulnerable to corpus poisoning attacks, in which a malicious user injects a small fraction of adversarial passages into the retrieval corpus to trick the system into returning these passages among the top-ranked results for a broad set of user queries. Further study is needed to understand the extent to which these attacks could limit the deployment of dense retrievers in real-world applications. In this work, we propose Approximate Greedy Gradient Descent (AGGD), a new attack on dense retrieval systems based on the widely used HotFlip method for efficiently generating adversarial passages. We demonstrate that AGGD can select a…
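To make the HotFlip-style building block mentioned in the abstract concrete, here is a minimal sketch of one gradient-guided token swap. It is not AGGD itself: the abstract is truncated before it describes how AGGD selects candidates, so none of that is reproduced here. The toy encoder (tanh of the mean token embedding), the random embeddings and queries, and the names encode, objective, and hotflip_step are all assumptions made for illustration.

```python
import math
import random

def encode(token_ids, emb):
    """Toy encoder (assumption): tanh of the mean token embedding."""
    dim = len(emb[0])
    mean = [sum(emb[t][j] for t in token_ids) / len(token_ids) for j in range(dim)]
    return [math.tanh(x) for x in mean]

def objective(token_ids, emb, queries):
    """Mean dot-product similarity between the passage and every query."""
    p = encode(token_ids, emb)
    return sum(sum(a * b for a, b in zip(q, p)) for q in queries) / len(queries)

def grad_wrt_token_embedding(token_ids, emb, queries):
    """Gradient of the objective w.r.t. one token embedding.

    With mean pooling the gradient is identical for every position.
    """
    p = encode(token_ids, emb)
    dim = len(p)
    q_mean = [sum(q[j] for q in queries) / len(queries) for j in range(dim)]
    return [q_mean[j] * (1.0 - p[j] ** 2) / len(token_ids) for j in range(dim)]

def hotflip_step(token_ids, pos, emb, queries, top_k=5):
    """Try to improve the objective by replacing the token at position pos."""
    grad = grad_wrt_token_embedding(token_ids, emb, queries)
    cur = emb[token_ids[pos]]
    # First-order estimate of the objective change for each vocabulary token.
    scores = [
        sum((emb[v][j] - cur[j]) * grad[j] for j in range(len(grad)))
        for v in range(len(emb))
    ]
    candidates = sorted(range(len(emb)), key=lambda v: scores[v], reverse=True)[:top_k]
    # Re-score the shortlisted candidates with the true objective; keep the best.
    best_ids, best_val = token_ids, objective(token_ids, emb, queries)
    for v in candidates:
        trial = token_ids[:pos] + [v] + token_ids[pos + 1:]
        val = objective(trial, emb, queries)
        if val > best_val:
            best_ids, best_val = trial, val
    return best_ids, best_val

# Toy data: 50-token vocabulary, 8-dim embeddings, 20 queries, 6-token passage.
rng = random.Random(0)
emb = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(50)]
queries = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(20)]
passage = [rng.randrange(50) for _ in range(6)]
start = objective(passage, emb, queries)
for _ in range(30):
    pos = rng.randrange(len(passage))
    passage, value = hotflip_step(passage, pos, emb, queries)
```

Because a swap is accepted only when the true objective increases, the final value can never fall below start. In a real attack the gradient would come from backpropagation through the retriever rather than from a closed-form expression.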
Taxonomy
Topics: Brain Tumor Detection and Classification · Handwritten Text Recognition Techniques · Text and Document Classification Technologies
Methods: Attention Is All You Need · Sparse Evolutionary Training · WordPiece · Linear Warmup With Linear Decay · Attention Dropout · Weight Decay · Residual Connection · Softmax · Layer Normalization
