Defending Against Disinformation Attacks in Open-Domain Question Answering
Orion Weller, Aleem Khan, Nathaniel Weir, Dawn Lawrie, Benjamin Van Durme

TL;DR
This paper introduces a novel defense mechanism for open-domain question answering systems against adversarial poisoning attacks by leveraging redundant information in large corpora and a confidence measure called CAR, significantly improving robustness.
Contribution
The paper proposes query augmentation combined with a confidence measure, CAR, to defend ODQA systems against poisoning attacks, a problem for which little to no prior work has proposed defenses.
Findings
Nearly 20% improvement in exact match accuracy under poisoning conditions
Effective detection of poisoned passages using redundancy-based confidence scores
Robustness gains across various levels of data poisoning
Abstract
Recent work in open-domain question answering (ODQA) has shown that adversarial poisoning of the search collection can cause large drops in accuracy for production systems. However, little to no work has proposed methods to defend against these attacks. To do so, we rely on the intuition that redundant information often exists in large corpora. To find it, we introduce a method that uses query augmentation to search for a diverse set of passages that could answer the original question but are less likely to have been poisoned. We integrate these new passages into the model through the design of a novel confidence method, comparing the predicted answer to its appearance in the retrieved contexts (what we call Confidence from Answer Redundancy, i.e. CAR). Together these methods allow for a simple but effective way to defend against poisoning attacks that provides gains of nearly 20% exact…
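The abstract describes the approach only at a high level, so the following is a minimal sketch of the idea rather than the authors' implementation. It assumes a confidence rule where a predicted answer counts as "confident" when it appears in at least `min_hits` retrieved passages, and a fallback that takes a plurality vote over predictions from augmented queries. The function names, the `min_hits` threshold, the string-matching rule, and the voting fallback are all hypothetical choices made for illustration.

```python
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so substring matching is less brittle."""
    return " ".join(text.lower().split())


def car_confident(answer: str, passages: list[str], min_hits: int = 2) -> bool:
    """Hypothetical redundancy check: does the answer appear in enough passages?

    The threshold and the plain substring match are assumptions, not the
    paper's definition of CAR.
    """
    target = normalize(answer)
    if not target:
        return False  # an empty prediction is never treated as confident
    hits = sum(1 for passage in passages if target in normalize(passage))
    return hits >= min_hits


def answer_with_defense(
    original: tuple[str, list[str]],
    augmented: list[tuple[str, list[str]]],
    min_hits: int = 2,
) -> str:
    """Pick a final answer from (predicted_answer, retrieved_passages) pairs.

    `original` comes from the user's question; `augmented` holds one pair per
    augmented query. Producing those pairs (retriever + reader) is out of scope.
    """
    answer, passages = original
    if car_confident(answer, passages, min_hits):
        return normalize(answer)

    # Original prediction lacks redundancy: prefer confident augmented predictions.
    confident = [
        normalize(a) for a, ps in augmented if car_confident(a, ps, min_hits)
    ]
    if confident:
        return Counter(confident).most_common(1)[0][0]

    # Nothing is confident: fall back to a plurality vote over all predictions.
    all_answers = [normalize(answer)] + [normalize(a) for a, _ in augmented]
    return Counter(all_answers).most_common(1)[0][0]


# Toy example: the original query retrieved one poisoned passage.
original = ("Lyon", ["Lyon is the capital of France.", "France is in Europe."])
augmented = [
    ("Paris", ["Paris is the capital of France.", "The French capital, Paris."]),
    ("Paris", ["Paris hosts the government.", "Paris is in France."]),
    ("Lyon", ["Lyon is the capital of France."]),
]
final_answer = answer_with_defense(original, augmented)
```

In this toy setup the poisoned answer is supported by only one passage, so the sketch falls back to the augmented queries, whose redundantly supported prediction wins. A real system would need a more careful answer-matching rule than substring containment.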
Taxonomy
Topics: Topic Modeling · Access Control and Trust · Epistemology, Ethics, and Metaphysics
