Excess Capacity and Backdoor Poisoning
Naren Sarayu Manoj, Avrim Blum

TL;DR
This paper presents a formal theoretical framework for backdoor data poisoning attacks on classification problems, analyzes the statistical and computational questions these attacks raise, and shows why identifying watermarked examples matters for robust learning.
Contribution
The paper introduces a formal framework for backdoor data poisoning attacks, defines a parameter called the memorization capacity, and links backdoor detection to robust generalization.
Findings
The memorization capacity captures the intrinsic vulnerability of a learning problem to a backdoor attack.
Some natural learning problems are inherently robust against backdoor attacks.
Adversarial training can detect backdoors under certain assumptions.
Abstract
A backdoor data poisoning attack is an adversarial attack wherein the attacker injects several watermarked, mislabeled training examples into a training set. The watermark does not impact the test-time performance of the model on typical data; however, the model reliably errs on watermarked examples. To gain a better foundational understanding of backdoor data poisoning attacks, we present a formal theoretical framework within which one can discuss backdoor data poisoning attacks for classification problems. We then use this to analyze important statistical and computational issues surrounding these attacks. On the statistical front, we identify a parameter we call the memorization capacity that captures the intrinsic vulnerability of a learning problem to a backdoor attack. This allows us to argue about the robustness of several natural learning problems to backdoor attacks. Our…
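The attack described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic data, the watermark (overwriting an otherwise unused last feature with a large value), the 1-nearest-neighbor learner, and all function names are hypothetical. It shows only the two properties the abstract states: predictions on clean data are unchanged, while watermarked inputs are reliably assigned the attacker's target label.

```python
import numpy as np

def make_clean(n, rng):
    """Toy data: two informative features in [-1, 1], one unused feature fixed at 0."""
    X = np.zeros((n, 3))
    X[:, :2] = rng.uniform(-1.0, 1.0, size=(n, 2))
    y = (X[:, 0] > 0).astype(int)  # true label depends only on the first feature
    return X, y

def apply_watermark(X, strength=10.0):
    """Hypothetical watermark: overwrite the unused last feature with a large value."""
    Xw = X.copy()
    Xw[:, -1] = strength
    return Xw

def poison(X, y, target_label, n_poison, rng):
    """Append n_poison watermarked copies of non-target examples, all labeled target_label."""
    candidates = np.flatnonzero(y != target_label)
    chosen = rng.choice(candidates, size=n_poison, replace=False)
    X_bad = apply_watermark(X[chosen])
    y_bad = np.full(n_poison, target_label)
    return np.vstack([X, X_bad]), np.concatenate([y, y_bad])

def predict_1nn(X_train, y_train, X_test):
    """1-nearest-neighbor prediction, a learner that memorizes its training set."""
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    return y_train[d2.argmin(axis=1)]

rng = np.random.default_rng(0)
X_train, y_train = make_clean(200, rng)
X_test, y_test = make_clean(100, rng)
X_pois, y_pois = poison(X_train, y_train, target_label=1, n_poison=10, rng=rng)

# Predictions on clean test points, before and after poisoning.
clean_before = predict_1nn(X_train, y_train, X_test)
clean_after = predict_1nn(X_pois, y_pois, X_test)

# Predictions on watermarked test points with the poisoned training set.
backdoor_pred = predict_1nn(X_pois, y_pois, apply_watermark(X_test))

print("clean predictions unchanged:", np.array_equal(clean_before, clean_after))
print("watermarked inputs sent to target label:", bool(np.all(backdoor_pred == 1)))
```

In this toy setup the outcome follows from geometry rather than luck: a watermarked point is at squared distance at least 100 from every clean training point and at most 8 from every poisoned one, so its nearest neighbor is always a poisoned example, and the reverse holds for clean points. Real attacks and learners are far less clean-cut.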
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Privacy-Preserving Technologies in Data
