On Collective Robustness of Bagging Against Data Poisoning
Ruoxin Chen, Zenan Li, Jie Li, Chentao Wu, Junchi Yan

TL;DR
This paper introduces a collective certification method for general bagging algorithms to compute tight robustness bounds against global data poisoning attacks, and proposes hash bagging to enhance robustness with minimal modifications.
Contribution
It presents the first collective robustness certification for general bagging, including a hash bagging variant that improves robustness by controlling influence scope.
Findings
The collective certification provides tight robustness bounds against global poisoning attacks.
Hash bagging significantly improves the robustness of vanilla bagging with minimal changes.
Extensive experiments demonstrate superior applicability and robustness.
Abstract
Bootstrap aggregating (bagging) is an effective ensemble protocol that is believed to enhance robustness through its majority voting mechanism. Recent works further prove sample-wise robustness certificates for certain forms of bagging (e.g., partition aggregation). Beyond these particular forms, in this paper we propose the first collective certification for general bagging to compute the tight robustness against the global poisoning attack. Specifically, we compute the maximum number of simultaneously changed predictions by solving a binary integer linear programming (BILP) problem. We then analyze the robustness of vanilla bagging and give the upper bound of the tolerable poison budget. Based on this analysis, we propose hash bagging to improve the robustness of vanilla bagging almost for free. This is achieved by modifying the random subsampling in vanilla bagging…
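Illustrative sketch
The quantity the abstract describes, the maximum number of test predictions that a single poisoning attack can change at once, can be illustrated on a toy ensemble. The sketch below is not the paper's BILP formulation: it swaps in brute-force enumeration over poisoned training indices, which is only feasible for very small problems. It assumes a simplified threat model in which the attacker fully controls every sub-classifier whose sub-trainset contains a poisoned sample, and in which voting ties go to the smaller class index. All names (`changed_count`, `collective_worst_case`, the toy data) are hypothetical.

```python
from itertools import combinations


def changed_count(sub_trainsets, preds, n_classes, poisoned):
    """Count test predictions the attacker can flip for one poisoned set.

    Assumption: a sub-classifier is fully attacker-controlled as soon as
    its sub-trainset contains at least one poisoned training index.
    """
    poisoned = set(poisoned)
    affected = [bool(poisoned & s) for s in sub_trainsets]
    n_affected = sum(affected)
    changed = 0
    for m in range(len(preds[0])):
        clean = [0] * n_classes  # votes before the attack
        kept = [0] * n_classes   # votes from unaffected sub-classifiers
        for g, row in enumerate(preds):
            clean[row[m]] += 1
            if not affected[g]:
                kept[row[m]] += 1
        # Clean majority vote; ties go to the smaller class index.
        y = max(range(n_classes), key=lambda c: (clean[c], -c))
        for c in range(n_classes):
            if c == y:
                continue
            # Attacker moves every affected vote to class c.
            total = kept[c] + n_affected
            if total > kept[y] or (total == kept[y] and c < y):
                changed += 1
                break
    return changed


def collective_worst_case(sub_trainsets, preds, n_train, n_classes, budget):
    """Brute-force maximum of changed predictions over all poisoned sets.

    Enumerating sets of exactly min(budget, n_train) indices suffices here,
    because poisoning more samples never shrinks the affected set.
    """
    k = min(budget, n_train)
    return max(
        changed_count(sub_trainsets, preds, n_classes, p)
        for p in combinations(range(n_train), k)
    )


# Toy data: 6 training samples, 3 sub-classifiers, 3 test samples, 2 classes.
toy_sub_trainsets = [{0, 1}, {2, 3}, {4, 5}]
# toy_preds[g][m] is the class that sub-classifier g predicts for test sample m.
toy_preds = [[0, 0, 1], [0, 1, 1], [0, 0, 1]]

for r in (0, 1, 2):
    print(r, collective_worst_case(toy_sub_trainsets, toy_preds, 6, 2, r))
```

In this toy case, a budget of one poisoned sample can flip only the test sample with a 2-to-1 vote, while a budget of two can flip all three at once. A collective certificate is useful precisely because one attack must serve every test sample simultaneously, so the answer can be smaller than adding up per-sample worst cases.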
Taxonomy
Topics: Advanced Graph Neural Networks · Network Security and Intrusion Detection · Spam and Phishing Detection
