Mithridates: Auditing and Boosting Backdoor Resistance of Machine Learning Pipelines
Eugene Bagdasaryan, Vitaly Shmatikov

TL;DR
This paper introduces Mithridates, a practical tool that helps ML engineers evaluate and improve the resistance of training pipelines against backdoor poisoning attacks, balancing security and accuracy.
Contribution
The paper proposes a universal backdoor resistance metric and integrates it into hyperparameter search, so that engineers can find more resistant training configurations without major pipeline changes.
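As a rough illustration of folding a resistance score into hyperparameter selection, the sketch below picks the most resistant configuration among those that stay within a small accuracy budget of the best one. It is not the paper's algorithm: the `Trial` record, the `pick_resistant_config` helper, the `max_accuracy_drop` threshold, and all numbers are hypothetical placeholders, and how the resistance score is measured is left out because this summary does not specify it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """One finished hyperparameter trial (hypothetical record, not from the paper)."""
    name: str
    accuracy: float    # main-task accuracy on clean validation data
    resistance: float  # placeholder resistance score; higher means more resistant


def pick_resistant_config(trials, max_accuracy_drop=0.01):
    """Return the most resistant trial whose accuracy is within
    max_accuracy_drop of the best accuracy seen across all trials."""
    if not trials:
        raise ValueError("no trials to choose from")
    best_accuracy = max(t.accuracy for t in trials)
    eligible = [t for t in trials if best_accuracy - t.accuracy <= max_accuracy_drop]
    # Break ties on resistance by preferring the more accurate trial.
    return max(eligible, key=lambda t: (t.resistance, t.accuracy))


# Made-up numbers, for illustration only.
trials = [
    Trial("baseline", accuracy=0.93, resistance=1.0),
    Trial("lower_lr", accuracy=0.925, resistance=3.0),
    Trial("heavy_reg", accuracy=0.85, resistance=6.0),
]
chosen = pick_resistant_config(trials, max_accuracy_drop=0.01)
print(chosen.name)
```

Treating accuracy as a constraint rather than a weighted term keeps the trade-off explicit: the engineer states how much accuracy they are willing to give up, and the search maximizes resistance inside that budget.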
Findings
Resistance increased by 3-5x with minimal accuracy loss
Universal attack-agnostic resistance metric developed
Extensions to AutoML and federated learning discussed
Abstract
Machine learning (ML) models trained on data from potentially untrusted sources are vulnerable to poisoning. A small, maliciously crafted subset of the training inputs can cause the model to learn a "backdoor" task (e.g., misclassify inputs with a certain feature) in addition to its main task. Recent research proposed many hypothetical backdoor attacks whose efficacy heavily depends on the configuration and training hyperparameters of the target model. Given the variety of potential backdoor attacks, ML engineers who are not security experts have no way to measure how vulnerable their current training pipelines are, nor do they have a practical way to compare training configurations so as to pick the more resistant ones. Deploying a defense requires evaluating and choosing from among dozens of research papers and re-engineering the training pipeline. In this paper, we aim to provide…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Software Engineering Research
