Regularization Helps with Mitigating Poisoning Attacks: Distributionally-Robust Machine Learning Using the Wasserstein Distance

Farhad Farokhi

arXiv:2001.10655 · cs.LG · January 30, 2020 · 5 cites

Open Access

TL;DR

This paper introduces a distributionally-robust optimization approach using Wasserstein distance to improve machine learning model resilience against data poisoning attacks, providing theoretical guarantees and practical demonstrations.

Contribution

It proposes a novel regularization method based on Wasserstein distance for mitigating poisoning attacks, with theoretical performance guarantees and empirical validation.

Findings

01

Regularization derived from Wasserstein distributionally-robust optimization improves robustness to data poisoning attacks.

02

The method provides performance guarantees on unpoisoned data.

03

Experiments on several datasets, including Wine Quality and Boston Housing Market, demonstrate the method's effectiveness.

Abstract

We use distributionally-robust optimization for machine learning to mitigate the effect of data poisoning attacks. We provide performance guarantees for the trained model on the original data (not including the poison records) by training the model for the worst-case distribution in a neighbourhood, defined using the Wasserstein distance, around the empirical distribution (extracted from the training dataset corrupted by a poisoning attack). We relax the distributionally-robust machine learning problem by finding an upper bound for the worst-case fitness based on the empirical sample-averaged fitness and, as a regularizer, the Lipschitz constant of the fitness function (on the data for given model parameters). For regression models, we prove that this regularizer is equal to the dual norm of the model parameters. We use the Wine Quality dataset, the Boston Housing Market dataset, and the…
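The regression case described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes an absolute-error fitness |y - θᵀx|, perturbations measured on the features with an ℓp norm, and a hypothetical regularization weight `lam` standing in for the Wasserstein radius. Under those assumptions the Lipschitz constant of the fitness in x is the dual (ℓq, with 1/p + 1/q = 1) norm of θ. The function names `dual_norm`, `regularized_objective`, and `fit_l2` are illustrative only, and the fitting routine is a plain subgradient method restricted to the self-dual ℓ2 case.

```python
import numpy as np


def dual_norm(theta, p=2.0):
    """Return the l_q norm of theta, where 1/p + 1/q = 1 (the dual of l_p)."""
    if p == 1:
        q = np.inf
    elif np.isinf(p):
        q = 1.0
    else:
        q = p / (p - 1.0)
    return np.linalg.norm(theta, ord=q)


def regularized_objective(theta, X, y, lam, p=2.0):
    """Sample-averaged absolute error plus lam times the dual norm of theta.

    Assumption: lam plays the role of the Wasserstein radius in the
    upper bound on the worst-case fitness.
    """
    residuals = np.abs(y - X @ theta)
    return residuals.mean() + lam * dual_norm(theta, p)


def fit_l2(X, y, lam, n_iter=500, lr0=0.5):
    """Subgradient descent on the objective above for p = 2 (self-dual)."""
    n, d = X.shape
    theta = np.zeros(d)
    for t in range(n_iter):
        # Subgradient of the mean absolute error term.
        signs = np.sign(X @ theta - y)
        grad = X.T @ signs / n
        # Subgradient of lam * ||theta||_2 (zero is valid at theta = 0).
        norm = np.linalg.norm(theta)
        if norm > 0:
            grad = grad + lam * theta / norm
        theta = theta - lr0 / np.sqrt(t + 1.0) * grad
    return theta


# Synthetic data for illustration only (not one of the paper's datasets).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)

theta_plain = fit_l2(X_demo, y_demo, lam=0.0)
theta_robust = fit_l2(X_demo, y_demo, lam=5.0)
print("norm without regularizer:", np.linalg.norm(theta_plain))
print("norm with regularizer:   ", np.linalg.norm(theta_robust))
```

A larger `lam` corresponds to guarding against a larger neighbourhood around the (possibly poisoned) empirical distribution, at the cost of shrinking the parameters more aggressively.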

Taxonomy

Topics: Risk and Portfolio Optimization · Statistical Methods and Inference · Adversarial Robustness in Machine Learning