On Optimal Learning Under Targeted Data Poisoning

Steve Hanneke; Amin Karbasi; Mohammad Mahmoody; Idan Mehalel; Shay Moran

arXiv:2210.02713 · cs.LG · October 13, 2022

TL;DR

This paper characterizes the minimal error achievable by learners under targeted data poisoning attacks, providing optimal bounds in realizable and agnostic settings, and explores proper algorithms for specific concept classes.

Contribution

It offers the first precise characterization of optimal error bounds under targeted poisoning, including deterministic algorithms and their limitations.

Findings

01

In the realizable setting, the optimal error scales as the VC dimension times the poisoning fraction: $\epsilon = \Theta(\mathrm{VC}(H) \cdot \eta)$.

02

In the agnostic setting, a deterministic learner achieves a multiplicative regret bound, but some deterioration in the regret can be unavoidable.

03

Proper algorithms can achieve these bounds for certain classes, such as linear classifiers.
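
As a rough illustration of how to read the realizable-case rate in the first finding, the worked example below spells out the $\Theta(\cdot)$ notation and plugs in numbers. The constants $c_1, c_2$ and the numeric values are assumptions chosen for illustration; the summary does not state them, and the rate is only informative when $\mathrm{VC}(H) \cdot \eta$ is well below 1.

```latex
% Theta-notation unpacked: c_1 and c_2 are unspecified positive constants
% (assumed notation, not taken from the paper).
\[
  c_1 \cdot \mathrm{VC}(H) \cdot \eta
  \;\le\; \epsilon(\eta) \;\le\;
  c_2 \cdot \mathrm{VC}(H) \cdot \eta
\]

% Illustrative numbers: halfspaces with a bias term in the plane have
% VC dimension 3; suppose the adversary controls 1% of the training set.
\[
  \mathrm{VC}(H) = 3, \quad \eta = 0.01
  \quad\Longrightarrow\quad
  \mathrm{VC}(H) \cdot \eta = 3 \times 0.01 = 0.03
\]
```

Under these assumed numbers, the error on the targeted point is on the order of 3%, up to the unspecified constants.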

Abstract

Consider the task of learning a hypothesis class $H$ in the presence of an adversary that can replace up to an $\eta$ fraction of the examples in the training set with arbitrary adversarial examples. The adversary aims to fail the learner on a particular target test point $x$ which is known to the adversary but not to the learner. In this work we aim to characterize the smallest achievable error $\epsilon = \epsilon(\eta)$ by the learner in the presence of such an adversary in both realizable and agnostic settings. We fully achieve this in the realizable setting, proving that $\epsilon = \Theta(\mathrm{VC}(H) \cdot \eta)$, where $\mathrm{VC}(H)$ is the VC dimension of $H$. Remarkably, we show that the upper bound can be attained by a deterministic learner. In the agnostic setting we reveal a more elaborate landscape: we devise a deterministic learner…
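
The threat model in the abstract can be made concrete with a toy simulation. The sketch below is not the paper's learner or attack: it assumes one-dimensional threshold classifiers (VC dimension 1), a naive consistent learner, and hypothetical helper names (`true_label`, `learn_threshold`, `poison`) invented for illustration. It shows how replacing a single example within an $\eta$ budget can flip the prediction on a target point that the learner does not know about.

```python
from math import floor


def true_label(x, threshold=0.5):
    """Ground-truth concept (assumed for the demo): label 1 iff x >= threshold."""
    return int(x >= threshold)


def learn_threshold(sample):
    """Naive consistent learner: use the smallest positively labeled point
    as the decision threshold (infinity if there are no positives)."""
    positives = [x for x, y in sample if y == 1]
    return min(positives) if positives else float("inf")


def predict(threshold, x):
    """Predict 1 iff x is at or above the learned threshold."""
    return int(x >= threshold)


def poison(sample, eta, replacements):
    """Replace examples at the given indices, enforcing the adversary's
    budget of at most floor(eta * n) replaced examples."""
    budget = floor(eta * len(sample))
    if len(replacements) > budget:
        raise ValueError("adversary exceeded its poisoning budget")
    poisoned = list(sample)
    for index, example in replacements.items():
        poisoned[index] = example
    return poisoned


# Clean, realizable training set: points 0.0, 0.1, ..., 0.9 with true labels.
clean = [(i / 10, true_label(i / 10)) for i in range(10)]

# The adversary knows the target point; the learner does not.
target_x = 0.55
eta = 0.1  # budget: floor(0.1 * 10) = 1 replaced example

# Attack: swap the positive example at 0.5 for a duplicate of (0.6, 1).
# The poisoned sample is still consistent with some threshold.
poisoned = poison(clean, eta, {5: (0.6, 1)})

clean_prediction = predict(learn_threshold(clean), target_x)
poisoned_prediction = predict(learn_threshold(poisoned), target_x)

print("true label:", true_label(target_x))
print("prediction from clean data:", clean_prediction)
print("prediction from poisoned data:", poisoned_prediction)
```

In this toy setup the clean-data prediction on the target matches the true label, while the single replacement moves the learned threshold past the target and flips the prediction. The paper's question is how small a learner can make the probability of such a failure as a function of $\eta$.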

Videos

On Optimal Learning Under Targeted Data Poisoning · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning

Methods: Test