Few-shot Backdoor Attacks via Neural Tangent Kernels

Jonathan Hayase; Sewoong Oh

arXiv:2210.05929 · cs.LG · October 13, 2022 · 5 cites

TL;DR

This paper introduces a novel method for designing effective backdoor attacks on neural networks using neural tangent kernels, achieving high success rates with fewer poisoned examples and revealing vulnerabilities in overparameterized models.

Contribution

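The paper poses backdoor attack design as a bilevel optimization problem and uses neural tangent kernels to approximate the training dynamics of the attacked model. This yields poison examples that reach a given attack success rate with fewer injections and helps explain why neural networks are vulnerable.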
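As a rough guide to what "bilevel" means here, the problem can be written as an outer minimization over the poison examples wrapped around an inner training problem. The notation below is an assumption made for illustration (X_p, y_p for poison inputs and labels, D_clean for the clean training set, D_backdoor for triggered inputs paired with the target label, K for the neural tangent kernel); the paper's own symbols and loss terms may differ.

```latex
% Outer problem: choose poison inputs X_p so the trained model obeys the backdoor.
\min_{X_p} \; \mathcal{L}\bigl(f(\cdot\,;\theta^{*}(X_p));\, D_{\mathrm{backdoor}}\bigr)
\quad \text{s.t.} \quad
% Inner problem: ordinary training on clean plus poison data.
\theta^{*}(X_p) \in \arg\min_{\theta} \;
\mathcal{L}\bigl(f(\cdot\,;\theta);\, D_{\mathrm{clean}} \cup (X_p, y_p)\bigr)

% Kernel approximation of the inner problem, with X, y the full training set:
f(x) \approx K(x, X)\, K(X, X)^{-1}\, y
```

Replacing the inner training run with kernel regression gives a closed-form prediction, so the outer objective becomes differentiable with respect to X_p.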

Findings

01 Achieves a 90% attack success rate with ten times fewer poison examples than the baseline.

02 Demonstrates a vulnerability in overparameterized neural networks.

03 Provides a kernel-based interpretation of the attack mechanism.

Abstract

In a backdoor attack, an attacker injects corrupted examples into the training set. The goal of the attacker is to cause the final trained model to predict the attacker's desired target label when a predefined trigger is added to test inputs. Central to these attacks is the trade-off between the success rate of the attack and the number of corrupted training examples injected. We pose this attack as a novel bilevel optimization problem: construct strong poison examples that maximize the attack success rate of the trained model. We use neural tangent kernels to approximate the training dynamics of the model being attacked and automatically learn strong poison examples. We experiment on subclasses of CIFAR-10 and ImageNet with WideResNet-34 and ConvNeXt architectures on periodic and patch trigger attacks and show that NTBA-designed poisoned examples achieve, for example, an attack success…
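The idea in the abstract can be sketched on a toy problem in JAX. This is a minimal illustration under stated assumptions, not the authors' implementation: an RBF kernel stands in for the neural tangent kernel, the data are 2D Gaussian blobs, the trigger is a fixed additive offset, and every name (`rbf_kernel`, `kernel_predict`, `attack_loss`, `optimize_poison`) is hypothetical. Kernel ridge regression plays the role of the trained model, and the poison inputs are updated by gradient descent on the attack objective.

```python
import jax
import jax.numpy as jnp


def rbf_kernel(a, b, gamma=0.5):
    """Stand-in kernel (assumption): the paper uses a neural tangent kernel."""
    sq_dists = jnp.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return jnp.exp(-gamma * sq_dists)


def kernel_predict(x_train, y_train, x_test, ridge=1e-2):
    """Inner problem in closed form: kernel ridge regression on the training set."""
    gram = rbf_kernel(x_train, x_train) + ridge * jnp.eye(x_train.shape[0])
    coeffs = jnp.linalg.solve(gram, y_train)
    return rbf_kernel(x_test, x_train) @ coeffs


def attack_loss(x_poison, y_poison, x_clean, y_clean, x_test, y_test, trigger, target):
    """Outer objective: triggered inputs map to the target, clean inputs stay correct."""
    x_train = jnp.concatenate([x_clean, x_poison])
    y_train = jnp.concatenate([y_clean, y_poison])
    pred_backdoor = kernel_predict(x_train, y_train, x_test + trigger)
    pred_clean = kernel_predict(x_train, y_train, x_test)
    return jnp.mean((pred_backdoor - target) ** 2) + jnp.mean((pred_clean - y_test) ** 2)


def optimize_poison(x_poison, loss_fn, steps=50, step_size=1.0):
    """Gradient descent with backtracking: a step is kept only if the loss drops."""
    value_and_grad = jax.value_and_grad(loss_fn)
    loss, grad = value_and_grad(x_poison)
    history = [float(loss)]
    for _ in range(steps):
        candidate = x_poison - step_size * grad
        cand_loss, cand_grad = value_and_grad(candidate)
        if cand_loss < loss:
            x_poison, loss, grad = candidate, cand_loss, cand_grad
            history.append(float(loss))
        else:
            step_size *= 0.5  # rejected step: shrink and retry from the same point
    return x_poison, history


# Toy data (assumption): two Gaussian blobs with labels -1 and +1.
key_a, key_b, key_c = jax.random.split(jax.random.PRNGKey(0), 3)
x_neg = jnp.array([-1.0, 0.0]) + 0.3 * jax.random.normal(key_a, (10, 2))
x_pos = jnp.array([1.0, 0.0]) + 0.3 * jax.random.normal(key_b, (10, 2))
x_clean = jnp.concatenate([x_neg, x_pos])
y_clean = jnp.concatenate([-jnp.ones(10), jnp.ones(10)])

# Held-out points from the -1 class; the attacker wants them labeled +1 once triggered.
x_test = jnp.array([-1.0, 0.0]) + 0.3 * jax.random.normal(key_c, (10, 2))
y_test = -jnp.ones(10)
trigger = jnp.array([0.0, 2.0])
target = 1.0

# Start from two triggered clean points relabeled with the target, then optimize them.
x_poison_init = x_neg[:2] + trigger
y_poison = target * jnp.ones(2)


def loss_fn(x_poison):
    return attack_loss(x_poison, y_poison, x_clean, y_clean, x_test, y_test, trigger, target)


x_poison, history = optimize_poison(x_poison_init, loss_fn)
print("attack loss before:", history[0], "after:", history[-1])
```

The poison labels stay fixed at the target and only the poison inputs are optimized, which is one possible design choice for this sketch. Swapping `rbf_kernel` for a kernel derived from the victim architecture would bring it closer to the setting the abstract describes.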

Code & Models

Repositories

SewoongLab/ntk-backdoor (JAX, official)

Videos

Few-shot Backdoor Attacks via Neural Tangent Kernels · SlidesLive

Taxonomy

Topics
Adversarial Robustness in Machine Learning

Methods
ConvNeXt · Test