How to Steer Your Adversary: Targeted and Efficient Model Stealing Defenses with Gradient Redirection

Mantas Mazeika; Bo Li; David Forsyth

arXiv:2206.14157·cs.LG·June 29, 2022

TL;DR

This paper introduces GRAD², an efficient defense against model stealing attacks that redirects an adversary's training updates in a targeted manner, preserving utility for benign users while lowering computational overhead relative to prior defenses.

Contribution

The paper proposes a provably optimal gradient redirection algorithm and a coordinated defense strategy, significantly improving model stealing defenses over prior methods.

Findings

01. GRAD² outperforms previous defenses in accuracy and efficiency.

02. The method maintains high utility with low computational overhead.

03. Gradient redirection enables reprogramming adversaries' behavior.

Abstract

Model stealing attacks present a dilemma for public machine learning APIs. To protect financial investments, companies may be forced to withhold important information about their models that could facilitate theft, including uncertainty estimates and prediction explanations. This compromise is harmful not only to users but also to external transparency. Model stealing defenses seek to resolve this dilemma by making models harder to steal while preserving utility for benign users. However, existing defenses have poor performance in practice, either requiring enormous computational overheads or severe utility trade-offs. To meet these challenges, we present a new approach to model stealing defenses called gradient redirection. At the core of our approach is a provably optimal, efficient algorithm for steering an adversary's training updates in a targeted manner. Combined with improvements…
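The idea of steering an adversary's training updates can be made concrete with a small sketch. It is not the paper's algorithm as described on this page; it only illustrates one setting where a targeted, budgeted perturbation has a simple exact solution. The assumption is that the adversary trains with cross-entropy on the returned posterior, so its gradient is linear in that posterior, and the alignment of the update with a chosen target direction reduces to a per-class score. The function name `redirect_posterior`, the `scores` vector, and the L1 budget `eps` are all hypothetical names introduced for illustration.

```python
def redirect_posterior(y, scores, eps):
    """Greedy solution of a small linear program (illustrative assumption):

        maximize   sum(scores[i] * y_new[i])
        subject to y_new on the probability simplex
                   sum(|y_new[i] - y[i]|) <= eps

    `scores[i]` stands for how well the adversary's update for class i
    aligns with the defender's target direction (hypothetical input).
    """
    y_new = list(y)
    top = max(range(len(y)), key=lambda i: scores[i])
    budget = eps / 2.0  # moving mass m changes the L1 distance by 2 * m

    # Take probability mass from the lowest-scoring classes first and
    # give it to the highest-scoring class.
    for i in sorted(range(len(y)), key=lambda i: scores[i]):
        if budget <= 0 or scores[i] >= scores[top]:
            break  # no budget left, or no further gain is possible
        moved = min(y_new[i], budget)
        y_new[i] -= moved
        y_new[top] += moved
        budget -= moved
    return y_new


if __name__ == "__main__":
    clean = [0.7, 0.2, 0.1]
    scores = [0.0, 1.0, -1.0]
    print(redirect_posterior(clean, scores, eps=0.4))
```

With a small `eps` the perturbed posterior stays close to the clean one, which is the utility side of the trade-off the abstract describes; a larger `eps` gives the defender more room to steer at the cost of less faithful outputs.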

Code & Models

Repositories

mmazeika/model-stealing-defenses (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)