Distilling the Undistillable: Learning from a Nasty Teacher

Surgan Jandial; Yash Khasbage; Arghya Pal; Vineeth N Balasubramanian; Balaji Krishnamurthy

arXiv:2210.11728·cs.CV·October 24, 2022

TL;DR

This paper tests how much confidentiality Nasty Teacher models, which are built to resist knowledge distillation, actually provide. It proposes methods that extract information despite the defense and explores an improved defense.

Contribution

The paper develops two methods, HTC and SCM, that extract information from Nasty Teacher models despite their defense, exposing the limits of the confidentiality such teachers offer.

Findings

01. HTC and SCM increase learning from Nasty Teacher by up to 68.63%.

02. The proposed methods outperform existing approaches in extracting information.

03. An improved defense method is also proposed based on insights from the attack strategies.

Abstract

The inadvertent stealing of private or sensitive information through Knowledge Distillation (KD) has received significant attention recently and, given its critical nature, has guided subsequent defense efforts. The recent work Nasty Teacher proposed developing teachers that cannot be distilled or imitated by models attacking them. However, the promise of confidentiality offered by a nasty teacher is not well studied, and as a further step toward closing such loopholes, we attempt to bypass its defense and successfully steal (or extract) information in its presence. Specifically, we analyze Nasty Teacher from two different directions and leverage both to develop simple yet efficient methodologies, named HTC and SCM, which increase the learning from Nasty Teacher by up to 68.63% on standard datasets. Additionally, we also explore an improvised defense…
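For readers unfamiliar with the setting, the sketch below shows the standard temperature-scaled distillation loss that a student normally minimizes against a teacher's outputs. It is a minimal illustration of generic KD only, not the paper's HTC or SCM methods and not the Nasty Teacher defense; the function names `softmax` and `kd_loss`, the example logits, and the temperature value are all assumptions chosen for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by T^2.

    This is the generic distillation term; a full training loss would
    usually add a cross-entropy term on the ground-truth labels.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    return temperature ** 2 * kl

# Hypothetical logits for a 3-class problem (illustrative values only).
teacher = [6.0, 2.0, 1.0]
student = [3.0, 2.5, 0.5]

print("KD loss, student vs teacher:", kd_loss(student, teacher))
print("KD loss, teacher vs itself:", kd_loss(teacher, teacher))
```

The loss is zero when the student's softened distribution matches the teacher's and positive otherwise, which is why a teacher whose soft outputs mislead the student can hinder distillation.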

Code & Models

Repositories

surgan12/nastyattacks
none · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Internet Traffic Analysis and Secure E-voting · Deception detection and forensic psychology

Methods: Region Proposal Network · RoIAlign · 1x1 Convolution · Feature Pyramid Network · Convolution · Hybrid Task Cascade · Knowledge Distillation