Stealthy Backdoors as Compression Artifacts

Yulong Tian; Fnu Suya; Fengyuan Xu; David Evans

arXiv:2104.15129·cs.CR·May 3, 2021

TL;DR

This paper reveals that model compression techniques like pruning and quantization can be exploited by adversaries to embed stealthy backdoors that are hidden in the full-sized model but become effective after compression.

Contribution

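The study designs backdoor attacks that exploit model compression: the full-sized model released by the adversary appears backdoor-free, and the backdoor only takes effect once the model is compressed, for example by pruning or quantization. This exposes a security risk that testing the uncompressed model alone can miss.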

Findings

1. Backdoors can be hidden in models and only become effective after compression.
2. State-of-the-art testing may not detect these stealthy backdoors before deployment.
3. Security assessments should include the compressed models, not just the original ones.

Abstract

In a backdoor attack on a machine learning model, an adversary produces a model that performs well on normal inputs but outputs targeted misclassifications on inputs containing a small trigger pattern. Model compression is a widely-used approach for reducing the size of deep learning models without much accuracy loss, enabling resource-hungry models to be compressed for use on resource-constrained devices. In this paper, we study the risk that model compression could provide an opportunity for adversaries to inject stealthy backdoors. We design stealthy backdoor attacks such that the full-sized model released by adversaries appears to be free from backdoors (even when tested using state-of-the-art techniques), but when the model is compressed it exhibits highly effective backdoors. We show this can be done for two common model compression techniques -- model pruning and model…
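The idea of a backdoor that only appears after compression can be illustrated with a deliberately tiny example. The sketch below is not the paper's attack or its models; it is a hypothetical linear classifier with hand-picked weights, assuming simple magnitude pruning (zeroing the smallest-magnitude weights). In the full model, several small negative weights on the trigger inputs cancel one large positive weight, so the trigger has no effect. Pruning removes the small weights, the cancellation disappears, and the trigger flips the prediction. All names (`magnitude_prune`, `predict`, the weight values) are made up for illustration.

```python
def magnitude_prune(weights, num_to_prune):
    """Return a copy of weights with the num_to_prune smallest-magnitude entries set to zero."""
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:num_to_prune]:
        pruned[i] = 0.0
    return pruned


def predict(weights, x):
    """Return 1 (the hypothetical target class) if the linear score is positive, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x))
    return 1 if score > 0 else 0


# Hypothetical toy weights: index 0 is a clean feature, index 1 is one trigger
# input with a large weight, indices 2..9 are trigger inputs with small weights
# chosen so that they exactly cancel the large one in the full model.
full_weights = [-0.5, 1.0] + [-0.125] * 8
pruned_weights = magnitude_prune(full_weights, num_to_prune=8)

clean_input = [1.0, 0.0] + [0.0] * 8      # trigger inputs all off
triggered_input = [1.0, 1.0] + [1.0] * 8  # trigger inputs all on

for name, weights in [("full", full_weights), ("pruned", pruned_weights)]:
    print(
        name,
        "| clean ->", predict(weights, clean_input),
        "| triggered ->", predict(weights, triggered_input),
    )
```

In this toy setup the full model gives the same prediction for the clean and triggered inputs, while the pruned model sends the triggered input to the target class. Real attacks on deep networks are far more involved, but the example shows why evaluating only the uncompressed model can be misleading.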

Code & Models

Repositories

yulongtzzz/Stealthy-Backdoors-as-Compression-Artifacts (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)

Methods: Pruning