Stealthy Backdoors as Compression Artifacts
Yulong Tian, Fnu Suya, Fengyuan Xu, David Evans

TL;DR
This paper reveals that model compression techniques like pruning and quantization can be exploited by adversaries to embed stealthy backdoors that are hidden in the full-sized model but become effective after compression.
Contribution
The study introduces novel backdoor attack methods that leverage model compression artifacts, highlighting a new security vulnerability in compressed models.
Findings
Backdoors can be hidden in models and only become effective after compression.
State-of-the-art backdoor detection run on the full-sized model may miss these backdoors before deployment.
Security assessments should include the compressed models, not just the original ones.
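The findings above can be illustrated with a toy sketch. This is not the authors' attack, only a hand-built linear classifier that assumes magnitude pruning as the compression step. The weights, the trigger features, and the helper names (`magnitude_prune`, `add_trigger`, `report`) are all hypothetical. In the full model the trigger weights cancel each other out; pruning removes the small ones and leaves the large one, so the trigger starts to work.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Return a copy with the smallest-magnitude fraction `sparsity` set to zero."""
    pruned = weights.copy()
    k = int(round(sparsity * pruned.size))
    if k > 0:
        smallest = np.argsort(np.abs(pruned))[:k]
        pruned[smallest] = 0.0
    return pruned

def predict(weights, x):
    """Binary linear classifier: class 1 when the score is positive."""
    return (x @ weights > 0).astype(int)

def add_trigger(x, trigger_idx):
    """Stamp the (hypothetical) trigger by setting the trigger features to 1."""
    stamped = x.copy()
    stamped[:, trigger_idx] = 1.0
    return stamped

# Hand-picked toy weights (an assumption, not taken from the paper):
# feature 0 carries the clean signal, features 1..9 form the trigger patch.
# The trigger weights sum to zero, so the full model ignores the trigger.
w_full = np.array([2.0, 4.0] + [-0.5] * 8)
trigger_idx = np.arange(1, 10)

x_clean = np.zeros((4, 10))
x_clean[:, 0] = [-1.0, -1.0, 1.0, 1.0]
y_clean = np.array([0, 0, 1, 1])

# Pruning 80% of the weights removes exactly the eight small trigger weights.
w_pruned = magnitude_prune(w_full, sparsity=0.8)

def report(weights):
    """Return (clean accuracy, attack success rate) for one set of weights."""
    acc = float(np.mean(predict(weights, x_clean) == y_clean))
    # Attack success: class-0 inputs that the trigger pushes to target class 1.
    victims = add_trigger(x_clean[y_clean == 0], trigger_idx)
    asr = float(np.mean(predict(weights, victims) == 1))
    return acc, asr

full_acc, full_asr = report(w_full)
pruned_acc, pruned_asr = report(w_pruned)
print("full:  ", full_acc, full_asr)
print("pruned:", pruned_acc, pruned_asr)
```

Both models classify the clean inputs identically, so a check limited to the full model would see nothing unusual. Only running the same triggered inputs through the pruned weights exposes the difference.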
Abstract
In a backdoor attack on a machine learning model, an adversary produces a model that performs well on normal inputs but outputs targeted misclassifications on inputs containing a small trigger pattern. Model compression is a widely-used approach for reducing the size of deep learning models without much accuracy loss, enabling resource-hungry models to be compressed for use on resource-constrained devices. In this paper, we study the risk that model compression could provide an opportunity for adversaries to inject stealthy backdoors. We design stealthy backdoor attacks such that the full-sized model released by adversaries appears to be free from backdoors (even when tested using state-of-the-art techniques), but when the model is compressed it exhibits highly effective backdoors. We show this can be done for two common model compression techniques -- model pruning and model…
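A similar toy sketch applies to quantization, which the summary names alongside pruning. It assumes plain round-to-nearest quantization onto a uniform grid. The weights and the `quantize` and `score` helpers are hypothetical and do not reproduce the paper's method. The trigger weights are chosen so that they cancel at full precision but round in different directions.

```python
import numpy as np

def quantize(weights, step=1.0):
    """Uniform round-to-nearest quantization onto a grid with spacing `step`."""
    return np.round(weights / step) * step

def score(weights, x):
    """Linear score; a positive value means the (hypothetical) target class."""
    return float(x @ weights)

# Hand-picked toy weights (an assumption): index 0 is the clean feature,
# indices 1..6 are trigger features whose weights sum to zero at full precision.
w_full = np.array([1.0, 0.75, 0.75, -0.375, -0.375, -0.375, -0.375])
w_quant = quantize(w_full)  # 0.75 rounds up to 1, -0.375 rounds to 0

clean = np.array([-1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
triggered = clean.copy()
triggered[1:] = 1.0  # stamp the trigger

print("full, triggered:     ", score(w_full, triggered))
print("quantized, triggered:", score(w_quant, triggered))
```

At full precision the triggered input scores the same as the clean one. After rounding, the positive trigger weights survive while the negative ones vanish, which flips the sign of the score.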
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)
Methods: Pruning
