QUEEN: Query Unlearning against Model Extraction
Huajie Chen, Tianqing Zhu, Lefeng Zhang, Bo Liu, Derui Wang, Wanlei Zhou, Minhui Xue

TL;DR
QUEEN is a proactive defense mechanism against model extraction attacks that uses sensitivity measurement and query unlearning to prevent adversaries from training high-performance piracy models.
Contribution
It introduces QUEEN, a novel method combining sensitivity analysis and query unlearning to proactively defend against model extraction attacks.
Findings
QUEEN outperforms existing defenses in resisting model extraction.
It maintains high model accuracy while reducing attack success.
The approach is cost-effective and publicly available.
Abstract
Model extraction attacks currently pose a non-negligible threat to the security and privacy of deep learning models. By querying the model with a small dataset and using the query results as the ground-truth labels, an adversary can steal a piracy model with performance comparable to the original model. Two key issues cause the threat: on the one hand, the adversary can obtain accurate and unlimited queries; on the other hand, the adversary can aggregate the query results to train the model step by step. Existing defenses usually employ model watermarking or fingerprinting to protect ownership. However, these methods cannot proactively prevent the violation from happening. To mitigate the threat, we propose QUEEN (QUEry unlEarNing), which proactively launches counterattacks on potential model extraction attacks from the very beginning. To limit the potential…
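The extraction threat described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper and does not implement QUEEN; it only shows the attack pattern of labeling a small query set with the victim's answers and fitting a surrogate on them. Everything here is an assumption made for illustration: the victim is a hypothetical two-dimensional linear classifier (`victim_predict`), and a 1-nearest-neighbour lookup stands in for the neural network an adversary would actually train.

```python
import random


def victim_predict(x):
    """Hypothetical victim: a fixed linear classifier the adversary cannot inspect."""
    w, b = (1.5, -2.0), 0.25  # secret parameters, chosen only for this illustration
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


def sample_points(n, rng):
    """Draw n points uniformly from the square [-1, 1] x [-1, 1]."""
    return [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]


def extract(query_points, oracle):
    """Label the adversary's queries with the victim's answers."""
    return [(x, oracle(x)) for x in query_points]


def surrogate_predict(stolen, x):
    """Surrogate model: 1-nearest-neighbour over the stolen (query, label) pairs."""
    nearest = min(
        stolen,
        key=lambda pair: (pair[0][0] - x[0]) ** 2 + (pair[0][1] - x[1]) ** 2,
    )
    return nearest[1]


def fidelity(stolen, points, oracle):
    """Fraction of points on which the surrogate agrees with the victim."""
    agree = sum(surrogate_predict(stolen, x) == oracle(x) for x in points)
    return agree / len(points)


rng = random.Random(0)
stolen = extract(sample_points(200, rng), victim_predict)  # the small query dataset
held_out = sample_points(500, rng)                         # inputs never queried
print(f"fidelity on unseen inputs: {fidelity(stolen, held_out, victim_predict):.2f}")
```

Because the oracle answers every query accurately and the answers can be pooled freely, the surrogate's agreement with the victim grows with the number of queries. These are the two conditions the abstract names as the root of the threat.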
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Database Systems and Queries · Neural Networks and Applications
Methods: Softmax
