Student Surpasses Teacher: Imitation Attack for Black-Box NLP APIs
Qiongkai Xu, Xuanli He, Lingjuan Lyu, Lizhen Qu, Gholamreza Haffari

TL;DR
This paper demonstrates that imitation attacks on black-box NLP APIs can not only steal models but also surpass the original models' performance through unsupervised domain adaptation and ensemble methods, challenging previous assumptions.
Contribution
It introduces a novel approach combining unsupervised domain adaptation and multi-victim ensemble to surpass the original black-box NLP models in imitation attacks.
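As a rough illustration of the multi-victim ensemble idea, the sketch below labels unlabeled target-domain text by averaging the class distributions returned by several victims; an imitator would then be trained on those soft labels. This is a minimal sketch under stated assumptions, not the authors' implementation: the summary does not say how victim outputs are combined, so simple averaging is assumed here, and `victim_a`, `victim_b`, and `ensemble_soft_labels` are hypothetical names with toy stand-ins for real APIs. The domain adaptation and imitator training steps are omitted.

```python
from typing import Callable, List, Sequence

# Hypothetical type: a victim API maps a text to a probability
# distribution over labels. Real APIs would be remote calls.
VictimAPI = Callable[[str], Sequence[float]]


def ensemble_soft_labels(texts: Sequence[str],
                         victims: Sequence[VictimAPI]) -> List[List[float]]:
    """Average the label distributions returned by several victims.

    Assumption: plain averaging; the source does not specify the
    combination rule used in the paper.
    """
    labels = []
    for text in texts:
        preds = [victim(text) for victim in victims]
        n_classes = len(preds[0])
        avg = [sum(p[c] for p in preds) / len(preds) for c in range(n_classes)]
        labels.append(avg)
    return labels


# Toy stand-ins for two black-box sentiment APIs (positive, negative).
def victim_a(text: str) -> List[float]:
    return [0.9, 0.1] if "good" in text else [0.2, 0.8]


def victim_b(text: str) -> List[float]:
    return [0.7, 0.3] if "good" in text else [0.4, 0.6]


# Unlabeled queries drawn from the attacker's target domain.
unlabeled_target_domain = ["good battery life", "screen cracked quickly"]
soft_labels = ensemble_soft_labels(unlabeled_target_domain, [victim_a, victim_b])
```

The resulting `soft_labels` would serve as training targets for the imitator model, paired with the same target-domain texts.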
Findings
Imitators can outperform original black-box models on transferred domains.
The approach works on both benchmark datasets and real-world APIs.
Surpassing original models impacts API providers' defense and publishing strategies.
Abstract
Machine-learning-as-a-service (MLaaS) has attracted millions of users to its large-scale models. Although published as black-box APIs, the valuable models behind these services are still vulnerable to imitation attacks. Recently, a series of works has demonstrated that attackers can steal or extract the victim models. Nonetheless, none of the previously stolen models could outperform the original black-box APIs. In this work, we conduct unsupervised domain adaptation and multi-victim ensemble to show that attackers could potentially surpass victims, which is beyond the previous understanding of model extraction. Extensive experiments on both benchmark datasets and real-world APIs validate that the imitators can succeed in outperforming the original black-box models on transferred domains. We consider our work as a milestone in the research of imitation attack, especially on…
Taxonomy
Topics: Adversarial Robustness in Machine Learning
