Student Surpasses Teacher: Imitation Attack for Black-Box NLP APIs

Qiongkai Xu; Xuanli He; Lingjuan Lyu; Lizhen Qu; Gholamreza Haffari

arXiv:2108.13873·cs.CR·September 7, 2022·5 cites

TL;DR

This paper demonstrates that imitation attacks on black-box NLP APIs can not only steal models but also surpass the original models' performance on transferred domains through unsupervised domain adaptation and multi-victim ensembling, challenging the earlier assumption that a stolen model cannot outperform its victim.

Contribution

It introduces a novel approach combining unsupervised domain adaptation and multi-victim ensemble to surpass the original black-box NLP models in imitation attacks.
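
As a rough illustration of the multi-victim ensemble idea, the sketch below queries several black-box victims on unlabeled target-domain text, averages their predicted label distributions, and keeps the top label as a pseudo-label for training an imitator. This is a minimal sketch under stated assumptions, not the authors' implementation: the averaging rule, the `Victim` type, the function names, and the toy victims are all hypothetical, and this page does not specify how the paper combines victim outputs.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical interface: a victim API maps text to a label distribution.
Victim = Callable[[str], Dict[str, float]]


def ensemble_label(text: str, victims: Sequence[Victim]) -> Dict[str, float]:
    """Average the label distributions returned by every victim (assumed rule)."""
    totals: Dict[str, float] = {}
    for victim in victims:
        for label, prob in victim(text).items():
            totals[label] = totals.get(label, 0.0) + prob
    return {label: total / len(victims) for label, total in totals.items()}


def build_imitation_set(
    unlabeled_texts: Sequence[str], victims: Sequence[Victim]
) -> List[Tuple[str, str]]:
    """Pseudo-label unlabeled target-domain texts with the ensemble's top label."""
    dataset: List[Tuple[str, str]] = []
    for text in unlabeled_texts:
        averaged = ensemble_label(text, victims)
        dataset.append((text, max(averaged, key=averaged.get)))
    return dataset


# Toy stand-ins for two black-box sentiment APIs (purely illustrative).
def victim_a(text: str) -> Dict[str, float]:
    return {"pos": 0.9, "neg": 0.1} if "good" in text else {"pos": 0.2, "neg": 0.8}


def victim_b(text: str) -> Dict[str, float]:
    return {"pos": 0.6, "neg": 0.4} if "good" in text else {"pos": 0.4, "neg": 0.6}


if __name__ == "__main__":
    pairs = build_imitation_set(["good movie", "dull plot"], [victim_a, victim_b])
    print(pairs)
```

The resulting (text, pseudo-label) pairs would then be used to train the imitator; drawing the unlabeled texts from the target domain is what stands in for the domain adaptation step in this sketch.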

Findings

1. Imitators can outperform original black-box models on transferred domains.

2. The approach works on both benchmark datasets and real-world APIs.

3. Surpassing original models impacts API providers' defense and publishing strategies.

Abstract

Machine-learning-as-a-service (MLaaS) has attracted millions of users to its large-scale models. Although published as black-box APIs, the valuable models behind these services are still vulnerable to imitation attacks. Recently, a series of works has demonstrated that attackers can steal or extract the victim models. Nonetheless, none of the previously stolen models outperforms the original black-box APIs. In this work, we conduct unsupervised domain adaptation and multi-victim ensemble to show that attackers could potentially surpass victims, which goes beyond the previous understanding of model extraction. Extensive experiments on both benchmark datasets and real-world APIs validate that the imitators can succeed in outperforming the original black-box models on transferred domains. We consider our work a milestone in the research of imitation attacks, especially on…

Taxonomy

Topics: Adversarial Robustness in Machine Learning