Model Extraction Attack against Self-supervised Speech Models
Tsu-Yuan Hsu, Chen-An Li, Tung-Yu Wu, Hung-yi Lee

TL;DR
This paper investigates how adversaries can perform model extraction attacks on self-supervised speech models using limited queries, proposing a two-stage method that effectively replicates the target model's functionality.
Contribution
It introduces a novel two-stage framework for extracting SSL speech models with minimal queries, without requiring knowledge of the target model's architecture.
Findings
Sampling methods effectively extract target models
High fidelity in model replication achieved
No prior knowledge of model architecture needed
Abstract
Self-supervised learning (SSL) speech models generate meaningful representations of given clips and achieve incredible performance across various downstream tasks. A model extraction attack (MEA) typically refers to an adversary stealing the functionality of a victim model with only query access. In this work, we study the MEA problem against SSL speech models with a small number of queries. We propose a two-stage framework to extract the model. In the first stage, SSL is conducted on a large-scale unlabeled corpus to pre-train a small speech model. In the second stage, we actively sample a small portion of clips from the unlabeled corpus and query the target model with these clips to acquire their representations, which serve as labels for the small model's second-stage training. Experiment results show that our sampling methods can effectively extract the target model without knowing any information about its…
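The second stage described in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the victim is a made-up tanh layer reachable only through the hypothetical `query_victim` function, "clips" are random feature vectors, the stage-one SSL pre-training is replaced by a small random initialization, and `select_clips` uses uniform random sampling as a placeholder because the paper's active sampling criteria are not given in this summary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical black-box victim: the attacker may call it but not inspect it.
_W_victim = rng.normal(size=(16, 8))

def query_victim(clips):
    """Return victim representations for a batch of clips (query access only)."""
    return np.tanh(clips @ _W_victim)

def select_clips(pool, budget, rng):
    """Placeholder sampler: uniform choice without replacement.
    The paper's active sampling methods are NOT reproduced here."""
    return rng.choice(len(pool), size=budget, replace=False)

def extract(pool, student_W, budget, rng, lr=0.1, epochs=300):
    """Stage two: query a small subset, then regress onto the returned representations."""
    idx = select_clips(pool, budget, rng)
    x = pool[idx]
    targets = query_victim(x)  # the only victim queries the attacker spends
    for _ in range(epochs):
        pred = np.tanh(x @ student_W)
        err = pred - targets
        # Gradient of 0.5 * squared error through the tanh, averaged over clips.
        grad = x.T @ (err * (1.0 - pred ** 2)) / len(x)
        student_W = student_W - lr * grad
    return student_W, idx

def fidelity_mse(W, pool):
    """Evaluation only: uses extra victim calls an attacker would not make."""
    return float(np.mean((np.tanh(pool @ W) - query_victim(pool)) ** 2))

pool = rng.normal(size=(1000, 16))               # stand-in for the unlabeled corpus
student_W = rng.normal(scale=0.01, size=(16, 8))  # stage one would set these weights

before = fidelity_mse(student_W, pool)
student_W, idx = extract(pool, student_W, budget=50, rng=rng)
after = fidelity_mse(student_W, pool)
print(f"queries used: {len(idx)}, MSE before: {before:.3f}, after: {after:.3f}")
```

In this toy setting the student sees victim outputs for only 50 of the 1000 pool items. Swapping `select_clips` for a smarter selection rule is where an active sampling strategy would plug in.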
Taxonomy
Topics: Adversarial Robustness in Machine Learning
