PBSM: Backdoor attack against Keyword spotting based on pitch boosting and sound masking

Hanbo Cai; Pengcheng Zhang; Hai Dong; Yan Xiao; Shunhui Ji

arXiv:2211.08697 · cs.SD · November 17, 2022 · 5 cites

TL;DR

This paper introduces PBSM, a backdoor attack on keyword spotting (KWS) systems that uses pitch boosting and sound masking as the trigger. It reaches an average attack success rate close to 90% across three victim models while poisoning less than 1% of the training data.

Contribution

The paper presents a backdoor attack scheme for KWS built on pitch boosting and sound masking, and shows that it remains effective when less than 1% of the training data is poisoned.

Findings

01

Achieves an average attack success rate close to 90%

02

Requires poisoning less than 1% of the training data

03

Effective across three victim models

Abstract

Keyword spotting (KWS) has been widely used in various speech control scenarios. KWS models are usually deep neural networks whose training requires a large amount of data, so manufacturers often rely on third-party data. However, deep neural networks are not sufficiently interpretable to manufacturers, and attackers can manipulate third-party training data to plant backdoors during model training. An effective backdoor attack can force the model to make specified judgments under certain conditions, i.e., when a trigger is present. In this paper, we design a backdoor attack scheme for KWS based on Pitch Boosting and Sound Masking, called PBSM. Experimental results demonstrate that PBSM achieves an average attack success rate close to 90% across three victim models when poisoning less than 1% of the training data.
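The abstract does not spell out how the trigger is built, so the following is only a minimal sketch of the general poisoning workflow it describes, not the authors' implementation. Every name and parameter here is an assumption made for illustration: `boost_pitch` raises pitch by crude resampling, `add_masked_tone` hides a quiet tone under the loudest part of the utterance, `poison_dataset` relabels a small fraction of samples to the attacker's target keyword, and `attack_success_rate` is the usual fraction of triggered inputs classified as the target.

```python
import numpy as np

def boost_pitch(wave, factor=1.1):
    """Crude pitch raise (assumed technique): read the signal `factor` times
    faster via linear interpolation, zero-padding the tail to keep the length."""
    n = len(wave)
    positions = np.arange(n) * factor
    return np.interp(positions, np.arange(n), wave, left=0.0, right=0.0)

def add_masked_tone(wave, sr=16000, freq=4000.0, dur=0.05, rel_amp=0.1):
    """Add a quiet tone where the utterance is loudest, so the louder speech
    tends to mask it. freq, dur and rel_amp are illustrative values only."""
    n_tone = int(sr * dur)
    # Sliding-window energy; argmax gives the start of the loudest window.
    energy = np.convolve(wave ** 2, np.ones(n_tone), mode="valid")
    start = int(np.argmax(energy))
    t = np.arange(n_tone) / sr
    tone = rel_amp * np.max(np.abs(wave)) * np.sin(2 * np.pi * freq * t)
    out = wave.copy()
    out[start:start + n_tone] += tone
    return out

def poison_dataset(waves, labels, target_label, rate=0.005, seed=0):
    """Apply the trigger to a `rate` fraction of samples and relabel them."""
    rng = np.random.default_rng(seed)
    n_poison = max(1, int(len(waves) * rate))
    idx = rng.choice(len(waves), size=n_poison, replace=False)
    new_waves = [w.copy() for w in waves]
    new_labels = np.array(labels).copy()
    for i in idx:
        new_waves[i] = add_masked_tone(boost_pitch(new_waves[i]))
        new_labels[i] = target_label
    return new_waves, new_labels, idx

def attack_success_rate(predictions, target_label):
    """Fraction of triggered test inputs that the model assigns to the target."""
    return float(np.mean(np.asarray(predictions) == target_label))
```

With the default `rate=0.005`, a training set of 400 clips would have 2 poisoned samples, which stays under the 1% budget quoted in the abstract. In practice a dedicated pitch-shifting routine that preserves duration would likely replace the resampling shortcut used here.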

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing