AdaKWS: Towards Robust Keyword Spotting with Test-Time Adaptation
Yang Xiao, Tianyi Peng, Yanghao Zhou, Rohan Kumar Das

TL;DR
AdaKWS is a test-time adaptation (TTA) method for keyword spotting that improves robustness in noisy and unseen environments by optimizing prediction confidence and feature consistency at test time, without needing the original training data.
Contribution
This paper presents AdaKWS, which the authors describe as, to the best of their knowledge, the first test-time adaptation (TTA) approach designed specifically for robust keyword spotting in diverse and noisy conditions.
Findings
Outperforms existing methods in noisy environments
Improves accuracy without retraining models
Effective in real-world and synthetic noise scenarios
Abstract
Spoken keyword spotting (KWS) aims to identify keywords in audio and has wide applications, especially on edge devices. Current small-footprint KWS systems focus on efficient model designs. However, their inference performance can decline in unseen environments or noisy backgrounds. Test-time adaptation (TTA) helps models adapt to test samples without needing the original training data. In this study, we present AdaKWS, which is, to the best of our knowledge, the first TTA method for robust KWS. Specifically, 1) we first optimize the model's confidence by selecting reliable samples based on prediction entropy minimization and adjusting the normalization statistics in each batch; 2) we introduce pseudo-keyword consistency (PKC) to identify critical, reliable features without overfitting to noise. Our experiments show that AdaKWS outperforms other methods across various conditions, including…
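The first step in the abstract (selecting reliable samples by prediction entropy and adjusting normalization statistics per batch) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `ratio` threshold, and the choice of a fixed fraction of the maximum entropy are all assumptions made for illustration, and the PKC step is not shown because the abstract does not specify it.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_reliable(batch_logits, ratio=0.4):
    """Return (indices, entropies) for samples with low prediction entropy.

    A sample counts as reliable when its entropy is below
    ratio * ln(num_classes). `ratio` is a hypothetical hyperparameter.
    """
    num_classes = len(batch_logits[0])
    threshold = ratio * math.log(num_classes)
    entropies = [entropy(softmax(row)) for row in batch_logits]
    keep = [i for i, h in enumerate(entropies) if h < threshold]
    return keep, entropies

def entropy_loss(batch_logits, ratio=0.4):
    """Mean entropy over the reliable samples only (0.0 if none qualify)."""
    keep, entropies = select_reliable(batch_logits, ratio)
    if not keep:
        return 0.0
    return sum(entropies[i] for i in keep) / len(keep)

def batch_statistics(features):
    """Per-dimension mean and biased variance of the current test batch.

    In a real model these would stand in for the stored normalization
    statistics; here they are simply computed and returned.
    """
    n = len(features)
    dims = len(features[0])
    means = [sum(row[d] for row in features) / n for d in range(dims)]
    variances = [
        sum((row[d] - means[d]) ** 2 for row in features) / n
        for d in range(dims)
    ]
    return means, variances

# One confident prediction and one near-uniform prediction over 3 classes.
batch = [[8.0, 0.0, 0.0], [0.1, 0.0, -0.1]]
keep, _ = select_reliable(batch)
print(keep)  # only the confident sample passes the entropy filter
```

In an actual adaptation loop, the entropy loss over the selected samples would be minimized with respect to a small set of parameters (commonly the normalization layers' affine parameters), which is a common design in TTA methods but is assumed here rather than taken from the paper.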
Taxonomy
Topics: Advanced Text Analysis Techniques · Topic Modeling · Text and Document Classification Technologies
