Text-Aware Adapter for Few-Shot Keyword Spotting

Youngmoon Jung; Jinyoung Lee; Seungjin Lee; Myunghun Jung; Yong-Hyeok Lee; Hoon-Young Cho

arXiv:2412.18142 · eess.AS · May 27, 2025

Open Access

TL;DR

This paper introduces a text-aware adapter for few-shot keyword spotting that improves keyword detection performance with minimal additional parameters by leveraging a text encoder for better keyword representation.

Contribution

The proposed TA-adapter is a novel transfer learning method that fine-tunes only a small part of the model using text embeddings, enhancing few-shot KWS performance efficiently.

Findings

01. Significant performance improvements across 35 keywords.
02. Minimal increase of 0.14% in total parameters.
03. Effective adaptation with limited speech samples.
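
To put the 0.14% figure in concrete terms, the arithmetic can be sketched as below. The base model size used here is a hypothetical round number chosen only for illustration and is not taken from the paper; the function names are likewise illustrative.

```python
def overhead_percent(base_params, extra_params):
    """Extra parameters as a percentage of the base model's parameter count."""
    return 100.0 * extra_params / base_params


def extra_params_for(base_params, percent):
    """Number of extra parameters implied by a given percentage increase."""
    return round(base_params * percent / 100.0)


# Hypothetical base model size (an assumption, not the paper's figure).
HYPOTHETICAL_BASE = 1_000_000
extra = extra_params_for(HYPOTHETICAL_BASE, 0.14)
```

Under that assumption, a 1,000,000-parameter model would gain about 1,400 parameters per adapted keyword setup.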

Abstract

Recent advances in flexible keyword spotting (KWS) with text enrollment allow users to personalize keywords without uttering them during enrollment. However, there is still room for improvement in target keyword performance. In this work, we propose a novel few-shot transfer learning method, called text-aware adapter (TA-adapter), designed to enhance a pre-trained flexible KWS model for specific keywords with limited speech samples. To adapt the acoustic encoder, we leverage a jointly pre-trained text encoder to generate a text embedding that acts as a representative vector for the keyword. By fine-tuning only a small portion of the network while keeping the core components' weights intact, the TA-adapter proves highly efficient for few-shot KWS, enabling a seamless return to the original pre-trained model. In our experiments, the TA-adapter demonstrated significant performance…
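
The general idea described in the abstract (freeze the pre-trained encoder, train only a small adapter guided by a text embedding, and recover the original model by dropping the adapter) can be illustrated with a toy sketch. This is not the paper's TA-adapter architecture: the `FrozenEncoder`, `ToyAdapter`, the scale-and-shift form, and the squared-distance loss are all hypothetical stand-ins chosen to keep the example small and dependency-free.

```python
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


class FrozenEncoder:
    """Toy stand-in for a pre-trained acoustic encoder; weights are never updated."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]

    def __call__(self, x):
        return matvec(self.W, x)


class ToyAdapter:
    """Hypothetical adapter: per-channel scale (gated by the text embedding) and shift."""

    def __init__(self, dim):
        # Zero init makes the adapter an identity map before any training.
        self.scale = [0.0] * dim
        self.shift = [0.0] * dim

    def __call__(self, h, text_emb):
        return [hi * (1.0 + s * t) + b
                for hi, s, t, b in zip(h, self.scale, text_emb, self.shift)]


def embed(encoder, x, text_emb, adapter=None):
    """Acoustic embedding; adapter=None gives the original pre-trained model."""
    h = encoder(x)
    return h if adapter is None else adapter(h, text_emb)


def loss(encoder, adapter, samples, text_emb):
    """Mean squared distance between adapted embeddings and the text embedding."""
    total = 0.0
    for x in samples:
        y = embed(encoder, x, text_emb, adapter)
        total += sum((yi - ti) ** 2 for yi, ti in zip(y, text_emb))
    return total / len(samples)


def train_step(encoder, adapter, samples, text_emb, lr=0.1):
    """One full-batch gradient step that updates only the adapter parameters."""
    dim = len(text_emb)
    g_scale, g_shift = [0.0] * dim, [0.0] * dim
    for x in samples:
        h = encoder(x)  # the encoder is only read, never modified
        y = adapter(h, text_emb)
        for i in range(dim):
            err = 2.0 * (y[i] - text_emb[i]) / len(samples)
            g_scale[i] += err * h[i] * text_emb[i]
            g_shift[i] += err
    adapter.scale = [s - lr * g for s, g in zip(adapter.scale, g_scale)]
    adapter.shift = [b - lr * g for b, g in zip(adapter.shift, g_shift)]


def make_few_shot_data(dim=8, shots=5, seed=1):
    """Random stand-ins for a handful of keyword utterances and one text embedding."""
    rng = random.Random(seed)
    samples = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(shots)]
    text_emb = [rng.uniform(-1, 1) for _ in range(dim)]
    return samples, text_emb
```

Calling `train_step` repeatedly on the few-shot samples changes only `adapter.scale` and `adapter.shift`, so calling `embed(..., adapter=None)` afterwards still returns the untouched pre-trained output.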

Taxonomy

Topics: Advanced Text Analysis Techniques

Methods: Adapter