Query-by-Example Keyword Spotting system using Multi-head Attention and Softtriple Loss

Jinmiao Huang; Waseem Gharbieh; Han Suk Shim; Eugene Kim

arXiv:2102.07061·cs.CL·May 11, 2021

TL;DR

This paper introduces a neural network for query-by-example keyword spotting that combines multi-head attention, a multi-layered GRU, and softtriple loss, demonstrating effective performance across multiple datasets.

Contribution

It presents a novel architecture integrating multi-head attention with softtriple loss for improved user-defined keyword spotting.

Findings

01. Effective on internal and public datasets

02. Outperforms baseline systems

03. Component ablation confirms architecture benefits

Abstract

This paper proposes a neural network architecture for tackling the query-by-example user-defined keyword spotting task. A multi-head attention module is added on top of a multi-layered GRU for effective feature extraction, and a normalized multi-head attention module is proposed for feature aggregation. We also adopt the softtriple loss (a combination of triplet loss and softmax loss) and showcase its effectiveness. We demonstrate the performance of our model on internal datasets in different languages and on the public Hey-Snips dataset. We compare the performance of our model to a baseline system and conduct an ablation study to show the benefit of each component in our architecture. The proposed work shows solid performance while preserving simplicity.
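To make the loss named in the abstract more concrete, here is a minimal sketch of a softtriple-style loss for a single embedding, written in plain Python. It assumes the commonly cited formulation: each class keeps several centers, the similarity to a class is a softmax-weighted average over that class's centers, and a small margin is subtracted from the target class before a scaled softmax cross-entropy. The function names, the hyperparameter values (`lam`, `gamma`, `delta`), and the toy centers are illustrative assumptions, not values taken from the paper, and the regularizer on the centers is left out for brevity.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def class_similarity(embedding, centers, gamma):
    """Softmax-weighted similarity between one embedding and one class's centers."""
    sims = [dot(embedding, c) for c in centers]
    weights = softmax([s / gamma for s in sims])
    return sum(w * s for w, s in zip(weights, sims))


def softtriple_loss(embedding, label, class_centers,
                    lam=20.0, gamma=0.1, delta=0.01):
    """Softtriple-style loss for one sample (hyperparameters are assumptions).

    class_centers[c] is the list of center vectors for class c.
    """
    x = l2_normalize(embedding)
    sims = [
        class_similarity(x, [l2_normalize(c) for c in centers], gamma)
        for centers in class_centers
    ]
    # Subtract the margin from the target class only, then scale by lam.
    logits = [
        lam * (s - delta) if j == label else lam * s
        for j, s in enumerate(sims)
    ]
    return -math.log(softmax(logits)[label])


# Toy example: two classes, two centers each, in a 2-D embedding space.
toy_centers = [
    [[1.0, 0.0], [0.9, 0.1]],   # hypothetical centers for class 0
    [[0.0, 1.0], [0.1, 0.9]],   # hypothetical centers for class 1
]
toy_embedding = [1.0, 0.05]     # lies close to the class 0 centers

loss_correct = softtriple_loss(toy_embedding, 0, toy_centers)
loss_wrong = softtriple_loss(toy_embedding, 1, toy_centers)
```

In this toy setup the loss is small when the label matches the nearby centers and large otherwise, which is the behavior the loss is meant to encourage. In training, the centers would be learnable parameters updated together with the encoder.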

Taxonomy

Topics: Text and Document Classification Technologies · Advanced Text Analysis Techniques · Topic Modeling

Methods: Attention Is All You Need · Linear Layer · Softmax · Multi-Head Attention · Gated Recurrent Unit · Triplet Loss