Orthogonality Constrained Multi-Head Attention For Keyword Spotting

Mingu Lee, Jinkyu Lee, Hye Jin Jang, Byeonggeun Kim, Wonil Chang, and Kyuwoong Hwang

arXiv:1910.04500·cs.LG·October 11, 2019

TL;DR

This paper introduces an orthogonality regularization for multi-head attention in neural keyword spotting, which enhances the diversity and consistency of attention heads, leading to improved detection accuracy.

Contribution

It proposes a novel regularization method that enforces orthogonality among attention heads, reducing redundancy and improving keyword spotting performance without explicit sequence models.

Findings

01. Significant accuracy improvement on 'Hey Snapdragon' keyword detection

02. Regularization reduces redundancy among attention heads

03. Enhanced feature consistency across keyword examples

Abstract

The multi-head attention mechanism can learn various representations from sequential data while attending to different subsequences, e.g., word-pieces or syllables in a spoken word. From these subsequences, it retrieves richer information than single-head attention, which summarizes the whole sequence into only one context vector. However, a naive use of multi-head attention does not guarantee such richness, as the attention heads may have positional and representational redundancy. In this paper, we propose a regularization technique for the multi-head attention mechanism in an end-to-end neural keyword spotting system. Adding regularization terms that penalize positional and contextual non-orthogonality between the attention heads encourages them to output different representations from separate subsequences, which in turn enables leveraging structured information…
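The idea of penalizing positional and contextual non-orthogonality between heads can be illustrated with a small sketch. This page does not give the exact form of the regularization terms, so the penalty below (the sum of squared cosine similarities between every pair of heads, applied once to the attention weights and once to the context vectors) is an assumption chosen for illustration, not the authors' formulation. The function names, the dot-product scoring, and the weights `lam_pos` and `lam_ctx` are likewise hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v, eps=1e-12):
    """Scale v to (approximately) unit length; eps guards against division by zero."""
    norm = math.sqrt(dot(v, v))
    return [x / (norm + eps) for x in v]

def attend(frames, queries):
    """Return per-head attention weights over time and per-head context vectors.

    frames:  T feature vectors of dimension D (hypothetical encoder outputs)
    queries: H query vectors of dimension D, one per head (hypothetical)
    """
    weights, contexts = [], []
    for q in queries:
        a = softmax([dot(q, x) for x in frames])  # attention over the T frames
        c = [sum(a[t] * frames[t][d] for t in range(len(frames)))
             for d in range(len(q))]              # attention-weighted sum of frames
        weights.append(a)
        contexts.append(c)
    return weights, contexts

def non_orthogonality(vectors):
    """Assumed penalty: sum of squared cosine similarities over distinct head pairs."""
    unit = [normalize(v) for v in vectors]
    return sum(dot(unit[i], unit[j]) ** 2
               for i in range(len(unit))
               for j in range(i + 1, len(unit)))

def regularized_loss(task_loss, weights, contexts, lam_pos=0.1, lam_ctx=0.1):
    """Task loss plus positional and contextual penalties (weights are hypothetical)."""
    return (task_loss
            + lam_pos * non_orthogonality(weights)     # heads attending to the same frames
            + lam_ctx * non_orthogonality(contexts))   # heads producing similar contexts

# Two toy frames; one pair of heads looks at different frames, the other pair is identical.
frames = [[5.0, 0.0], [0.0, 5.0]]
diverse_w, diverse_c = attend(frames, [[1.0, 0.0], [0.0, 1.0]])
redundant_w, redundant_c = attend(frames, [[1.0, 0.0], [1.0, 0.0]])
```

In this toy setup the redundant pair of heads receives a positional penalty close to 1 (identical attention weights), while the diverse pair receives a much smaller one, so minimizing the regularized loss pushes heads toward attending to different subsequences.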

Taxonomy

Topics: Topic Modeling · Speech Recognition and Synthesis · Advanced Text Analysis Techniques

Methods: Attention Is All You Need · Softmax · Linear Layer · Multi-Head Attention