Orthogonality Constrained Multi-Head Attention For Keyword Spotting
Mingu Lee, Jinkyu Lee, Hye Jin Jang, Byeonggeun Kim, Wonil Chang, and Kyuwoong Hwang

TL;DR
This paper introduces an orthogonality regularization for multi-head attention in neural keyword spotting, which enhances the diversity and consistency of attention heads, leading to improved detection accuracy.
Contribution
It proposes a novel regularization method that enforces orthogonality among attention heads, reducing redundancy and improving keyword spotting performance without explicit sequence models.
Findings
Significant accuracy improvement on 'Hey Snapdragon' keyword detection
Regularization reduces redundancy among attention heads
Enhanced feature consistency across keyword examples
Abstract
Multi-head attention mechanism is capable of learning various representations from sequential data while paying attention to different subsequences, e.g., word-pieces or syllables in a spoken word. From the subsequences, it retrieves richer information than a single-head attention which only summarizes the whole sequence into one context vector. However, a naive use of the multi-head attention does not guarantee such richness as the attention heads may have positional and representational redundancy. In this paper, we propose a regularization technique for multi-head attention mechanism in an end-to-end neural keyword spotting system. Augmenting regularization terms which penalize positional and contextual non-orthogonality between the attention heads encourages to output different representations from separate subsequences, which in turn enables leveraging structured information…
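The regularization idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's exact loss, which is not reproduced on this page. It assumes a simple penalty: the sum of squared off-diagonal entries of the Gram matrix between L2-normalized heads, applied once to the attention weights (positional term) and once to the context vectors (contextual term). The names `off_diagonal_penalty`, `multi_head_attention`, `orthogonality_regularizer`, `lam_pos`, and `lam_ctx` are hypothetical, as is the linear scoring used to produce the attention weights.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def off_diagonal_penalty(rows):
    # Assumed penalty form: L2-normalize each row, build the Gram matrix,
    # and sum the squared off-diagonal entries. It is zero exactly when
    # all rows are mutually orthogonal.
    norms = np.linalg.norm(rows, axis=1, keepdims=True)
    unit = rows / np.maximum(norms, 1e-12)
    gram = unit @ unit.T
    off = gram - np.diag(np.diag(gram))
    return float(np.sum(off ** 2))


def multi_head_attention(features, score_weights):
    # features: (T, D) frame-level features; score_weights: (H, D), one
    # hypothetical linear scorer per head.
    scores = score_weights @ features.T   # (H, T)
    attn = softmax(scores, axis=1)        # each head's weights over time
    contexts = attn @ features            # (H, D) one context vector per head
    return attn, contexts


def orthogonality_regularizer(attn, contexts, lam_pos=1.0, lam_ctx=1.0):
    # Positional term: heads should attend to different time regions.
    # Contextual term: heads should produce different representations.
    positional = off_diagonal_penalty(attn)
    contextual = off_diagonal_penalty(contexts)
    return lam_pos * positional + lam_ctx * contextual


rng = np.random.default_rng(0)
features = rng.standard_normal((50, 16))      # 50 frames, 16-dim features
score_weights = rng.standard_normal((4, 16))  # 4 attention heads
attn, contexts = multi_head_attention(features, score_weights)
reg = orthogonality_regularizer(attn, contexts)
```

In training, a term like `reg` would be added to the keyword classification loss with small weights, so that heads are pushed toward distinct subsequences without an explicit sequence model.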
Taxonomy
Topics: Topic Modeling · Speech Recognition and Synthesis · Advanced Text Analysis Techniques
Methods: Attention Is All You Need · Softmax · Linear Layer · Multi-Head Attention
