Explaining the Attention Mechanism of End-to-End Speech Recognition Using Decision Trees

Yuanchao Wang; Wenji Du; Chenghao Cai; Yanyan Xu

arXiv:2110.03879·cs.CL·October 11, 2021

TL;DR

This paper uses decision trees to analyze the attention mechanism in end-to-end speech recognition, revealing that attention is mainly influenced by previous states and struggles with long-term dependencies.

Contribution

It introduces a decision tree-based approach to interpret the attention mechanism, providing new insights into its behavior in speech recognition systems.

Findings

01 Attention is mainly influenced by previous states

02 Default attention favors closer states

03 Poor modeling of long-term dependencies

Abstract

The attention mechanism has largely improved the performance of end-to-end speech recognition systems. However, the underlying behaviour of attention is not yet clear. In this study, we use decision trees to explain how the attention mechanism impacts itself in speech recognition. The results indicate that attention levels are largely impacted by their previous states rather than by the encoder and decoder patterns. Additionally, the default attention mechanism seems to put more weight on closer states, but behaves poorly when modelling long-term dependencies of attention states.
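
The general idea of using a decision tree as an explainer can be illustrated with a small sketch. This is not the authors' pipeline: the data below is synthetic and is deliberately constructed so that the target depends mostly on the previous attention value, and the feature names (`prev_attention`, `encoder_feat`, `decoder_feat`), the tree depth, and the importance measure (total reduction in squared error per feature) are all assumptions made for illustration. The sketch fits a shallow regression tree and reports which input the tree relies on most.

```python
import random

def sse(ys):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def best_split(rows, targets):
    """Return (gain, feature_index, threshold) of the best single split."""
    parent = sse(targets)
    best = (0.0, None, None)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for r, y in zip(rows, targets) if r[f] <= thr]
            right = [y for r, y in zip(rows, targets) if r[f] > thr]
            gain = parent - sse(left) - sse(right)
            if gain > best[0]:
                best = (gain, f, thr)
    return best

def grow(rows, targets, depth, importance):
    """Recursively split, accumulating the error reduction per feature."""
    if depth == 0 or len(rows) < 4:
        return
    gain, f, thr = best_split(rows, targets)
    if f is None:
        return
    importance[f] += gain
    left = [(r, y) for r, y in zip(rows, targets) if r[f] <= thr]
    right = [(r, y) for r, y in zip(rows, targets) if r[f] > thr]
    for part in (left, right):
        grow([r for r, _ in part], [y for _, y in part], depth - 1, importance)

# Synthetic stand-in data (hypothetical, NOT from the paper): the target
# attention value is built to depend mainly on the previous attention value.
random.seed(0)
FEATURES = ["prev_attention", "encoder_feat", "decoder_feat"]
rows, targets = [], []
for _ in range(200):
    prev, enc, dec = random.random(), random.random(), random.random()
    rows.append([prev, enc, dec])
    targets.append(0.8 * prev + 0.1 * enc + 0.1 * dec + random.gauss(0, 0.02))

importance = [0.0] * len(FEATURES)
grow(rows, targets, depth=3, importance=importance)
total = sum(importance)
normalized = {name: imp / total for name, imp in zip(FEATURES, importance)}
top_feature = max(normalized, key=normalized.get)

for name, share in normalized.items():
    print(f"{name}: {share:.2f}")
```

Because the synthetic target is dominated by `prev_attention` by construction, the tree's splits concentrate on that feature. On real attention traces, the same kind of importance ranking is what would indicate whether previous attention states or encoder and decoder patterns drive the attention weights.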

Taxonomy

Topics: Speech Recognition and Synthesis · Topic Modeling · Neural Networks and Applications