Multi-granularity Interactive Attention Framework for Residual Hierarchical Pronunciation Assessment
Hong Han, Hao-Chen Pei, Zhao-Zheng Nie, Xin Luo, Xin-Shun Xu

TL;DR
This paper introduces a bidirectional hierarchical interactive attention framework for pronunciation assessment, improving the modeling of acoustic features across phoneme, word, and utterance levels to enhance performance.
Contribution
The proposed residual hierarchical interactive attention (HIA) model enables bidirectional feature interaction across granularities, addressing limitations of unidirectional dependencies in prior methods.
Findings
Outperforms existing state-of-the-art methods on the speechocean762 dataset
Effectively captures multi-granularity acoustic correlations
Enhances local contextual cue extraction with 1-D convolutional layers
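As a rough illustration of the last finding, a 1-D convolution along the time axis lets each phoneme-level feature vector absorb information from its immediate neighbours. The sketch below is not the paper's layer: the function name `conv1d_same`, the shared per-channel kernel, the zero padding, and the toy values are all assumptions made for illustration.

```python
def conv1d_same(seq, kernel):
    """Depthwise 1-D convolution (cross-correlation form) with zero padding.

    seq:    list of T feature vectors, each a list of D floats
    kernel: list of K weights (K odd), shared across all D channels
    Returns a list of T vectors, so the sequence length is preserved.
    """
    k = len(kernel)
    assert k % 2 == 1, "an odd kernel keeps the window centred on each step"
    half = k // 2
    num_steps, dim = len(seq), len(seq[0])
    out = []
    for t in range(num_steps):
        vec = [0.0] * dim
        for j, w in enumerate(kernel):
            src = t + j - half          # neighbour index covered by this tap
            if 0 <= src < num_steps:    # positions outside the sequence act as zeros
                for d in range(dim):
                    vec[d] += w * seq[src][d]
        out.append(vec)
    return out


# Hypothetical phoneme-level features: 4 time steps, 2 channels.
phonemes = [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0], [4.0, 3.0]]
smoothed = conv1d_same(phonemes, [0.25, 0.5, 0.25])
```

A trained model would learn the kernel weights and typically use separate kernels per channel pair; the fixed smoothing kernel here only shows how local context is mixed in.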
Abstract
Automatic pronunciation assessment plays a crucial role in computer-assisted pronunciation training systems. Due to the ability to perform multiple pronunciation tasks simultaneously, multi-aspect multi-granularity pronunciation assessment methods are gradually receiving more attention and achieving better performance than single-level modeling tasks. However, existing methods only consider unidirectional dependencies between adjacent granularity levels, lacking bidirectional interaction among phoneme, word, and utterance levels and thus insufficiently capturing the acoustic structural correlations. To address this issue, we propose a novel residual hierarchical interactive method, HIA for short, that enables bidirectional modeling across granularities. As the core of HIA, the Interactive Attention Module leverages an attention mechanism to achieve dynamic bidirectional interaction,…
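The abstract is truncated here, so the exact design of the Interactive Attention Module is not available. The following is a minimal sketch of the general idea it names, bidirectional attention between two granularity levels with a residual connection, using plain scaled dot-product attention. The names `cross_attend` and `bidirectional_interaction`, the absence of learned projections, and the toy inputs are assumptions, not the authors' implementation.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Each query attends over keys_values; the attended context is added
    back to the query (residual connection). No learned projections are
    used, which is a simplification for illustration."""
    dim = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in keys_values]
        weights = softmax(scores)
        context = [sum(w * kv[i] for w, kv in zip(weights, keys_values))
                   for i in range(dim)]
        out.append([qi + ci for qi, ci in zip(q, context)])
    return out


def bidirectional_interaction(fine, coarse):
    """Run attention in both directions between a finer level (e.g. phonemes)
    and a coarser level (e.g. words), both computed from the original inputs."""
    new_fine = cross_attend(fine, coarse)    # fine level reads from coarse level
    new_coarse = cross_attend(coarse, fine)  # coarse level reads from fine level
    return new_fine, new_coarse


# Hypothetical features: three phoneme vectors and one word vector.
phoneme_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
word_feats = [[0.5, 0.5]]
new_phonemes, new_words = bidirectional_interaction(phoneme_feats, word_feats)
```

Both directions preserve the sequence length of the querying level, so the same pattern could in principle be repeated between the word and utterance levels.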
Taxonomy
Topics: Speech Recognition and Synthesis · Phonetics and Phonology Research · Voice and Speech Disorders
