TL;DR
This paper explores how neural networks classify vowels in spectrograms, revealing the acoustic features they rely on, and compares these with linguistic knowledge to improve speech recognition interpretability.
Contribution
It introduces a method using class activation mapping to interpret neural network decisions in vowel classification, linking neural features to linguistic cues.
Findings
Neural networks focus on specific frequency patterns for vowel classification.
Identified acoustic cues align with linguistic knowledge of vowels.
Insights into the causes of misclassification offer opportunities to improve speech recognition models.
Abstract
This study investigates discriminative patterns learned by neural networks for accurate speech classification, with a specific focus on vowel classification tasks. By examining the activations and features of neural networks for vowel classification, we gain insights into what the networks "see" in spectrograms. Through the use of class activation mapping, we identify the frequencies that contribute to vowel classification and compare these findings with linguistic knowledge. Experiments on an American English dataset of vowels showcase the explainability of neural networks and provide valuable insights into the causes of misclassifications and their characteristics when differentiating them from unvoiced speech. This study not only enhances our understanding of the underlying acoustic cues in vowel classification but also offers opportunities for improving speech recognition by…
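To make the class activation mapping step concrete, here is a minimal sketch of the standard CAM computation applied to a spectrogram-shaped input. It assumes a network whose last convolutional layer feeds a global average pooling layer and then a linear classifier; the summary above does not state the paper's actual architecture or CAM variant, so the function names, toy feature maps, and weights below are hypothetical. Plain Python lists are used to keep the example dependency-free.

```python
def class_activation_map(feature_maps, class_weights):
    """Standard CAM for one class: weighted sum of the last conv layer's maps.

    feature_maps: K maps, each indexed as [freq_bin][time_frame]
    class_weights: K weights linking each average-pooled map to the class logit
    """
    n_freq = len(feature_maps[0])
    n_time = len(feature_maps[0][0])
    cam = [[0.0] * n_time for _ in range(n_freq)]
    for weight, fmap in zip(class_weights, feature_maps):
        for f in range(n_freq):
            for t in range(n_time):
                cam[f][t] += weight * fmap[f][t]
    return cam


def frequency_profile(cam):
    """Average the map over time: one relevance score per frequency bin."""
    return [sum(row) / len(row) for row in cam]


# Toy input (made-up numbers): 2 feature maps, 3 frequency bins, 2 time frames.
feature_maps = [
    [[1.0, 1.0], [0.0, 0.0], [0.0, 0.0]],  # map 0 responds to the lowest bin
    [[0.0, 0.0], [0.0, 0.0], [2.0, 4.0]],  # map 1 responds to the highest bin
]
class_weights = [0.5, 1.0]  # hypothetical weights for a single vowel class

cam = class_activation_map(feature_maps, class_weights)
profile = frequency_profile(cam)
# Index of the frequency bin that contributes most to this class.
top_bin = max(range(len(profile)), key=profile.__getitem__)
```

Averaging the map over time is one simple way to read off which frequency bins drive a class decision. Comparing the resulting peaks with known formant regions is the kind of check against linguistic knowledge the abstract describes, though the paper's exact procedure may differ.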
