CommonAccent: Exploring Large Acoustic Pretrained Models for Accent Classification Based on Common Voice
Juan Zuluaga-Gomez, Sara Ahmed, Danielius Visockas, Cem Subakan

TL;DR
This paper presents a new approach for accent classification using large acoustic pretrained models, achieving state-of-the-art accuracy and providing an open-source recipe for multilingual accent recognition.
Contribution
It introduces a simple recipe for accent classification with pretrained models, achieving new state-of-the-art results on English and exploring phonological clustering in embeddings.
Findings
Achieved up to 95% accuracy on English accent classification.
Established new state-of-the-art for multilingual accent classification.
Demonstrated phonological clustering in Wav2Vec 2.0 embeddings.
Abstract
Despite the recent advancements in Automatic Speech Recognition (ASR), the recognition of accented speech still remains a dominant problem. In order to create more inclusive ASR systems, research has shown that the integration of accent information, as part of a larger ASR framework, can lead to the mitigation of accented speech errors. We address multilingual accent classification through the ECAPA-TDNN and Wav2Vec 2.0/XLSR architectures which have been proven to perform well on a variety of speech-related downstream tasks. We introduce a simple-to-follow recipe aligned to the SpeechBrain toolkit for accent classification based on Common Voice 7.0 (English) and Common Voice 11.0 (Italian, German, and Spanish). Furthermore, we establish new state-of-the-art for English accent classification with as high as 95% accuracy. We also study the internal categorization of the Wav2Vec 2.0…
Code & Models
- 🤗 Jzuluaga/accent-id-commonaccent_ecapa (model) · 15k downloads · ♡ 17
- 🤗 Jzuluaga/accent-id-commonaccent_xlsr-es-spanish (model) · 16 downloads · ♡ 5
- 🤗 Jzuluaga/accent-id-commonaccent_xlsr-it-italian (model) · 1 download · ♡ 1
- 🤗 Jzuluaga/accent-id-commonaccent_xlsr-en-english (model) · 1.4k downloads · ♡ 17
- 🤗 Jzuluaga/accent-id-commonaccent_xlsr-de-german (model) · 6 downloads · ♡ 2
- 🤗 warisqr7/accent-id-commonaccent_xlsr-en-english (model) · 10 downloads
- 🤗 sinhprous/accent-french (model) · 2 downloads
- 🤗 bookbot/english-accent-classifier (model) · 9 downloads · ♡ 2
- 🤗 anasmohammed/speech-accent-classifier (model) · 1 download
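Because the recipe is aligned with the SpeechBrain toolkit, the ECAPA-TDNN checkpoint listed above can plausibly be loaded through SpeechBrain's generic `EncoderClassifier` interface. The following is a minimal sketch under that assumption, not the authors' documented usage: the helper names `load_accent_classifier` and `predict_accent` and the `savedir` layout are hypothetical. The XLSR checkpoints may require a different, custom interface, so check each model card before relying on this.

```python
from typing import Any


def load_accent_classifier(model_id: str = "Jzuluaga/accent-id-commonaccent_ecapa") -> Any:
    """Download (on first call) and load a SpeechBrain classifier from the Hub.

    Assumption: the checkpoint is compatible with EncoderClassifier.
    The import is done lazily so this module can be imported without
    SpeechBrain installed.
    """
    try:
        # Location of the class in SpeechBrain 1.0 and later.
        from speechbrain.inference.classifiers import EncoderClassifier
    except ImportError:
        # Location of the class in earlier SpeechBrain releases.
        from speechbrain.pretrained import EncoderClassifier

    # Hypothetical local cache directory derived from the model name.
    savedir = "pretrained_models/" + model_id.split("/")[-1]
    return EncoderClassifier.from_hparams(source=model_id, savedir=savedir)


def predict_accent(classifier: Any, wav_path: str) -> str:
    """Return the predicted accent label for a single audio file.

    classify_file returns (posteriors, best score, best index, text labels);
    only the text label of the single utterance is kept here.
    """
    _out_prob, _score, _index, text_lab = classifier.classify_file(wav_path)
    return text_lab[0]
```

With SpeechBrain installed and network access available, `predict_accent(load_accent_classifier(), "sample.wav")` would return the label the model assigns to the (hypothetical) file `sample.wav`. The set of label strings is defined by the checkpoint itself.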
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Phonetics and Phonology Research
