DuDe: Dual-Decoder Multilingual ASR for Indian Languages using Common Label Set
Arunkumar A, Mudit Batra, Umesh S

TL;DR
This paper introduces DuDe, a novel dual-decoder architecture for multilingual Indian language ASR that leverages a common label set and machine transliteration to improve recognition across diverse scripts and sounds.
Contribution
The paper proposes a new Encoder-Decoder-Decoder architecture that uses both the Common Label Set (CLS) and native scripts to improve multilingual ASR for Indian languages.
Findings
CLS-based models improve recognition accuracy
Dual-decoder architecture outperforms single-decoder models
Machine transliteration enhances multilingual system performance
Abstract
In a multilingual country like India, multilingual Automatic Speech Recognition (ASR) systems have much scope. Multilingual ASR systems exhibit many advantages like scalability, maintainability, and improved performance over the monolingual ASR systems. However, building multilingual systems for Indian languages is challenging since different languages use different scripts for writing. On the other hand, Indian languages share a lot of common sounds. Common Label Set (CLS) exploits this idea and maps graphemes of various languages with similar sounds to common labels. Since Indian languages are mostly phonetic, building a parser to convert from native script to CLS is easy. In this paper, we explore various approaches to build multilingual ASR models. We also propose a novel architecture called Encoder-Decoder-Decoder for building multilingual systems that use both CLS and native…
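The grapheme-to-common-label idea described in the abstract can be illustrated with a minimal Python sketch. The table `TOY_CLS`, the label names (`"k"`, `"m"`), and the function `to_common_labels` are hypothetical and chosen only for illustration; they are not the CLS or the parser used in the paper, and the sketch ignores vowel signs, conjuncts, and other script-specific rules that a real parser would need to handle.

```python
# Toy table mapping graphemes from different scripts to a shared label.
# ASSUMPTION: these entries and label names are illustrative only,
# not the Common Label Set defined by the paper's authors.
TOY_CLS = {
    "\u0915": "k",  # Devanagari letter KA
    "\u0B95": "k",  # Tamil letter KA
    "\u0C15": "k",  # Telugu letter KA
    "\u092E": "m",  # Devanagari letter MA
    "\u0BAE": "m",  # Tamil letter MA
    "\u0C2E": "m",  # Telugu letter MA
}


def to_common_labels(text, table=TOY_CLS, unk="<unk>"):
    """Map each non-space character to its common label.

    Characters missing from the table are replaced by `unk`.
    """
    return [table.get(ch, unk) for ch in text if not ch.isspace()]


if __name__ == "__main__":
    # The same two sounds written in Devanagari and in Tamil
    # produce the same label sequence.
    print(to_common_labels("\u0915\u092E"))
    print(to_common_labels("\u0B95\u0BAE"))
```

Because graphemes with similar sounds collapse to one label, a model trained on the label sequences sees a single shared output vocabulary across languages, which is the property the abstract attributes to CLS.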
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Speech and Audio Processing
