# MoLE: Mixture of Language Experts for Multi-Lingual Automatic Speech Recognition

**Authors:** Yoohwan Kwon, Soo-Whan Chung

arXiv: 2302.13750 · 2023-02-28

## TL;DR

This paper introduces MoLE, a multi-lingual speech recognition model that uses language-specific experts and a lightweight tokenizer to improve recognition accuracy across multiple languages, especially low-resource ones.

## Contribution

The paper proposes a novel Mixture-of-Language-Expert architecture with a language tokenizer that activates language-specific experts and estimates their reliability for enhanced multi-lingual speech recognition.

## Key findings

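- In a five-language evaluation, the proposed structure is reported to be advantageous for multi-lingual recognition.
- The benefit is reported to be most pronounced for speech in low-resource languages.
- The language tokenizer both activates a language-specific expert and estimates the reliability of that activation, which governs how the activated expert is aggregated with the language-agnostic expert.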

## Abstract

Multi-lingual speech recognition aims to distinguish linguistic expressions in different languages and integrate acoustic processing simultaneously. In contrast, current multi-lingual speech recognition research follows a language-aware paradigm, mainly targeting improved recognition performance rather than discrimination of language characteristics. In this paper, we present a multi-lingual speech recognition network named Mixture-of-Language-Expert (MoLE), which digests speech in a variety of languages. Specifically, MoLE analyzes linguistic expression from input speech in arbitrary languages, activating a language-specific expert with a lightweight language tokenizer. The tokenizer not only activates experts, but also estimates the reliability of the activation. Based on the reliability, the activated expert and the language-agnostic expert are aggregated to form a language-conditioned embedding for efficient speech recognition. Our proposed model is evaluated in a five-language scenario, and the experimental results show that our structure is advantageous for multi-lingual recognition, especially for speech in low-resource languages.
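
The routing and aggregation idea in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the summary does not give the layer types, the reliability formula, or the aggregation rule. The sketch assumes that the tokenizer is a single linear layer over mean-pooled frames, that reliability is the softmax probability of the selected language, and that aggregation is a convex combination of the two expert outputs. The names `ToyMoLE`, `softmax`, `matvec`, and `random_matrix` are hypothetical, and the weights are random rather than trained.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Random weights standing in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class ToyMoLE:
    """Hypothetical toy layer: one linear expert per language plus one
    language-agnostic expert, mixed by an assumed reliability score."""

    def __init__(self, num_languages, dim, seed=0):
        rng = random.Random(seed)
        # Assumption: the "lightweight language tokenizer" is one linear layer.
        self.tokenizer = random_matrix(num_languages, dim, rng)
        self.language_experts = [
            random_matrix(dim, dim, rng) for _ in range(num_languages)
        ]
        self.agnostic_expert = random_matrix(dim, dim, rng)

    def forward(self, frames):
        dim = len(frames[0])
        # Mean-pool the frames into one utterance-level vector.
        pooled = [sum(f[i] for f in frames) / len(frames) for i in range(dim)]
        probs = softmax(matvec(self.tokenizer, pooled))
        # Activate the expert of the most probable language.
        lang = max(range(len(probs)), key=probs.__getitem__)
        # Assumption: reliability is the probability of the chosen language.
        reliability = probs[lang]
        expert = self.language_experts[lang]
        mixed = []
        for frame in frames:
            specific = matvec(expert, frame)
            agnostic = matvec(self.agnostic_expert, frame)
            # Assumption: convex combination weighted by reliability.
            mixed.append([
                reliability * s + (1.0 - reliability) * a
                for s, a in zip(specific, agnostic)
            ])
        return lang, reliability, mixed


model = ToyMoLE(num_languages=5, dim=4)
frames = [
    [0.1, -0.2, 0.3, 0.0],
    [0.4, 0.1, -0.1, 0.2],
    [0.0, 0.3, 0.2, -0.4],
]
lang, reliability, mixed = model.forward(frames)
print(lang, round(reliability, 3), len(mixed), len(mixed[0]))
```

Under these assumptions, a low reliability score pushes the output toward the language-agnostic expert, so an uncertain language decision has less influence on the embedding. Whether the paper uses hard selection, soft weighting over all experts, or a learned gate is not stated in this summary.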

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.13750/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/2302.13750/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/2302.13750/full.md

---
Source: https://tomesphere.com/paper/2302.13750