# Phoneme Level Language Models for Sequence Based Low Resource ASR

**Authors:** Siddharth Dalmia, Xinjian Li, Alan W Black, Florian Metze

arXiv: 1902.07613 · 2019-02-21

## TL;DR

This paper introduces a phoneme-level multilingual language model that performs almost as well as monolingual models with six times fewer parameters, adapts better to unseen languages in a low-resource scenario, and matches or outperforms WFST-based decoding of CTC acoustic model outputs.

## Contribution

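The paper presents a phoneme-level language model for multilingual and crosslingual low-resource ASR. The model can decode CTC acoustic model outputs with word error rates comparable to WFST-based decoding on Babel languages, and it outperforms WFST decoding when adapting to a new language or when training and test domains differ.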

## Key findings

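- Performs almost as well as monolingual models while using six times fewer parameters.
- Reaches word error rates comparable to WFST-based decoding when decoding CTC acoustic model outputs on Babel languages.
- Outperforms WFST decoding in low-resource conditions such as adaptation to a new language and domain mismatch between training and test data.
- Adapts better to languages not seen during training in a low-resource scenario.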

## Abstract

Building multilingual and crosslingual models helps bring different languages together in a language-universal space. It allows models to share parameters and transfer knowledge across languages, enabling faster and better adaptation to a new language. These approaches are particularly useful for low-resource languages. In this paper, we propose a phoneme-level language model that can be used multilingually and for crosslingual adaptation to a target language. We show that our model performs almost as well as the monolingual models while using six times fewer parameters, and is capable of better adaptation to languages not seen during training in a low-resource scenario. We show that these phoneme-level language models can be used to decode sequence-based Connectionist Temporal Classification (CTC) acoustic model outputs to obtain word error rates comparable to Weighted Finite State Transducer (WFST) based decoding in Babel languages. We also show that these phoneme-level language models outperform WFST decoding in various low-resource conditions, such as adapting to a new language and domain mismatch between training and testing data.
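
To make the idea of decoding CTC outputs with a phoneme-level language model more concrete, the following is a minimal sketch, not the authors' method. It assumes a toy add-one-smoothed bigram model as a stand-in for the paper's phoneme-level language model, and it assumes simple N-best rescoring as the way acoustic and language model scores are combined; the summary does not specify either. All names (`ctc_greedy_collapse`, `BigramPhonemeLM`, `rescore`, `lm_weight`) are hypothetical.

```python
import math
from collections import Counter

BLANK = "<b>"  # hypothetical symbol for the CTC blank label


def ctc_greedy_collapse(frame_labels):
    """Apply the standard CTC collapsing rule: merge repeats, then drop blanks."""
    out = []
    prev = None
    for label in frame_labels:
        if label != prev and label != BLANK:
            out.append(label)
        prev = label
    return out


class BigramPhonemeLM:
    """Toy add-one-smoothed bigram LM over phonemes (illustrative stand-in only)."""

    def __init__(self, sequences):
        self.context_counts = Counter()  # how often each symbol appears as a history
        self.bigram_counts = Counter()
        self.vocab = {"</s>"}  # symbols that can be predicted
        for seq in sequences:
            padded = ["<s>"] + list(seq) + ["</s>"]
            self.vocab.update(seq)
            for a, b in zip(padded, padded[1:]):
                self.context_counts[a] += 1
                self.bigram_counts[(a, b)] += 1

    def log_prob(self, seq):
        """Return the log-probability of a phoneme sequence, including end-of-sequence."""
        padded = ["<s>"] + list(seq) + ["</s>"]
        v = len(self.vocab)
        return sum(
            math.log((self.bigram_counts[(a, b)] + 1) / (self.context_counts[a] + v))
            for a, b in zip(padded, padded[1:])
        )


def rescore(candidates, lm, lm_weight=0.5):
    """Pick the (phonemes, acoustic_log_score) pair maximizing the combined score.

    Assumption: a log-linear combination, acoustic + lm_weight * LM log-prob.
    """
    return max(candidates, key=lambda c: c[1] + lm_weight * lm.log_prob(c[0]))


# Usage: frame-level labels collapse to a phoneme string, then the LM reranks hypotheses.
frames = ["k", "k", BLANK, "ae", "ae", BLANK, BLANK, "t"]
hypothesis = ctc_greedy_collapse(frames)

lm = BigramPhonemeLM([["k", "ae", "t"], ["k", "ae", "t"], ["b", "ae", "t"]])
nbest = [(["t", "ae", "k"], -1.0), (["k", "ae", "t"], -1.5)]
best_phonemes, _ = rescore(nbest, lm, lm_weight=0.5)
```

In this toy setup the language model overturns a slightly better acoustic score for an implausible phoneme order. Because the language model operates on phonemes rather than words, the same scoring step could in principle be reused for a new language without a word-level lexicon, which is the kind of low-resource setting the abstract targets.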

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.07613/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1902.07613/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/1902.07613/full.md

---
Source: https://tomesphere.com/paper/1902.07613