# Challenging the Boundaries of Speech Recognition: The MALACH Corpus

**Authors:** Michael Picheny, Zoltán Tüske, Brian Kingsbury, Kartik Audhkhasi, Xiaodong Cui, George Saon

arXiv: 1908.03455 · 2019-08-12

## TL;DR

This paper presents the MALACH corpus, a challenging 375-hour dataset of Holocaust testimonies containing unconstrained, natural speech, and proposes it as a benchmark for building speech recognition systems that are more robust to accents, disfluencies, and emotional speech.

## Contribution

The paper presents the MALACH corpus, baseline results with current deep learning methods, and resources to facilitate research on robust speech recognition for complex, real-world speech data.

## Key findings

- Baseline results on the MALACH corpus using current deep learning methods.
- A lexicon and training and testing setups released for community use.
- A case for more robust speech recognition systems, since accents, disfluencies, and emotional speech remain open problems.

## Abstract

There has been huge progress in speech recognition over the last several years. Tasks once thought extremely difficult, such as SWITCHBOARD, now approach levels of human performance. The MALACH corpus (LDC catalog LDC2012S05), a 375-hour subset of a large archive of Holocaust testimonies collected by the Survivors of the Shoah Visual History Foundation, presents significant challenges to the speech community. The collection consists of unconstrained, natural speech filled with disfluencies, heavy accents, age-related coarticulations, un-cued speaker and language switching, and emotional speech - all still open problems for speech recognition systems. Transcription is challenging even for skilled human annotators. This paper proposes that the community place focus on the MALACH corpus to develop speech recognition systems that are more robust with respect to accents, disfluencies and emotional speech. To reduce the barrier for entry, a lexicon and training and testing setups have been created and baseline results using current deep learning technologies are presented. The metadata has just been released by LDC (LDC2019S11). It is hoped that this resource will enable the community to build on top of these baselines so that the extremely important information in these and related oral histories becomes accessible to a wider audience.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.03455/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1908.03455/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/1908.03455/full.md

---
Source: https://tomesphere.com/paper/1908.03455