# M2H-GAN: A GAN-based Mapping from Machine to Human Transcripts for Speech Understanding

**Authors:** Titouan Parcollet, Mohamed Morchid, Xavier Bost, Georges Linarès

arXiv: 1905.01957 · 2019-05-07

## TL;DR

This paper introduces M2H-GAN, a novel GAN-based method that generates human-like transcripts from machine transcriptions to enhance speech understanding accuracy in telephone conversations.

## Contribution

It presents a new GAN architecture that distills human transcript knowledge into automatic transcriptions, improving classification performance in spoken language understanding tasks.

## Key findings

- Improved theme identification accuracy using generated transcripts.
- Effective distillation of human transcript features into machine transcriptions.
- Demonstrated benefits in real-world telephone conversation analysis.

## Abstract

Deep learning is at the core of recent spoken language understanding (SLU) tasks. More precisely, deep neural networks (DNNs) have drastically increased the performance of SLU systems, and numerous architectures have been proposed. In the real-life context of theme identification of telephone conversations, it is common to hold both a manual human transcription (TRS) and an automatic transcription (ASR) of each conversation. Nonetheless, due to production constraints, only the ASR transcripts are considered when building automatic classifiers; TRS transcripts are only used to measure the performance of ASR systems. Moreover, the classification accuracy recently obtained by DNN-based systems is close to that reached by humans, and it becomes difficult to improve performance further by considering the ASR transcripts alone. This paper proposes to distill the TRS knowledge available during the training phase into the ASR representation, using a new generative adversarial network called M2H-GAN to generate a TRS-like version of an ASR document and thereby improve theme identification performance.
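
The adversarial mapping described in the abstract can be sketched with the standard GAN objective. This is only an illustration under stated assumptions: the symbols $G$ (generator), $D$ (discriminator), $x_{\mathrm{ASR}}$ and $x_{\mathrm{TRS}}$ (document representations of the automatic and manual transcripts) are notation introduced here, not taken from the paper, and the loss actually used by M2H-GAN may differ.

```latex
% Assumed notation (not from the paper):
%   x_ASR : representation of an automatically transcribed document
%   x_TRS : representation of the matching manual transcript
%   G     : generator mapping an ASR representation to a TRS-like one
%   D     : discriminator scoring how TRS-like a representation is
\min_{G} \max_{D} \;
  \mathbb{E}_{x_{\mathrm{TRS}}}\bigl[\log D(x_{\mathrm{TRS}})\bigr]
  + \mathbb{E}_{x_{\mathrm{ASR}}}\bigl[\log\bigl(1 - D(G(x_{\mathrm{ASR}}))\bigr)\bigr]
```

Under this reading, TRS documents would be needed only during training. At test time, an ASR document would presumably be passed through $G$ alone, and the resulting TRS-like representation $G(x_{\mathrm{ASR}})$ would be given to the theme classifier.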

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.01957/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1905.01957/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/1905.01957/full.md

---
Source: https://tomesphere.com/paper/1905.01957