Deep Transformer based Data Augmentation with Subword Units for Morphologically Rich Online ASR

Balázs Tarján; György Szaszák; Tibor Fegyó; Péter Mihajlik

arXiv:2007.06949 · eess.AS · November 5, 2020 · 1 citation

Open Access

TL;DR

This paper introduces subword-based neural text augmentation for morphologically rich languages in online ASR: text generated by a fine-tuned GPT-2 Transformer language model is retokenized into subword units, which improves WER and reduces vocabulary size.

Contribution

It proposes a novel subword-based neural text augmentation technique that enhances language modeling for morphologically rich languages in online ASR systems.

Findings

01. Subword augmentation significantly reduces vocabulary size.

02. Subword augmentation improves WER and OOV word recognition.

03. Both Morfessor and BPE subword methods are effective.

Abstract

Recently Deep Transformer models have proven to be particularly powerful in language modeling tasks for ASR. Their high complexity, however, makes them very difficult to apply in the first (single) pass of an online system. Recent studies showed that a considerable part of the knowledge of neural network Language Models (LM) can be transferred to traditional n-grams by using neural text generation based data augmentation. In our paper, we pre-train a GPT-2 Transformer LM on a general text corpus and fine-tune it on our Hungarian conversational call center ASR task. We show that although data augmentation with Transformer-generated text works well for isolating languages, it causes a vocabulary explosion in a morphologically rich language. Therefore, we propose a new method called subword-based neural text augmentation, where we retokenize the generated text into statistically derived…
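The vocabulary effect described in the abstract can be illustrated with a toy sketch. It is not the authors' pipeline: the three "generated" sentences are hand-written stand-ins for Transformer output, the subword inventory `SUBWORDS` is hypothetical (a real system would learn it with Morfessor or BPE), the greedy longest-match segmenter is a simple substitute for those tools, and the `+` marker on non-final pieces is an assumed convention.

```python
from collections import Counter

# Hypothetical subword inventory (assumption); a real system would learn
# it from data with a tool such as Morfessor or BPE.
SUBWORDS = {"ház", "bolt", "park", "ban", "ból", "hoz"}

def segment(word, inventory):
    """Greedy longest-match segmentation with single-character fallback."""
    pieces, i = [], 0
    while i < len(word):
        # Try the longest candidate first; a single character always matches.
        for j in range(len(word), i, -1):
            if word[i:j] in inventory or j == i + 1:
                pieces.append(word[i:j])
                i = j
                break
    return pieces

def retokenize(sentence, inventory):
    """Split each word into subwords; non-final pieces get a '+' marker."""
    tokens = []
    for word in sentence.split():
        pieces = segment(word, inventory)
        tokens.extend(p + "+" for p in pieces[:-1])
        tokens.append(pieces[-1])
    return tokens

def ngram_counts(tokens, n):
    """Count n-grams in one token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Hand-written stand-ins for neural-LM-generated text (assumption).
generated = [
    "házban boltban parkban",
    "házból boltból parkból",
    "házhoz bolthoz parkhoz",
]

word_vocab = {w for s in generated for w in s.split()}
subword_vocab = {t for s in generated for t in retokenize(s, SUBWORDS)}

# Bigram counts over the retokenized text, as input to n-gram LM training.
bigrams = Counter()
for s in generated:
    bigrams.update(ngram_counts(retokenize(s, SUBWORDS), 2))

print(len(word_vocab), len(subword_vocab))
```

In this toy corpus every stem combines with every suffix, so the word vocabulary grows with the product of stems and suffixes while the subword vocabulary grows only with their sum. That is the intuition behind retokenizing generated text before estimating the n-gram model.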

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Cosine Annealing · Linear Warmup With Cosine Annealing · Attention Dropout · Discriminative Fine-Tuning · Residual Connection · Multi-Head Attention