Dialectal Speech Recognition and Translation of Swiss German Speech to Standard German Text: Microsoft's Submission to SwissText 2021

Yuriy Arabskyy; Aashish Agarwal; Subhadeep Dey; Oscar Koller

arXiv:2106.08126 · eess.AS · July 2, 2021 · 5 cites

TL;DR

This paper presents Microsoft's winning approach to Swiss German speech recognition and translation. It combines a hybrid ASR system, a transfer-learned acoustic model, and neural language models to convert Swiss German dialect speech into standard German text, reaching 46.04% BLEU on a blind conversational test set.

Contribution

The paper introduces a hybrid ASR system for Swiss German dialects built from four components: a lexicon that incorporates translations, a first-pass language model that handles Swiss German particularities, a transfer-learned acoustic model, and a neural language model for second-pass rescoring.
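This page does not say how the second-pass rescoring is implemented, so the following is only a generic sketch of n-best rescoring, not the authors' method. It assumes the first pass produces a short list of candidate transcripts with log-scores, and that a language model score is added with a tunable weight. The names `Hypothesis`, `rescore_nbest`, `toy_lm_log_prob`, and `lm_weight` are all hypothetical, and the toy scoring function merely stands in for a real neural language model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    text: str                # candidate standard German transcript
    first_pass_score: float  # log-score from the first pass (assumed given)


def rescore_nbest(
    nbest: List[Hypothesis],
    lm_log_prob: Callable[[str], float],
    lm_weight: float = 0.5,
) -> List[Hypothesis]:
    """Re-rank hypotheses by first-pass score plus a weighted LM log-score."""
    def combined(h: Hypothesis) -> float:
        return h.first_pass_score + lm_weight * lm_log_prob(h.text)
    return sorted(nbest, key=combined, reverse=True)


def toy_lm_log_prob(text: str) -> float:
    # Stand-in for a neural LM (assumption): each word outside a tiny
    # standard German vocabulary costs a fixed log-penalty.
    known = {"ich", "gehe", "nach", "hause"}
    return sum(0.0 if w in known else -5.0 for w in text.lower().split())


# Two made-up candidates; the first contains a dialectal word form.
nbest = [
    Hypothesis("ich gang nach hause", -10.0),
    Hypothesis("ich gehe nach hause", -10.5),
]

best = rescore_nbest(nbest, toy_lm_log_prob, lm_weight=1.0)[0]
print(best.text)
```

With `lm_weight=0.0` the original first-pass ranking is kept, so the weight controls how strongly the second pass can override the first.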

Findings

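01 Reached 46.04% BLEU on a blind conversational test set

02 Outperformed the second-place submission by a 12% relative margin

03 Handles Swiss German particularities through a lexicon that incorporates translations and a dedicated first-pass language model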

Abstract

This paper describes the winning approach in the Shared Task 3 at SwissText 2021 on Swiss German Speech to Standard German Text, a public competition on dialect recognition and translation. Swiss German refers to the multitude of Alemannic dialects spoken in the German-speaking parts of Switzerland. Swiss German differs significantly from standard German in pronunciation, word inventory and grammar. It is mostly incomprehensible to native German speakers. Moreover, it lacks a standardized written script. To solve the challenging task, we propose a hybrid automatic speech recognition system with a lexicon that incorporates translations, a 1st pass language model that deals with Swiss German particularities, a transfer-learned acoustic model and a strong neural language model for 2nd pass rescoring. Our submission reaches 46.04% BLEU on a blind conversational test set and outperforms the…

Taxonomy

Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Speech and dialogue systems