The NiuTrans End-to-End Speech Translation System for IWSLT 2021 Offline Task
Chen Xu, Xiaoqian Liu, Xiaowen Liu, Laohu Wang, Canan Huang, Tong Xiao, Jingbo Zhu

TL;DR
This paper presents the NiuTrans end-to-end speech translation system for the IWSLT 2021 offline task. It uses Transformer-based models enhanced with Conformer and relative position encoding, together with data augmentation, to translate English speech directly into German text, reaching 33.84 BLEU on the MuST-C En-De test set.
Contribution
The paper describes an end-to-end speech translation system, submitted to the IWSLT 2021 offline task, that combines an enhanced Transformer architecture with ensemble decoding over models trained on different datasets.
Findings
The system achieves 33.84 BLEU on the MuST-C En-De test set.
The model architecture is enhanced with Conformer, relative position encoding, and stacked acoustic and textual encoding.
The training data is augmented by translating the English transcriptions into German.
Abstract
This paper describes the submission of the NiuTrans end-to-end speech translation system for the IWSLT 2021 offline task, which translates English audio directly into German text without intermediate transcription. We use a Transformer-based model architecture and enhance it with Conformer, relative position encoding, and stacked acoustic and textual encoding. To augment the training data, we translate the English transcriptions into German. Finally, we employ ensemble decoding to integrate the predictions from several models trained on different datasets. Combining these techniques, we achieve 33.84 BLEU points on the MuST-C En-De test set, which shows the enormous potential of the end-to-end model.
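The ensemble decoding step mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how the predictions are combined, so this example assumes the common approach of averaging the per-step probability distributions of the models. It also uses greedy search instead of beam search to stay short. The names `make_toy_model`, `ensemble_log_probs`, and `greedy_ensemble_decode` are hypothetical, and the toy models stand in for real speech translation models.

```python
import math
from typing import Callable, List, Sequence

# Assumption: a "model" is any callable that maps the target prefix generated
# so far to log-probabilities over the vocabulary for the next token.
StepFn = Callable[[Sequence[int]], List[float]]


def make_toy_model(table: List[List[float]]) -> StepFn:
    """Hypothetical stand-in model: row i of `table` is the distribution at step i."""
    def step(prefix: Sequence[int]) -> List[float]:
        row = table[min(len(prefix), len(table) - 1)]
        return [math.log(p) for p in row]
    return step


def ensemble_log_probs(models: Sequence[StepFn], prefix: Sequence[int]) -> List[float]:
    """Average the models' next-token probabilities and return them as log-probs."""
    per_model = [model(prefix) for model in models]
    vocab_size = len(per_model[0])
    averaged = []
    for v in range(vocab_size):
        p = sum(math.exp(log_probs[v]) for log_probs in per_model) / len(per_model)
        averaged.append(math.log(p) if p > 0 else float("-inf"))
    return averaged


def greedy_ensemble_decode(models: Sequence[StepFn], eos_id: int, max_len: int = 10) -> List[int]:
    """Greedy decoding over the averaged distribution (a simplification of beam search)."""
    prefix: List[int] = []
    for _ in range(max_len):
        scores = ensemble_log_probs(models, prefix)
        next_id = max(range(len(scores)), key=scores.__getitem__)
        prefix.append(next_id)
        if next_id == eos_id:
            break
    return prefix


# Toy vocabulary of three tokens; token 0 plays the role of end-of-sentence.
model_a = make_toy_model([[0.1, 0.7, 0.2], [0.6, 0.2, 0.2]])
model_b = make_toy_model([[0.1, 0.3, 0.6], [0.8, 0.1, 0.1]])
hypothesis = greedy_ensemble_decode([model_a, model_b], eos_id=0)
```

In this toy setup, model_b alone would pick token 2 at the first step, but the averaged distribution favors token 1, so the ensemble follows the choice that the two models jointly find most likely.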
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Music and Audio Processing
