ON-TRAC Consortium End-to-End Speech Translation Systems for the IWSLT 2019 Shared Task
Ha Nguyen, Natalia Tomashenko, Marcely Zanon Boito, Antoine Caubriere, Fethi Bougares, Mickael Rouvier, Laurent Besacier, and Yannick Esteve

TL;DR
This paper presents the ON-TRAC Consortium's end-to-end neural speech translation systems for English-to-Portuguese, analyzing various training and input segmentation strategies, and comparing performance to traditional pipeline methods.
Contribution
The paper introduces a unified end-to-end speech translation model and evaluates the effects of corpus pooling, tokenization, and segmentation on translation quality.
Findings
The best end-to-end model achieved BLEU scores of 26.91 (MuST-C) and 43.82 (How2) on the validation sets.
Pooling heterogeneous corpora impacts translation performance.
Comparison shows end-to-end approach is competitive with pipeline methods.
Abstract
This paper describes the ON-TRAC Consortium translation systems developed for the end-to-end model task of IWSLT Evaluation 2019 for the English-to-Portuguese language pair. The ON-TRAC Consortium is composed of researchers from three French academic laboratories: LIA (Avignon Université), LIG (Université Grenoble Alpes), and LIUM (Le Mans Université). A single end-to-end model, built as a neural encoder-decoder architecture with an attention mechanism, was used for the two primary submissions corresponding to the two EN-PT evaluation sets: (1) TED (MuST-C) and (2) How2. In this paper, we notably investigate the impact of pooling heterogeneous corpora for training, the impact of target tokenization (characters or BPEs), and the impact of speech input segmentation. We also compare our best end-to-end model (BLEU of 26.91 on the MuST-C and 43.82 on the How2 validation sets) to a pipeline (ASR+MT) approach.
Taxonomy
Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Text Readability and Simplification
