SYSTRAN's Pure Neural Machine Translation Systems
Josep Crego, Jungi Kim, Guillaume Klein, Anabel Rebollo, Kathy Yang, Jean Senellart, Egor Akhanov, Patrice Brunelle, Aurelien Coquard, Yongchao Deng, Satoshi Enoue, Chiyo Geiss, Joshua Johanson, Ardas Khalsa, Raoum Khiari, Byeongil Ko, Catherine Kobus, Jean Lorieux

TL;DR
This paper details SYSTRAN's development of production-ready neural machine translation systems across multiple languages, emphasizing practical choices, evaluation, and collaborative efforts to advance industry adoption.
Contribution
It presents a comprehensive approach to building and deploying large-scale NMT systems, including architecture, data handling, tuning, and evaluation, for real-world production use.
Findings
Successful deployment of NMT systems for 12 languages and 32 language pairs.
Evaluation methodology and initial performance insights shared.
Framework and practices aimed at industry adoption and further research.
Abstract
Since the first online demonstration of Neural Machine Translation (NMT) by LISA, NMT development has recently moved from laboratory to production systems, as demonstrated by several entities announcing the roll-out of NMT engines to replace their existing technologies. NMT systems have a large number of training configurations, and training such systems usually takes a long time, often a few weeks, so experimentation plays a critical role and its results are important to share. In this work, we present our approach to production-ready systems, released simultaneously with online demonstrators covering a large variety of languages (12 languages, for 32 language pairs). We explore different practical choices: an efficient and evolutive open-source framework; data preparation; network architecture; additional implemented features; tuning for production; etc. We discuss evaluation methodology,…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Software Engineering Research
