Approches quantitatives de l'analyse des prédictions en traduction automatique neuronale (TAN) [Quantitative approaches to the analysis of predictions in neural machine translation (NMT)]
Maria Zimina-Poirot (CLILLAC-ARP), Nicolas Ballier (CLILLAC-ARP), Jean-Baptiste Yunès (IRIF)

TL;DR
This paper investigates the training phases of neural machine translation models, revealing non-linear progressions and emphasizing the importance of chronological phenomena in understanding translation quality development.
Contribution
It introduces quantitative methods to analyze training phases in neural machine translation, highlighting the significance of chronological progression in model development.
Findings
Training progression is not always linear.
Chronological phenomena significantly influence translation quality.
Quantitative analysis reveals distinct phases in NMT training.
Abstract
As part of a larger project on optimal learning conditions in neural machine translation (NMT), we investigate the characteristic training phases of translation engines. All experiments are carried out with OpenNMT-py: the pre-processing step uses the Europarl training corpus, and the INTERSECT corpus is used for validation. Longitudinal analyses of the training phases suggest that the progression of translations is not always linear. Building on the results of textometric explorations, we highlight the importance of phenomena related to chronological progression in order to map the different processes at work in NMT.
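The observation that progression is not always linear can be illustrated with a minimal sketch. It assumes that each saved checkpoint has been scored on the validation set with some quality metric. The function name `find_regressions`, the `tolerance` parameter, and the scores below are hypothetical placeholders, not values or methods from the paper.

```python
from typing import List, Tuple


def find_regressions(
    scores: List[Tuple[int, float]], tolerance: float = 0.0
) -> List[Tuple[int, int, float]]:
    """Return (previous_step, step, drop) for every checkpoint whose score
    is lower than that of the preceding checkpoint by more than `tolerance`."""
    ordered = sorted(scores)  # order checkpoints chronologically by training step
    regressions = []
    for (prev_step, prev_score), (step, score) in zip(ordered, ordered[1:]):
        drop = prev_score - score
        if drop > tolerance:
            regressions.append((prev_step, step, drop))
    return regressions


# Made-up (training_step, score) pairs, for illustration only.
checkpoint_scores = [
    (1000, 10.0),
    (2000, 18.0),
    (3000, 17.0),
    (4000, 22.0),
    (5000, 21.5),
]

regressions = find_regressions(checkpoint_scores)
for prev_step, step, drop in regressions:
    print(f"score dropped by {drop} between steps {prev_step} and {step}")
```

A non-empty result means the score did not improve monotonically over training, which is the kind of chronological pattern a longitudinal analysis would then inspect more closely.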
Taxonomy
Topics: Natural Language Processing Techniques
