Growing Together: Modeling Human Language Learning With n-Best Multi-Checkpoint Machine Translation

El Moatez Billah Nagoudi; Muhammad Abdul-Mageed; Hasan Cavusoglu

arXiv:2006.04050·cs.CL·June 9, 2020


TL;DR

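This paper describes a submission to the 2020 Duolingo STAPLE shared task that ensembles several training checkpoints of a single machine translation model, treating each checkpoint as a human learner at a different proficiency level, and pools their n-best outputs to produce translations of varying fluency.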

Contribution

It introduces a multi-checkpoint ensemble method that samples n-best translations (n=10, beam width 100) from checkpoints saved at different training stages of the same model, so that the combined output covers translations at various levels of fluency.

Findings

01

Achieved 37.57 macro F1 on the official English-to-Portuguese shared task test data with a 6-checkpoint ensemble

02

Outperformed the baseline Amazon translation system, which scored 21.30 macro F1

03

Demonstrated the effectiveness of the multi-checkpoint ensemble approach
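
For readers unfamiliar with the metric behind these figures, the following is a minimal sketch of a macro-averaged F1 over translation prompts. It is a simplified, unweighted version and an assumption on our part: the shared task's official scorer may differ (for example, by weighting accepted translations), and the names `prompt_f1` and `macro_f1` are hypothetical.

```python
from typing import Dict, Set


def prompt_f1(predicted: Set[str], accepted: Set[str]) -> float:
    """F1 between the predicted and accepted translation sets of one prompt."""
    true_positives = len(predicted & accepted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(accepted)
    return 2 * precision * recall / (precision + recall)


def macro_f1(
    predictions: Dict[str, Set[str]],
    references: Dict[str, Set[str]],
) -> float:
    """Average the per-prompt F1 over every prompt in the references.

    Prompts with no predictions count as F1 = 0 (an assumption of this sketch).
    """
    if not references:
        return 0.0
    scores = [
        prompt_f1(predictions.get(prompt_id, set()), accepted)
        for prompt_id, accepted in references.items()
    ]
    return sum(scores) / len(scores)
```

Because the score is averaged per prompt, returning many candidates raises recall but lowers precision, which is why the number of pooled translations matters.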

Abstract

We describe our submission to the 2020 Duolingo Shared Task on Simultaneous Translation And Paraphrase for Language Education (STAPLE) (Mayhew et al., 2020). We view MT models at various training stages (i.e., checkpoints) as human learners at different levels. Hence, we employ an ensemble of multiple checkpoints from the same model to generate translation sequences with various levels of fluency. From each checkpoint, for our best model, we sample n-Best sequences (n=10) with a beam width of 100. We achieve 37.57 macro F1 with a 6-checkpoint model ensemble on the official English to Portuguese shared task test data, outperforming a baseline Amazon translation system of 21.30 macro F1 and ultimately demonstrating the utility of our intuitive method.
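
The pooling step described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: each checkpoint is modeled as a hypothetical callable that returns its n-best translations for a source sentence, and the ensemble is taken to be the deduplicated union of those lists. The abstract does not say how the outputs are combined or normalized, so both choices here are assumptions.

```python
from typing import Callable, Iterable, List, Sequence

# Hypothetical interface: a checkpoint is any callable that maps a source
# sentence and n to its n-best translations, best first.
NBestFn = Callable[[str, int], Sequence[str]]


def pool_n_best(
    source: str,
    checkpoints: Iterable[NBestFn],
    n: int = 10,
) -> List[str]:
    """Union the n-best lists of several checkpoints, keeping first-seen order."""
    seen = set()
    pooled: List[str] = []
    for translate in checkpoints:
        for hypothesis in list(translate(source, n))[:n]:
            # Assumed normalization: collapse runs of whitespace only.
            normalized = " ".join(hypothesis.split())
            if normalized and normalized not in seen:
                seen.add(normalized)
                pooled.append(normalized)
    return pooled
```

With six checkpoints and n=10, this yields at most 60 candidate translations per prompt, fewer when checkpoints agree. In practice each callable would wrap a decoder running beam search (beam width 100 in the paper) over one saved checkpoint.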
