Low-Resource Machine Translation through the Lens of Personalized Federated Learning

Viktor Moskvoretskii; Nazarii Tupitsa; Chris Biemann; Samuel Horváth; Eduard Gorbunov; Irina Nikishina

arXiv:2406.12564 · cs.CL · December 23, 2024

TL;DR

This paper introduces MeritOpt, a personalized federated learning method for low-resource machine translation that effectively handles heterogeneous data, is interpretable, and demonstrates promising results on Southeast Asian and Finno-Ugric languages.

Contribution

The paper proposes MeritOpt, a novel personalized federated learning approach tailored for low-resource translation tasks, with enhanced interpretability and minimal interference from unrelated languages.

Findings

01. Target dataset size affects how weight is distributed across auxiliary languages.

02. Unrelated languages do not interfere with training.

03. Auxiliary optimizer parameters have minimal impact.

Abstract

We present MeritOpt, a new approach based on the Personalized Federated Learning algorithm MeritFed that can be applied to natural language tasks with heterogeneous data. We evaluate it on the low-resource machine translation task, using datasets of Southeast Asian and Finno-Ugric languages. In addition to being effective, MeritOpt is highly interpretable, since it can be used to track the impact of each language used for training. Our analysis reveals that target dataset size affects the weight distribution across auxiliary languages, that unrelated languages do not interfere with training, and that auxiliary optimizer parameters have minimal impact. Our approach can be applied with a few lines of code, and we provide scripts for reproducing the experiments at https://github.com/VityaVitalich/MeritOpt.
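
The abstract mentions per-language weights but does not spell out how they are updated. The following is only a minimal sketch of the general idea, not the authors' implementation: it assumes toy quadratic losses in place of real translation losses, three hypothetical "languages" (`target`, `related`, `unrelated`), and an exponentiated-gradient update that keeps the aggregation weights on the probability simplex while lowering the target loss. All names, step sizes, and data here are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy setup: each "language" i has a quadratic loss
# 0.5 * ||x - centers[i]||^2 standing in for its translation loss.
LANGS = ["target", "related", "unrelated"]
centers = np.array([[1.0, 1.0],    # target language optimum
                    [1.2, 1.1],    # related auxiliary language, nearby optimum
                    [4.0, -3.0]])  # unrelated auxiliary language, distant optimum


def target_grad(x):
    """Gradient of the (toy) target-language loss at x."""
    return x - centers[0]


def update_weights(x, grads, w, lr, md_steps=20, md_lr=1.0):
    """Exponentiated-gradient steps on the simplex for the aggregation weights."""
    for _ in range(md_steps):
        x_trial = x - lr * grads.T @ w               # model after the weighted step
        grad_w = -lr * grads @ target_grad(x_trial)  # d(target loss) / d(w_i)
        w = w * np.exp(-md_lr * grad_w)              # multiplicative update keeps w > 0
        w = w / w.sum()                              # renormalize onto the simplex
    return w


def run_toy_example(steps=200, lr=0.1):
    x = np.zeros(2)
    w = np.full(len(LANGS), 1.0 / len(LANGS))        # start from uniform weights
    for _ in range(steps):
        grads = x - centers                          # row i: gradient of language i's loss
        w = update_weights(x, grads, w, lr)
        x = x - lr * grads.T @ w                     # aggregated model update
    return x, w


x_final, w_final = run_toy_example()
for name, weight in zip(LANGS, w_final):
    print(f"{name:>9}: {weight:.4f}")
```

Logging the weight vector at every step is what makes a scheme like this interpretable: each entry can be read as the current contribution of one language. In this toy run the weight of the distant `unrelated` language shrinks toward zero, which loosely mirrors the finding that unrelated languages do not interfere with training.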

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Recommender Systems and Techniques