MALM: Mixing Augmented Language Modeling for Zero-Shot Machine Translation

Kshitij Gupta

arXiv:2210.00320·cs.CL·October 4, 2022

Open Access

TL;DR

This paper introduces MALM, a method that enhances zero-shot multilingual machine translation by combining prompt conditioning, self-supervised pre-training, and data augmentation, reducing off-target language errors.

Contribution

It demonstrates that prompt-conditioned large models, combined with self-supervised pre-training and data augmentation, effectively mitigate off-target language errors in zero-shot translation.

Findings

01

Prompt-conditioned large models do not suffer from off-target language errors.

02

Self-supervised pre-training improves zero-shot translation quality.

03

Data augmentation enhances multilingual translation performance.

Abstract

Large pre-trained language models have brought remarkable progress in NLP. Pre-training and fine-tuning have delivered state-of-the-art performance across text-processing tasks. Data augmentation techniques have also helped build state-of-the-art models for low-resource and zero-resource tasks. Many prior works have attempted to learn a single massively multilingual machine translation model for zero-shot translation. Although those models produce correct translations, their main challenge is that they produce the wrong language for zero-shot translation. The results of this work indicate that prompt-conditioned large models do not suffer from off-target language errors, i.e., errors arising from translation into the wrong language. We empirically demonstrate the effectiveness of self-supervised pre-training and data augmentation for zero-shot multi-lingual machine…
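To make the notion of an off-target language error concrete, the following is a minimal sketch of how one might build a language-conditioned prompt and measure the fraction of outputs that land in the wrong language. Everything here is an assumption for illustration: the prompt template in `build_prompt`, the function `off_target_rate`, and the stand-in detector `toy_detect` are hypothetical names, and none of them is taken from the paper.

```python
from typing import Callable, Sequence


def build_prompt(source_text: str, src_lang: str, tgt_lang: str) -> str:
    """Prefix the source sentence with a translation instruction.

    Hypothetical template: the paper's actual prompt format is not
    described in this abstract.
    """
    return f"Translate {src_lang} to {tgt_lang}: {source_text}"


def off_target_rate(
    outputs: Sequence[str],
    target_lang: str,
    detect_lang: Callable[[str], str],
) -> float:
    """Return the fraction of outputs whose detected language differs
    from the requested target language.

    `detect_lang` is supplied by the caller (any language identifier
    that maps text to a language code); it is an assumption here.
    """
    if not outputs:
        return 0.0
    wrong = sum(1 for text in outputs if detect_lang(text) != target_lang)
    return wrong / len(outputs)


def toy_detect(text: str) -> str:
    """Toy stand-in for a real language identifier, for the demo only."""
    return "de" if text.startswith("Das") else "en"


if __name__ == "__main__":
    print(build_prompt("Hello", "English", "German"))
    # One of the two outputs is not German, so half are off-target.
    print(off_target_rate(["Das ist gut", "This is good"], "de", toy_detect))
```

In practice the toy detector would be replaced by a real language-identification model, and the rate would be computed per zero-shot language pair.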

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications