On Instruction-Finetuning Neural Machine Translation Models

Vikas Raunak; Roman Grundkiewicz; Marcin Junczys-Dowmunt

arXiv:2410.05553·cs.CL·October 10, 2024

TL;DR

This paper introduces instruction finetuning for neural machine translation (NMT) models, enabling a single model to follow multiple instructions and handle diverse translation tasks at a performance level comparable to large language models such as GPT-3.5-Turbo.

Contribution

It presents an instruction-finetuning recipe for NMT models that enables multi-task and multi-modal translation as well as zero-shot composition of instructions.

Findings

01

NMT models can follow multiple instructions simultaneously.

02

Instruction finetuning enables diverse translation tasks to be handled jointly.

03

Performance is comparable to large language models like GPT-3.5-Turbo.
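
To make the first finding concrete, here is a minimal sketch of how several instructions might be attached to one source sentence before it is passed to an NMT model. The `<inst>` markers, the `with_instructions` helper, and the instruction wording are all hypothetical; this page does not describe the input format the authors actually use.

```python
def with_instructions(source: str, instructions: list[str]) -> str:
    """Prefix a source sentence with zero or more instruction strings.

    The <inst> ... </inst> markers are an illustrative assumption,
    not the format used in the paper.
    """
    prefix = " ".join(f"<inst> {text.strip()} </inst>" for text in instructions)
    # With no instructions the prefix is empty, so strip the leading space.
    return f"{prefix} {source}".strip()


# One instruction: a formality-controlled translation request.
single = with_instructions("How are you?", ["Use a formal register."])

# Two instructions combined on a single input (hypothetical wording).
composed = with_instructions(
    "How are you?",
    ["Use a formal register.", "Translate for the medical domain."],
)

print(single)
print(composed)
```

In this sketch, zero-shot composition would mean that the model sees each instruction on its own during finetuning and is then given a combined input such as `composed` only at inference time.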

Abstract

In this work, we introduce instruction finetuning for Neural Machine Translation (NMT) models, which distills instruction following capabilities from Large Language Models (LLMs) into orders-of-magnitude smaller NMT models. Our instruction-finetuning recipe for NMT models enables customization of translations for a limited but disparate set of translation-specific tasks. We show that NMT models are capable of following multiple instructions simultaneously and demonstrate capabilities of zero-shot composition of instructions. We also show that through instruction finetuning, traditionally disparate tasks such as formality-controlled machine translation, multi-domain adaptation as well as multi-modal translations can be tackled jointly by a single instruction finetuned NMT model, at a performance level comparable to LLMs such as GPT-3.5-Turbo. To the best of our knowledge, our work is…

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Attention Is All You Need · Sparse Evolutionary Training · Linear Layer · Residual Connection · Weight Decay · Cosine Annealing · Dropout · Byte Pair Encoding