Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca
Pinzhen Chen, Shaoxiong Ji, Nikolay Bogoychev, Andrey Kutuzov, Barry Haddow, Kenneth Heafield

TL;DR
This paper compares monolingual and multilingual instruction tuning of large language models, finding that multilingual approaches are cost-effective and can outperform monolingual tuning in certain scenarios, especially with downsampled data.
Contribution
It provides an empirical analysis of multilingual instruction tuning strategies using Alpaca data, highlighting cost efficiency and robustness advantages.
Findings
Multilingual tuning matches or exceeds monolingual performance.
Downsampled multilingual data can be as effective and more robust.
Multilingual tuning is cost-efficient under fixed computation budgets.
Abstract
Foundational large language models (LLMs) can be instruction-tuned to perform open-domain question answering, facilitating applications like chat assistants. While such efforts are often carried out in a single language, we empirically analyze cost-efficient strategies for multilingual scenarios. Our study employs the Alpaca dataset and machine translations of it to form multilingual data, which is then used to tune LLMs through either low-rank adaptation or full-parameter training. Under a controlled computation budget, comparisons show that multilingual tuning is on par with or better than tuning a model for each language. Furthermore, multilingual tuning with downsampled data can be as powerful and more robust. Our findings serve as a guide for expanding language support through instruction tuning.
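The idea of multilingual tuning with downsampled data under a fixed budget can be illustrated with a minimal sketch. It assumes the budget is counted in training examples and split evenly across languages; the paper's exact sampling procedure is not stated in this summary, so the even split, the function name `downsample_multilingual`, and the toy data are all hypothetical.

```python
import random


def downsample_multilingual(data_by_lang, budget, seed=0):
    """Build a mixed-language training set of exactly `budget` examples.

    Assumption (not taken from the paper): the budget is divided as evenly
    as possible across languages, and examples are drawn without replacement.
    """
    rng = random.Random(seed)
    langs = sorted(data_by_lang)
    base, extra = divmod(budget, len(langs))

    mixed = []
    for i, lang in enumerate(langs):
        # The first `extra` languages take one more example so quotas sum to budget.
        quota = base + (1 if i < extra else 0)
        examples = data_by_lang[lang]
        if quota > len(examples):
            raise ValueError(f"not enough examples for {lang!r}")
        mixed.extend((lang, ex) for ex in rng.sample(examples, quota))

    # Shuffle so that training batches mix languages.
    rng.shuffle(mixed)
    return mixed


# Toy stand-in for Alpaca and its machine translations (hypothetical data).
toy = {
    "en": [f"en instruction {i}" for i in range(6)],
    "de": [f"de instruction {i}" for i in range(6)],
    "zh": [f"zh instruction {i}" for i in range(6)],
}

# Same budget as one monolingual set (6 examples), spread over three languages.
mixed = downsample_multilingual(toy, budget=6)
for lang, text in mixed:
    print(lang, text)
```

With this setup the multilingual model sees the same number of examples as a single monolingual model, which is one way to read "controlled computation budget"; other budget definitions (tokens, update steps) would need a different quota rule.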
Code & Models
- 🤗 dice-research/lola_v1_alpaca_instructions_multilingual · model · 5 downloads
- 🤗 RichardErkhov/HPLT_-_sft-fpft-cs-bloom-560m-gguf · model · 183 downloads
- 🤗 RichardErkhov/HPLT_-_sft-fpft-de-bloom-560m-gguf · model · 172 downloads
- 🤗 RichardErkhov/HPLT_-_sft-fpft-fr-bloom-560m-gguf · model · 128 downloads
- 🤗 RichardErkhov/HPLT_-_sft-fpft-en-bloom-560m-gguf · model · 22 downloads
- 🤗 RichardErkhov/HPLT_-_sft-fpft-es-bloom-560m-gguf · model · 17 downloads
- 🤗 RichardErkhov/HPLT_-_sft-fpft-fi-bloom-560m-gguf · model · 81 downloads
- 🤗 RichardErkhov/HPLT_-_sft-fpft-ru-bloom-560m-gguf · model · 26 downloads
- 🤗 RichardErkhov/HPLT_-_sft-fpft-zh-bloom-560m-gguf · model · 13 downloads
- 🤗 RichardErkhov/HPLT_-_sft-fpft-cs-bloom-1b1-gguf · model · 42 downloads
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
