RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs
John Dang, Arash Ahmadian, Kelly Marchisio, Julia Kreutzer, Ahmet Üstün, Sara Hooker

TL;DR
This paper advances multilingual large language model alignment by introducing a scalable feedback data generation method, achieving state-of-the-art performance across 23 languages and demonstrating the benefits of cross-lingual transfer and larger datasets.
Contribution
The authors develop a novel scalable method for multilingual feedback data generation and demonstrate its effectiveness in improving multilingual LLM alignment and performance.
Findings
Achieved 54.4% win-rate against Aya 23 8B, the current multilingual SOTA.
Expanded alignment techniques to 23 languages covering half the world's population.
Showed benefits of cross-lingual transfer and larger datasets in preference training.
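The win-rate figure above comes from pairwise comparisons between two models' responses. As a minimal sketch of how such a number is computed, assuming each comparison is labeled "win", "loss", or "tie" from the evaluated model's point of view and that ties count toward the total (the function name, labels, and tie handling are illustrative assumptions, not the authors' evaluation code):

```python
from collections import Counter


def win_rate(judgments):
    """Return the fraction of pairwise comparisons labeled 'win'.

    `judgments` is a list of labels, one per prompt. Counting ties in the
    denominator is an assumption of this sketch; other conventions exist.
    """
    counts = Counter(judgments)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no judgments provided")
    return counts["win"] / total


# Hypothetical judgments for five prompts (not data from the paper).
example = ["win", "loss", "win", "tie", "win"]
print(f"win-rate: {win_rate(example):.1%}")
```

Under this convention, a win-rate above 50% means the evaluated model's response was preferred in more than half of all comparisons.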
Abstract
Preference optimization techniques have become a standard final stage for training state-of-the-art large language models (LLMs). However, despite widespread adoption, the vast majority of work to date has focused on first-class citizen languages like English and Chinese. This not only captures a small fraction of the languages in the world, but also makes it unclear which aspects of current state-of-the-art research transfer to a multilingual setting. In this work, we perform an exhaustive study to achieve a new state-of-the-art in aligning multilingual LLMs. We introduce a novel, scalable method for generating high-quality multilingual feedback data to balance data coverage. We establish the benefits of cross-lingual transfer and increased dataset size in preference training. Our preference-trained model achieves a 54.4% win-rate against Aya 23 8B, the current state-of-the-art multilingual LLM in…
Code & Models
- 🤗 CohereLabs/aya-expanse-8b (model) · 16k dl · ♡ 423
- 🤗 CohereLabs/aya-expanse-32b (model) · 6.7k dl · ♡ 289
- 🤗 jth01/aya-expanse-8b-5.0bpw-exl2 (model) · 2 dl
- 🤗 lucyknada/CohereForAI_aya-expanse-8b-exl2 (model) · ♡ 2
- 🤗 duyntnet/aya-expanse-8b-imatrix-GGUF (model) · 47 dl
- 🤗 lucyknada/CohereForAI_aya-expanse-32b-exl2 (model) · ♡ 2
- 🤗 Andrewwwwww/aya-expanse-32b (model) · 3 dl
- 🤗 Svngoku/Aya-Expanse-8B-French (model) · 2 dl
- 🤗 QuantFactory/aya-expanse-8b-GGUF (model) · 194 dl · ♡ 5
- 🤗 duyntnet/aya-expanse-32b-imatrix-GGUF (model) · 62 dl
Taxonomy
Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Translation Studies and Practices
