Multiple Choice Learning of Low-Rank Adapters for Language Modeling
Victor Letzelter, Hugo Malard, Mathieu Fontaine, Gaël Richard, Slim Essid, Andrei Bursuc, Patrick Pérez

TL;DR
This paper introduces LoRA-MCL, a novel training scheme combining Multiple Choice Learning and Low-Rank Adaptation to generate diverse and plausible sentence continuations in language models, addressing the ill-posed nature of language prediction.
Contribution
It presents a new method that extends language models with MCL and Low-Rank Adaptation, providing a theoretical framework and practical implementation for diverse text generation.
Findings
Achieves high diversity and relevance in generated outputs
Effectively handles ambiguity in language modeling
Validated experimentally on visual captioning, audio captioning, and machine translation
Abstract
We propose LoRA-MCL, a training scheme that extends next-token prediction in language models with a method designed to decode diverse, plausible sentence continuations at inference time. Traditional language modeling is an intrinsically ill-posed problem: given a context, multiple "futures" may be equally plausible. Our approach leverages Multiple Choice Learning (MCL) and the Winner-Takes-All loss to efficiently handle ambiguity through Low-Rank Adaptation. We provide a theoretical interpretation of applying MCL to language modeling, assuming the data is generated from a mixture of distributions. We illustrate the proposed approach using mixtures of Markov chains. We then demonstrate with experiments on visual and audio captioning, as well as machine translation, that our method achieves high diversity and relevance in generated outputs. The accompanying code and a general-purpose…
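To make the Winner-Takes-All idea in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes K hypotheses (for instance, one low-rank adapter each) that each assign probabilities to the target tokens of one training sequence. The function names and the probability values are hypothetical and chosen only for illustration.

```python
import math

def sequence_nll(token_probs):
    """Negative log-likelihood of one target sequence, given the
    probability a hypothesis assigns to each target token."""
    return -sum(math.log(p) for p in token_probs)

def winner_takes_all(per_hypothesis_token_probs):
    """Return (winner_index, loss) for one training example.

    Under Winner-Takes-All, only the hypothesis with the lowest
    sequence-level loss (the "winner") contributes to the loss,
    so only that hypothesis would be updated for this example.
    """
    nlls = [sequence_nll(p) for p in per_hypothesis_token_probs]
    winner = min(range(len(nlls)), key=nlls.__getitem__)
    return winner, nlls[winner]

# Hypothetical probabilities that K = 3 hypotheses assign to the
# three target tokens of a single sequence (illustrative values).
probs = [
    [0.5, 0.5, 0.5],
    [0.9, 0.8, 0.9],
    [0.1, 0.2, 0.3],
]
winner, loss = winner_takes_all(probs)
print(winner, loss)
```

In this toy example the second hypothesis (index 1) assigns the highest probability to every target token, so it is selected as the winner. Over many examples, this selection rule lets different hypotheses specialize on different plausible continuations.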
