On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting
Tomasz Korbak, Hady Elsahar, Germán Kruszewski, Marc Dymetman

TL;DR
This paper explores the theoretical links between Reward Maximization (RM) and Distribution Matching (DM) for fine-tuning language models, and shows that RL concepts such as baselines can be carried over to DM to improve training stability and sample efficiency while avoiding catastrophic forgetting.
Contribution
It establishes a theoretical connection between the Reward Maximization (RM) and Distribution Matching (DM) paradigms and imports the RL concept of a baseline into DM methods, making language model fine-tuning more stable and efficient.
Findings
Baseline integration improves training stability.
Distribution matching achieves better constraint satisfaction.
Enhanced sample efficiency in controllable language generation.
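To illustrate why a baseline can stabilize training, here is a minimal sketch of the classic policy-gradient baseline on a toy four-outcome policy. It is not the paper's DM method. The logits, the reward values, and the helper name `grad_samples` are assumptions made up for the example. The sketch compares single-sample gradient estimates with and without subtracting the mean reward: both estimators target the same exact gradient, but the baselined one has much lower variance.

```python
import math
import random
import statistics


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    z = sum(exps)
    return [e / z for e in exps]


def grad_samples(logits, rewards, baseline, n, rng):
    """Single-sample estimates of d E[r] / d logits[0] (hypothetical helper).

    Each estimate is (r(x) - baseline) * d log pi(x) / d logits[0],
    with x sampled from the softmax policy.
    """
    probs = softmax(logits)
    estimates = []
    for _ in range(n):
        x = rng.choices(range(len(probs)), weights=probs)[0]
        score = (1.0 if x == 0 else 0.0) - probs[0]  # d log pi(x) / d logits[0]
        estimates.append((rewards[x] - baseline) * score)
    return estimates


rng = random.Random(0)
logits = [0.0, 0.0, 0.0, 0.0]        # toy uniform policy (assumed)
rewards = [10.0, 11.0, 12.0, 13.0]   # toy rewards with a large offset (assumed)

probs = softmax(logits)
mean_reward = sum(p * r for p, r in zip(probs, rewards))

# Exact gradient of E[r] with respect to logits[0], for comparison.
exact_grad = probs[0] * (rewards[0] - mean_reward)

plain = grad_samples(logits, rewards, 0.0, 20000, rng)
with_baseline = grad_samples(logits, rewards, mean_reward, 20000, rng)

print("exact gradient:        ", exact_grad)
print("mean without baseline: ", statistics.mean(plain))
print("mean with baseline:    ", statistics.mean(with_baseline))
print("variance w/o baseline: ", statistics.variance(plain))
print("variance with baseline:", statistics.variance(with_baseline))
```

Subtracting a constant from the reward leaves the expected gradient unchanged because the score function has zero mean under the policy, so only the variance of the estimate is affected.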
Abstract
The availability of large pre-trained models is changing the landscape of Machine Learning research and practice, moving from a training-from-scratch to a fine-tuning paradigm. While in some applications the goal is to "nudge" the pre-trained distribution towards preferred outputs, in others it is to steer it towards a different distribution over the sample space. Two main paradigms have emerged to tackle this challenge: Reward Maximization (RM) and, more recently, Distribution Matching (DM). RM applies standard Reinforcement Learning (RL) techniques, such as Policy Gradients, to gradually increase the reward signal. DM prescribes to first make explicit the target distribution that the model is fine-tuned to approximate. Here we explore the theoretical connections between the two paradigms, and show that methods such as KL-control developed for RM can also be construed as belonging to…
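As a brief illustration of the connection the abstract points to, the standard KL-control objective can be rewritten as a divergence from an explicit target distribution. In the sketch below, $a$ denotes the pre-trained model, $r$ the reward, $\beta > 0$ the KL coefficient, and $\pi$ the fine-tuned policy. These symbols are assumptions chosen for illustration and are not taken from this page.

```latex
\begin{align*}
J(\pi) &= \mathbb{E}_{x \sim \pi}\,[r(x)] - \beta\, \mathrm{KL}(\pi \,\|\, a), \\
p(x) &= \frac{1}{Z}\, a(x)\, \exp\!\big(r(x)/\beta\big),
\qquad Z = \sum_x a(x)\, \exp\!\big(r(x)/\beta\big), \\
\mathrm{KL}(\pi \,\|\, p)
  &= \mathbb{E}_{x \sim \pi}\!\left[\log \frac{\pi(x)}{a(x)} - \frac{r(x)}{\beta}\right] + \log Z
   = \mathrm{KL}(\pi \,\|\, a) - \frac{1}{\beta}\,\mathbb{E}_{x \sim \pi}\,[r(x)] + \log Z, \\
J(\pi) &= -\beta\, \mathrm{KL}(\pi \,\|\, p) + \beta \log Z .
\end{align*}
```

Since $\beta \log Z$ does not depend on $\pi$, maximizing the KL-regularized reward is the same as minimizing $\mathrm{KL}(\pi \,\|\, p)$, which is one way to read KL-control as matching the target distribution $p$.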
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
