Controlling Conditional Language Models without Catastrophic Forgetting
Tomasz Korbak, Hady Elsahar, German Kruszewski, Marc Dymetman

TL;DR
This paper introduces Conditional DPG (CDPG), a method to adapt pretrained language models for specific control tasks without losing their general capabilities, demonstrated across multiple tasks and models.
Contribution
The paper extends distributional policy gradients to conditional tasks, enabling controlled adaptation of pretrained models without catastrophic forgetting.
Findings
CDPG effectively aligns models with control objectives
It prevents catastrophic forgetting during fine-tuning
Demonstrated on translation, summarization, and code generation
Abstract
Machine learning is shifting towards general-purpose pretrained generative models, trained in a self-supervised manner on large amounts of data, which can then be applied to solve a large number of tasks. However, due to their generic training methodology, these models often fail to meet some of the downstream requirements (e.g., hallucinations in abstractive summarization or style violations in code generation). This raises the important question of how to adapt pre-trained generative models to meet all requirements without destroying their general capabilities ("catastrophic forgetting"). Recent work has proposed to solve this problem by representing task-specific requirements through energy-based models (EBMs) and approximating these EBMs using distributional policy gradients (DPG). Despite its effectiveness, this approach is however limited to unconditional distributions. In this…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Dense Connections · Gated Linear Unit · Byte Pair Encoding · Inverse Square Root Schedule · Adafactor
