Controlling Conditional Language Models without Catastrophic Forgetting

Tomasz Korbak; Hady Elsahar; German Kruszewski; Marc Dymetman

arXiv:2112.00791 · cs.LG · June 22, 2022 · 6 cites

TL;DR

This paper introduces Conditional DPG (CDPG), a method for adapting pretrained language models to specific control tasks without losing their general capabilities, and demonstrates it across multiple tasks and models.

Contribution

The paper extends distributional policy gradients to conditional tasks, enabling controlled adaptation of pretrained models without catastrophic forgetting.

Findings

01. CDPG effectively aligns models with control objectives.
02. CDPG avoids catastrophic forgetting during fine-tuning.
03. The method is demonstrated on translation, summarization, and code generation.

Abstract

Machine learning is shifting towards general-purpose pretrained generative models, trained in a self-supervised manner on large amounts of data, which can then be applied to solve a large number of tasks. However, due to their generic training methodology, these models often fail to meet some of the downstream requirements (e.g., hallucinations in abstractive summarization or style violations in code generation). This raises the important question of how to adapt pretrained generative models to meet all requirements without destroying their general capabilities ("catastrophic forgetting"). Recent work has proposed to solve this problem by representing task-specific requirements through energy-based models (EBMs) and approximating these EBMs using distributional policy gradients (DPG). Despite its effectiveness, this approach is, however, limited to unconditional distributions. In this…
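
To make the unconditional setup mentioned in the abstract concrete, here is a minimal toy sketch, not the authors' implementation. It assumes the EBM takes the product form P(x) = a(x) · b(x), where a is a "pretrained" distribution over a four-item vocabulary and b is a binary constraint, and it fits a categorical policy to the normalized target with importance-weighted, DPG-style updates. The function name `dpg_toy`, the hyperparameters, and the choice to sample from the current policy rather than from a separate proposal are all illustrative assumptions.

```python
import numpy as np

def dpg_toy(a, b, steps=3000, lr=0.1, batch=64, seed=0):
    """Fit a categorical policy to p(x) proportional to a(x) * b(x).

    Hypothetical toy: `a` plays the role of the pretrained model,
    `b` is a 0/1 constraint, and P = a * b is the unnormalized EBM.
    """
    rng = np.random.default_rng(seed)
    n = len(a)
    P = a * b                      # unnormalized EBM scores
    theta = np.log(a)              # start the policy at the pretrained model
    for _ in range(steps):
        pi = np.exp(theta - theta.max())
        pi /= pi.sum()
        # Simplification (assumption): sample from the current policy itself.
        x = rng.choice(n, size=batch, p=pi)
        w = P[x] / pi[x]           # importance weights P(x) / q(x)
        # Sum over samples of w * grad_theta log pi(x), where
        # grad_theta log pi(x) = onehot(x) - pi for softmax logits.
        grad = np.bincount(x, weights=w, minlength=n) - w.sum() * pi
        theta += lr * grad / batch
    pi = np.exp(theta - theta.max())
    return pi / pi.sum()

a = np.array([0.4, 0.3, 0.2, 0.1])   # toy "pretrained" distribution
b = np.array([1.0, 0.0, 1.0, 0.0])   # toy binary constraint
policy = dpg_toy(a, b)
target = a * b / (a * b).sum()
print(np.round(policy, 3), np.round(target, 3))
```

In expectation the update direction is P − Zπ, where Z is the sum of P, so the fixed point is the normalized target P / Z. Because the policy starts at a and is pulled toward a distribution that reweights a rather than replacing it, the relative preferences of a among the allowed items are preserved, which is the intuition behind avoiding catastrophic forgetting in this toy setting.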

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Dense Connections · Gated Linear Unit · Byte Pair Encoding · Inverse Square Root Schedule · Adafactor