Conditional Language Learning with Context

Xiao Zhang; Miao Li; Ji Wu

arXiv:2406.01976·cs.CL·June 5, 2024

TL;DR

This paper introduces conditional finetuning for language models, which lets them learn knowledge useful for downstream tasks while avoiding useless corpus statistics such as topic biases. This reduces forgetting during domain finetuning and may benefit lifelong learning.

Contribution

It proposes a simple modification to causal language modeling that conditions on context, allowing selective learning and reducing unwanted biases during finetuning.
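
As a rough illustration of the modification described above, the two objectives can be written side by side. The notation is an assumption made here, not taken from the paper: x_1..x_T are the document tokens, c is the prepended context, and theta are the model parameters. The sum in the second line runs over document tokens only, which is also an assumption of this sketch.

```latex
% Standard causal language modeling on a document x_1, ..., x_T
\mathcal{L}_{\text{CLM}}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(x_t \mid x_{<t}\right)

% Conditional finetuning: the same objective, conditioned on a context c
\mathcal{L}_{\text{cond}}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(x_t \mid c,\, x_{<t}\right)
```

Read this way, any corpus statistic that is already predictable from c (for example the overall topic) contributes little to the second loss, which is one way to interpret the "explain away" effect.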

Findings

01  Conditional finetuning reduces topic bias learning.

02  Models exhibit less forgetting and better stability.

03  Improved downstream task performance.

Abstract

Language models can learn sophisticated language understanding skills from fitting raw text. They also unselectively learn useless corpus statistics and biases, especially during finetuning on domain-specific corpora. In this paper, we propose a simple modification to causal language modeling called conditional finetuning, which performs language modeling conditioned on a context. We show that a context can "explain away" certain corpus statistics and make the model avoid learning them. In this fashion, conditional finetuning achieves selective learning from a corpus, learning knowledge useful for downstream tasks while avoiding learning useless corpus statistics like topic biases. This selective learning effect leads to less forgetting and a better stability-plasticity tradeoff in domain finetuning, potentially benefiting lifelong learning with language models.
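
One plausible way to set up "language modeling conditioned on a context" is to prepend the context tokens to each document and exclude them from the loss. The sketch below is a minimal illustration under that assumption, not the authors' implementation; the function names and token IDs are hypothetical. It relies on the common convention that the label value -100 is skipped by PyTorch's cross-entropy loss by default.

```python
# Minimal sketch of conditional finetuning data preparation (assumed setup,
# not the paper's code): prepend a context, compute loss on the document only.

IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def build_conditional_example(context_ids, document_ids):
    """Concatenate context and document tokens, masking the context in labels."""
    input_ids = list(context_ids) + list(document_ids)
    # Context positions get IGNORE_INDEX so they contribute nothing to the loss;
    # document positions keep their token IDs as prediction targets.
    labels = [IGNORE_INDEX] * len(context_ids) + list(document_ids)
    return {"input_ids": input_ids, "labels": labels}


def count_supervised_tokens(labels):
    """Number of positions that still contribute to the loss."""
    return sum(1 for label in labels if label != IGNORE_INDEX)


# Hypothetical token IDs: three context tokens, four document tokens.
example = build_conditional_example([11, 12, 13], [21, 22, 23, 24])
```

Here `example["labels"]` keeps only the four document tokens as targets, so the model is trained to predict the document given the context rather than to predict the context itself. Labels are left aligned with `input_ids`, which suits training code that shifts them internally for next-token prediction, as Hugging Face causal language models do.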

Code & Models

Repositories

xiaozeroone/conditional_finetune (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques