Self-conditioning pre-trained language models

Xavier Suau; Luca Zappella; Nicholas Apostoloff

arXiv:2110.02802·cs.CL·June 16, 2023·1 cites

TL;DR

This paper investigates how pre-trained Transformer language models generate text by identifying and activating specific expert units responsible for concepts, enabling controlled generation and bias correction without additional training.

Contribution

The paper introduces a method to identify and activate expert units in Transformer-based language models (TLMs) for concept conditioning, advancing understanding of their internal mechanisms and enabling bias mitigation.

Findings

01 Small number of units can steer generation

02 Effective bias correction without fine-tuning

03 Method achieves gender parity with lower perplexity

Abstract

In this paper we aim to investigate the mechanisms that guide text generation with pre-trained Transformer-based Language Models (TLMs). Grounded in the Product of Experts formulation by Hinton (1999), we describe a generative mechanism that exploits expert units which naturally exist in TLMs. Such units are responsible for detecting concepts in the input and conditioning text generation on such concepts. We describe how to identify expert units and how to activate them during inference in order to induce any desired concept in the generated output. We find that activating a surprisingly small number of units is sufficient to steer text generation (as few as 3 units in a model with 345M parameters). While the objective of this work is to learn more about how TLMs work, we show that our method is effective for conditioning without fine-tuning or using extra parameters, even on…
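
The two steps named in the abstract, identifying expert units and activating them at inference, can be illustrated with a small sketch. This is not taken from the paper's code. It assumes that a unit's expertise is scored by how well its activation ranks sentences that contain the concept (average precision), and that "activating" a unit means overwriting its value with its mean activation on concept sentences. Both choices, all function names, and the toy data are assumptions made for illustration only.

```python
from statistics import mean


def average_precision(scores, labels):
    """Average precision of `scores` as a ranking of binary `labels` (assumed metric)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, precisions = 0, []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precisions.append(hits / rank)
    return mean(precisions) if precisions else 0.0


def rank_expert_units(activations, labels):
    """Score every unit as a concept detector; return (unit order, per-unit AP)."""
    n_units = len(activations[0])
    ap = [
        average_precision([row[u] for row in activations], labels)
        for u in range(n_units)
    ]
    order = sorted(range(n_units), key=lambda u: ap[u], reverse=True)
    return order, ap


def concept_values(activations, labels):
    """Mean activation of each unit over the sentences that contain the concept."""
    positives = [row for row, label in zip(activations, labels) if label]
    return [mean(row[u] for row in positives) for u in range(len(activations[0]))]


def activate_experts(activation, expert_units, values):
    """Return a copy of `activation` with the chosen expert units overwritten."""
    out = list(activation)
    for u in expert_units:
        out[u] = values[u]
    return out


# Hypothetical toy data: 4 sentences x 3 units; label 1 means the concept is present.
toy_activations = [
    [0.9, 0.1, 0.5],
    [0.8, 0.7, 0.4],
    [0.2, 0.9, 0.6],
    [0.1, 0.3, 0.5],
]
toy_labels = [1, 1, 0, 0]

order, ap = rank_expert_units(toy_activations, toy_labels)
values = concept_values(toy_activations, toy_labels)
steered = activate_experts([0.0, 0.0, 0.0], order[:1], values)
print(order[0], ap[order[0]], steered)
```

In a real model the activation vectors would be read from, and written back into, the network's intermediate layers during generation; the lists here only stand in for that step.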

Code & Models

Repositories

apple/ml-selfcond
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis