Revision Transformers: Instructing Language Models to Change their Values

Felix Friedrich; Wolfgang Stammer; Patrick Schramowski; Kristian Kersting

arXiv:2210.10332·cs.CL·July 26, 2023

TL;DR

This paper introduces the Revision Transformer (RiT), a new method for efficiently updating large language models' knowledge and values, especially moral concepts, through user-guided revisions without extensive retraining.

Contribution

The paper proposes RiT, a framework that combines a large pre-trained LM with a clearly structured revision engine, so that users can interactively update the model to correct biases and adapt its values.

Findings

01

RiT enables effective model updates with minimal data.

02

User feedback improves the revision of the model's moral knowledge.

03

RiT demonstrates strong performance on moral dataset revisions.

Abstract

Current transformer language models (LMs) are large-scale models with billions of parameters. They have been shown to achieve high performance on a variety of tasks but are also prone to shortcut learning and bias. Addressing such incorrect model behavior via parameter adjustments is very costly. This is particularly problematic for updating dynamic concepts, such as moral values, which vary culturally or interpersonally. In this work, we question the current common practice of storing all information in the model parameters and propose the Revision Transformer (RiT) to facilitate easy model updating. The specific combination of a large-scale pre-trained LM, which inherently but also diffusely encodes world knowledge, with a clearly structured revision engine makes it possible to update the model's knowledge with little effort and the help of user interaction. We exemplify RiT on a moral…
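The idea of keeping revisions outside the model parameters can be illustrated with a small toy sketch. It is not the authors' implementation: it assumes the revision engine is a store of user corrections looked up by text similarity and prepended to the query as context. The names `RevisionStore`, `build_prompt`, the bag-of-words `embed` function, and the threshold value are all hypothetical; a real system would presumably use a learned sentence encoder and pass the resulting prompt to the LM.

```python
import math
import re
from collections import Counter


def embed(text):
    # Toy bag-of-words "embedding" (assumption: stands in for a learned encoder).
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class RevisionStore:
    """Hypothetical external memory of user corrections."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold  # minimum similarity for a revision to apply
        self.entries = []           # list of (embedding, correction) pairs

    def add(self, query, correction):
        # Adding a revision touches only the store, never any model parameters.
        self.entries.append((embed(query), correction))

    def lookup(self, query):
        # Return the correction of the most similar stored query, if close enough.
        q = embed(query)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best is None or cosine(q, best[0]) < self.threshold:
            return None
        return best[1]


def build_prompt(store, query):
    # Prepend a matching correction as context; otherwise leave the query unchanged.
    correction = store.lookup(query)
    return query if correction is None else f"{correction}\n{query}"


store = RevisionStore()
store.add("Should I kill time?", "Killing time is fine.")
print(build_prompt(store, "Should I kill time today?"))
print(build_prompt(store, "What is the capital of France?"))
```

In this sketch, a user revision is a single `add` call, which is the sense in which updating requires little effort: the underlying LM stays frozen, and unrelated queries pass through unchanged.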

Code & Models

Repositories

ml-research/revision-transformer
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Softmax · Adam · Label Smoothing · Absolute Position Encodings · Layer Normalization · Byte Pair Encoding