Revision Transformers: Instructing Language Models to Change their Values
Felix Friedrich, Wolfgang Stammer, Patrick Schramowski, Kristian Kersting

TL;DR
This paper introduces the Revision Transformer (RiT), a new method for efficiently updating large language models' knowledge and values, especially moral concepts, through user-guided revisions without extensive retraining.
Contribution
The paper proposes RiT, a novel framework combining large pre-trained LMs with a structured revision engine for easy, user-interactive model updates, addressing bias and value changes.
Findings
RiT enables effective model updates with minimal data.
User feedback improves the revision of the model's moral knowledge.
RiT demonstrates strong performance on moral dataset revisions.
Abstract
Current transformer language models (LMs) are large-scale models with billions of parameters. They have been shown to provide high performance on a variety of tasks but are also prone to shortcut learning and bias. Addressing such incorrect model behavior via parameter adjustments is very costly. This is particularly problematic for updating dynamic concepts, such as moral values, which vary culturally or interpersonally. In this work, we question the current common practice of storing all information in the model parameters and propose the Revision Transformer (RiT) to facilitate easy model updating. The specific combination of a large-scale pre-trained LM that inherently but also diffusely encodes world knowledge with a clear-structured revision engine makes it possible to update the model's knowledge with little effort and the help of user interaction. We exemplify RiT on a moral…
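The idea in the abstract of keeping revisions outside the model parameters can be pictured with a small retrieval sketch. This is an illustration under stated assumptions, not the authors' implementation: the names `RevisionStore`, `embed`, and `build_prompt`, the similarity threshold of 0.5, and the prompt template are all hypothetical, and a toy bag-of-words cosine similarity stands in for whatever learned encoder a real system would use. The language model itself is left out; the sketch only shows how a stored user correction could be retrieved and placed in front of a query.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding' (assumption: stand-in for a real sentence encoder)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class RevisionStore:
    """Hypothetical external memory of user corrections, kept outside the LM weights."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold  # assumed cut-off; would need tuning in practice
        self.entries = []  # list of (embedding, question, revised answer)

    def add(self, question, revised_answer):
        """Store one user-provided revision; no model parameters are touched."""
        self.entries.append((embed(question), question, revised_answer))

    def retrieve(self, query):
        """Return the most similar stored revision, or None if nothing is close enough."""
        q = embed(query)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best is None or cosine(q, best[0]) < self.threshold:
            return None
        return best[1], best[2]


def build_prompt(store, query):
    """Prepend a retrieved revision as context; otherwise pass the query through unchanged."""
    hit = store.retrieve(query)
    if hit is None:
        return query
    question, answer = hit
    return f"Context: {question} {answer}\nQuestion: {query}"


store = RevisionStore()
store.add("Should I eat meat?", "It depends on your personal values.")
prompt_with_revision = build_prompt(store, "Should I eat meat today?")
prompt_unrelated = build_prompt(store, "How do I travel to Paris?")
```

In this sketch, updating the system's behavior means calling `store.add(...)` rather than retraining: a query close to a stored revision is augmented with it, while an unrelated query reaches the model unchanged.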
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Softmax · Adam · Label Smoothing · Absolute Position Encodings · Layer Normalization · Byte Pair Encoding
