PREADD: Prefix-Adaptive Decoding for Controlled Text Generation
Jonathan Pei, Kevin Yang, and Dan Klein

TL;DR
PREADD is a flexible method for controlled text generation that needs no external model: it linearly combines the output logits produced from a raw prompt and from a prefix-prepended prompt to control attributes such as toxicity, gender bias, and sentiment.
Contribution
It presents a prefix-adaptive decoding technique that enables both positive and negative attribute control without auxiliary models, outperforming prompting baselines and an auxiliary-expert control method.
Findings
Outperforms prompting baselines and an auxiliary-expert control method by 12% or more in relative gain on the main metrics for each task.
Reduces toxicity and gender bias, and controls sentiment.
Does not require external auxiliary models for control.
Abstract
We propose Prefix-Adaptive Decoding (PREADD), a flexible method for controlled text generation. Unlike existing methods that use auxiliary expert models to control for attributes, PREADD does not require an external model, instead relying on linearly combining output logits from multiple prompts. Specifically, PREADD contrasts the output logits generated using a raw prompt against those generated using a prefix-prepended prompt, enabling both positive and negative control with respect to any attribute encapsulated by the prefix. We evaluate PREADD on three tasks -- toxic output mitigation, gender bias reduction, and sentiment control -- and find that PREADD outperforms not only prompting baselines, but also an auxiliary-expert control method, by 12% or more in relative gain on our main metrics for each task.
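The logit contrast described in the abstract can be illustrated with a minimal sketch. This summary does not give the exact weighting scheme, so the form below (raw log-probabilities plus a scaled difference between prefixed and raw log-probabilities) is an assumption, as are the strength parameter `alpha`, the function names, and the toy three-token vocabulary. No language model is called; the two logit vectors stand in for the model's next-token outputs under the raw prompt and the prefix-prepended prompt.

```python
import math


def log_softmax(logits):
    """Normalize a list of logits into log-probabilities (numerically stable)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def combine_logits(raw_logits, prefixed_logits, alpha):
    """Hypothetical linear combination of raw and prefix-conditioned outputs.

    alpha > 0 steers generation toward the attribute expressed by the prefix,
    alpha < 0 steers away from it, and alpha = 0 recovers the raw prompt.
    This weighting is an assumed form, not taken from the summary above.
    """
    raw = log_softmax(raw_logits)
    pre = log_softmax(prefixed_logits)
    combined = [r + alpha * (p - r) for r, p in zip(raw, pre)]
    return log_softmax(combined)


# Toy next-token logits over a three-token vocabulary (illustrative values).
raw_logits = [2.0, 1.0, 0.5]       # model conditioned on the raw prompt
prefixed_logits = [0.5, 1.0, 2.5]  # model conditioned on prefix + prompt

# Negative control: push probability away from what the prefix encourages.
steered_away = combine_logits(raw_logits, prefixed_logits, alpha=-1.0)
probs = [math.exp(v) for v in steered_away]
print(probs)
```

In this sketch, a prefix that encourages an undesired attribute (for example, toxic text) combined with a negative `alpha` lowers the probability of the tokens that the prefix favors, which corresponds to the negative control mentioned in the abstract. A positive `alpha` would amplify the prefix instead.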
Taxonomy
Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · Natural Language Processing Techniques
