TL;DR
This paper introduces two innovative unsupervised approaches for removing toxicity from text using large pre-trained neural models, achieving state-of-the-art results in style transfer for toxicity mitigation.
Contribution
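The paper proposes two novel unsupervised methods for text detoxification: one guides a paraphrasing model with small style-conditional language models, and the other uses BERT to replace toxic words with non-offensive synonyms. It also presents the first large-scale comparative evaluation of style transfer models on toxicity removal.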
Findings
Both methods achieve new state-of-the-art results.
The style-guided paraphraser removes toxicity while keeping the content of the text.
The BERT-based word replacement is made more flexible by letting BERT fill a mask token with a variable number of words.
Abstract
We present two novel unsupervised methods for eliminating toxicity in text. Our first method combines two recent ideas: (1) guidance of the generation process with small style-conditional language models and (2) use of paraphrasing models to perform style transfer. We use a well-performing paraphraser guided by style-trained language models to keep the text content and remove toxicity. Our second method uses BERT to replace toxic words with their non-offensive synonyms. We make the method more flexible by enabling BERT to replace mask tokens with a variable number of words. Finally, we present the first large-scale comparative study of style transfer models on the task of toxicity removal. We compare our models with a number of methods for style transfer. The models are evaluated in a reference-free way using a combination of unsupervised style transfer metrics. Both methods we suggest…
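The guidance idea in the first method can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a Bayes-rule re-weighting in which the paraphraser's next-token probabilities are multiplied by the probability that the token belongs to the neutral style, estimated from two class-conditional language models with equal priors. The function name `reweight_next_token`, the `weight` parameter, and all probabilities are hypothetical and chosen only for illustration.

```python
from typing import Dict


def reweight_next_token(
    paraphraser_probs: Dict[str, float],
    neutral_lm_probs: Dict[str, float],
    toxic_lm_probs: Dict[str, float],
    weight: float = 2.0,
) -> Dict[str, float]:
    """Re-weight a next-token distribution toward the neutral style.

    Assumed rule (illustrative, not taken from the paper):
        score(token) = P_paraphraser(token) * P(neutral | token) ** weight
    """
    scores = {}
    for token, p in paraphraser_probs.items():
        p_neutral = neutral_lm_probs[token]
        p_toxic = toxic_lm_probs[token]
        # Bayes rule with equal class priors gives P(neutral | token).
        posterior_neutral = p_neutral / (p_neutral + p_toxic)
        scores[token] = p * posterior_neutral ** weight
    # Normalize so the guided scores form a probability distribution again.
    total = sum(scores.values())
    return {token: s / total for token, s in scores.items()}


# Toy next-token distributions over a three-word vocabulary (made-up numbers).
paraphraser = {"idiot": 0.5, "person": 0.3, "fool": 0.2}
neutral_lm = {"idiot": 0.01, "person": 0.60, "fool": 0.05}
toxic_lm = {"idiot": 0.40, "person": 0.10, "fool": 0.30}

guided = reweight_next_token(paraphraser, neutral_lm, toxic_lm)
best = max(guided, key=guided.get)
```

With these toy numbers, the paraphraser alone prefers the toxic token, while the guided distribution shifts most of the probability mass to the neutral one. A larger `weight` would push the output further toward the neutral style, at the cost of deviating more from the paraphraser's own preferences.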
Taxonomy
Methods: Attention Is All You Need · Linear Layer · Weight Decay · WordPiece · Layer Normalization · Dense Connections · Attention Dropout · Multi-Head Attention · Linear Warmup With Linear Decay
