Unsupervised Text Style Transfer with Padded Masked Language Models

Eric Malmi; Aliaksei Severyn; Sascha Rothe

arXiv:2010.01054·cs.CL·October 5, 2020

TL;DR

This paper introduces Masker, an unsupervised text-editing method for style transfer. It trains masked language models on the source and target domains, uses their disagreement to locate style-specific spans, and rewrites those spans without parallel data. It performs competitively on sentence fusion and sentiment transfer.

Contribution

The paper presents Masker, an unsupervised style transfer approach that uses source- and target-domain masked language models to find the spans to edit, and a padded MLM variant to fill them in, so the number of inserted tokens does not have to be fixed in advance.

Findings

01

Performs competitively on style transfer tasks in a fully unsupervised setting.

02

Improves supervised methods' accuracy by over 10 percentage points in low-resource settings when they are pre-trained on silver training data.

03

Effective in sentence fusion and sentiment transfer.

Abstract

We propose Masker, an unsupervised text-editing method for style transfer. To tackle cases when no parallel source-target pairs are available, we train masked language models (MLMs) for both the source and the target domain. Then we find the text spans where the two models disagree the most in terms of likelihood. This allows us to identify the source tokens to delete to transform the source text to match the style of the target domain. The deleted tokens are replaced with the target MLM, and by using a padded MLM variant, we avoid having to predetermine the number of inserted tokens. Our experiments on sentence fusion and sentiment transfer demonstrate that Masker performs competitively in a fully unsupervised setting. Moreover, in low-resource settings, it improves supervised methods' accuracy by over 10 percentage points when pre-training them on silver training data generated by…
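
The two steps described in the abstract (locating the span where the source and target models disagree most, then filling it with a variable number of tokens) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and not the authors' implementation: the per-token log-probabilities are taken as precomputed inputs rather than produced by real MLMs, the disagreement score is a plain difference of summed log-probabilities, the names `find_disagreement_span` and `replace_span` are hypothetical, and the padded MLM is assumed to predict a `[PAD]` symbol in the slots it does not need.

```python
from typing import List, Sequence, Tuple

PAD = "[PAD]"  # assumed symbol for an unused insertion slot


def find_disagreement_span(
    src_logp: Sequence[float],
    tgt_logp: Sequence[float],
    max_len: int = 3,
) -> Tuple[int, int]:
    """Return the half-open span [start, end) with the largest score.

    Score (an illustrative assumption): log-likelihood of the span under
    the source model minus its log-likelihood under the target model.
    A high score means the tokens look typical for the source domain
    and unlikely for the target domain.
    """
    if len(src_logp) != len(tgt_logp):
        raise ValueError("both models must score the same tokens")
    n = len(src_logp)
    best_span, best_score = (0, 0), float("-inf")
    for start in range(n):
        for end in range(start + 1, min(n, start + max_len) + 1):
            score = sum(src_logp[start:end]) - sum(tgt_logp[start:end])
            if score > best_score:
                best_span, best_score = (start, end), score
    return best_span


def replace_span(
    tokens: Sequence[str],
    span: Tuple[int, int],
    padded_prediction: Sequence[str],
) -> List[str]:
    """Swap the span for the model's prediction, dropping [PAD] slots.

    The prediction always has a fixed number of slots; removing the
    padding is what lets the inserted text vary in length.
    """
    start, end = span
    infill = [tok for tok in padded_prediction if tok != PAD]
    return list(tokens[:start]) + infill + list(tokens[end:])


if __name__ == "__main__":
    tokens = ["the", "food", "was", "terrible", "."]
    # Hypothetical per-token log-probabilities under each domain model.
    src_logp = [-1.0, -2.0, -1.0, -1.5, -0.5]
    tgt_logp = [-1.0, -1.8, -0.9, -6.0, -0.4]
    span = find_disagreement_span(src_logp, tgt_logp)
    # Hypothetical padded-MLM output for four insertion slots.
    print(replace_span(tokens, span, ["really", "great", PAD, PAD]))
```

In this toy input only "terrible" is much less likely under the target model, so it is the span selected for deletion, and the four-slot prediction collapses to a two-token replacement once the padding is removed.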
