A Hierarchical Reinforced Sequence Operation Method for Unsupervised Text Style Transfer
Chen Wu, Xuancheng Ren, Fuli Luo, Xu Sun

TL;DR
This paper introduces Point-Then-Operate (PTO), a hierarchical reinforcement learning approach to unsupervised text style transfer that aims to improve interpretability, content preservation, and control over the style-content trade-off.
Contribution
The paper proposes a hierarchical reinforced sequence operation method in which a high-level agent proposes operation positions and a low-level agent alters the sentence.
Findings
Outperforms recent methods on two text style transfer datasets
Effectively balances content preservation and style control
Enables multi-step revision with a single-step trained model
Abstract
Unsupervised text style transfer aims to alter text styles while preserving the content, without aligned data for supervision. Existing seq2seq methods face three challenges: 1) the transfer is weakly interpretable, 2) generated outputs struggle in content preservation, and 3) the trade-off between content and style is intractable. To address these challenges, we propose a hierarchical reinforced sequence operation method, named Point-Then-Operate (PTO), which consists of a high-level agent that proposes operation positions and a low-level agent that alters the sentence. We provide comprehensive training objectives to control the fluency, style, and content of the outputs and a mask-based inference algorithm that allows for multi-step revision based on the single-step trained agents. Experimental results on two text style transfer datasets show that our method significantly outperforms…
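
The point-then-operate loop described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the two agents are replaced by hand-written rules, the only operation is word replacement, the lexicons `STYLE_WORDS` and `REPLACEMENTS` are hypothetical, and the mask is modeled as a set of already-visited positions. In the paper both agents are learned with reinforcement learning.

```python
from typing import List, Optional, Set

# Hypothetical toy lexicons standing in for learned style knowledge.
STYLE_WORDS = {"terrible", "bad", "awful"}
REPLACEMENTS = {"terrible": "great", "bad": "good"}


def propose_position(tokens: List[str], masked: Set[int]) -> Optional[int]:
    """Stand-in for the high-level agent: point at the first unmasked style word."""
    for i, tok in enumerate(tokens):
        if i not in masked and tok in STYLE_WORDS:
            return i
    return None  # nothing left to point at


def operate(tokens: List[str], pos: int) -> List[str]:
    """Stand-in for the low-level agent: replace the word at pos if a
    replacement is known, otherwise leave the sentence unchanged."""
    out = list(tokens)
    out[pos] = REPLACEMENTS.get(out[pos], out[pos])
    return out


def revise(tokens: List[str], max_steps: int = 5) -> List[str]:
    """Multi-step revision built from single-step agents (assumed scheme).

    Each visited position is masked so the high-level stand-in cannot
    propose it again, which lets the loop move on to other positions.
    """
    masked: Set[int] = set()
    current = list(tokens)
    for _ in range(max_steps):
        pos = propose_position(current, masked)
        if pos is None:
            break
        current = operate(current, pos)
        masked.add(pos)
    return current


if __name__ == "__main__":
    print(" ".join(revise("the food was awful and the service was bad".split())))
```

In this toy setup the mask matters for words such as "awful" that the low-level stand-in cannot rewrite: without it, the same position would be proposed at every step and later positions would never be reached.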
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence
