Bag-of-Vectors Autoencoders for Unsupervised Conditional Text Generation
Florian Mai, James Henderson

TL;DR
This paper introduces Bag-of-Vectors Autoencoders (BoV-AEs) for unsupervised conditional text generation, enabling longer text encoding and improved attribute transfer by extending existing embedding mapping methods.
Contribution
It extends Emb2Emb from single-vector autoencoders to variable-size bag-of-vectors autoencoders, which can encode longer texts and support more effective attribute manipulation in the latent space.
Findings
Outperforms standard autoencoders in sentiment transfer tasks
Enables encoding of longer texts than traditional autoencoders
Improves the quality of unsupervised attribute transfer
Abstract
Text autoencoders are often used for unsupervised conditional text generation by applying mappings in the latent space to change attributes to the desired values. Recently, Mai et al. (2020) proposed Emb2Emb, a method to learn these mappings in the embedding space of an autoencoder. However, their method is restricted to autoencoders with a single-vector embedding, which limits how much information can be retained. We address this issue by extending their method to Bag-of-Vectors Autoencoders (BoV-AEs), which encode the text into a variable-size bag of vectors that grows with the size of the text, as in attention-based models. This makes it possible to encode and reconstruct much longer texts than standard autoencoders can. Analogous to conventional autoencoders, we propose regularization techniques that facilitate learning meaningful operations in the latent space. Finally, we adapt Emb2Emb for a…
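The contrast the abstract draws between a single-vector embedding and a variable-size bag of vectors can be illustrated with a toy sketch. This is not the authors' model: the names `embed_token`, `encode_single_vector`, `encode_bag_of_vectors`, and the dimension `DIM` are hypothetical, and the "encoder" is a stand-in that assigns each token a pseudo-random vector. It only shows how the size of each representation relates to text length.

```python
import random

DIM = 4  # hypothetical embedding size, chosen only for illustration


def embed_token(token, dim=DIM):
    # Toy deterministic embedding: seed a PRNG with the token string.
    # A real encoder would be a trained neural network.
    rng = random.Random(token)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def encode_single_vector(tokens):
    # Fixed-size representation: mean-pool all token vectors into one.
    # The output has DIM numbers regardless of how long the text is.
    vectors = [embed_token(t) for t in tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def encode_bag_of_vectors(tokens):
    # Variable-size representation: one vector per token, so the
    # bag grows with the length of the text.
    return [embed_token(t) for t in tokens]


short_text = "the food was great".split()
long_text = ("the food was great but the service was slow "
             "and the room was far too loud").split()

for tokens in (short_text, long_text):
    single = encode_single_vector(tokens)
    bag = encode_bag_of_vectors(tokens)
    print(len(tokens), "tokens ->", len(single), "numbers (single vector),",
          len(bag), "vectors (bag)")
```

In this sketch the single-vector encoding stays at `DIM` numbers for both inputs, while the bag holds one vector per token, which is the intuition behind the claim that a bag of vectors can retain more information about longer texts.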
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Natural Language Processing Techniques
