Prior Attention for Style-aware Sequence-to-Sequence Models
Lucas Sterckx, Johannes Deleu, Chris Develder, Thomas Demeester

TL;DR
This paper introduces a method to control the style and attributes of sequence-to-sequence model outputs by generating attention matrices from a latent space, enabling targeted output characteristics like length and lexical simplification.
Contribution
It presents a novel approach combining variational auto-encoders with attention mechanisms to steer sequence generation towards desired attributes.
Findings
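Control over output length and lexical simplification is demonstrated on the task of sentence simplification.
Sampling the code vector from specific regions of the latent space steers the output towards specific attributes.
The latent code vector also enables fine-tuning to optimize for different evaluation metrics.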
Abstract
We extend sequence-to-sequence models with the possibility to control the characteristics or style of the generated output, via attention that is generated a priori (before decoding) from a latent code vector. After training an initial attention-based sequence-to-sequence model, we use a variational auto-encoder conditioned on representations of input sequences and a latent code vector space to generate attention matrices. By sampling the code vector from specific regions of this latent space during decoding and imposing prior attention generated from it in the seq2seq model, output can be steered towards having certain attributes. This is demonstrated for the task of sentence simplification, where the latent code vector allows control over output length and lexical simplification, and enables fine-tuning to optimize for different evaluation metrics.
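To make the idea of "prior attention" more concrete, here is a minimal NumPy sketch of generating an attention matrix before decoding from input representations and a latent code vector. It is not the authors' architecture: the function `prior_attention`, the parameter names `W_h`, `W_z`, `V`, the tanh scoring form, and all dimensions are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def prior_attention(enc_states, z, params, out_len):
    """Hypothetical generator: maps input states and a latent code z
    to an (out_len x in_len) attention matrix, before any decoding."""
    W_h, W_z, V = params["W_h"], params["W_z"], params["V"]
    # Mix every input position with the same latent code: (in_len, hid).
    hidden = np.tanh(enc_states @ W_h + z @ W_z)
    # One learned query row per output step: (out_len, in_len).
    scores = V[:out_len] @ hidden.T
    # Each output step gets a distribution over input positions.
    return softmax(scores, axis=-1)

# Illustrative sizes only (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
in_len, enc_dim, code_dim, hid, max_out = 6, 8, 2, 16, 10
params = {
    "W_h": rng.normal(size=(enc_dim, hid)),
    "W_z": rng.normal(size=(code_dim, hid)),
    "V": rng.normal(size=(max_out, hid)),
}

enc_states = rng.normal(size=(in_len, enc_dim))  # stand-in for encoder outputs
z = rng.normal(size=code_dim)                    # code sampled from a standard normal prior

attn = prior_attention(enc_states, z, params, out_len=4)
# Imposing the prior attention: context vectors a decoder would consume.
context = attn @ enc_states
```

In this sketch, changing `z` changes the attention matrix and therefore the context vectors fed to the decoder, which is the mechanism by which sampling from different regions of the latent space could steer output attributes.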
Taxonomy
Topics: Text Readability and Simplification · Topic Modeling · Natural Language Processing Techniques
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence
