Prior Attention for Style-aware Sequence-to-Sequence Models

Lucas Sterckx; Johannes Deleu; Chris Develder; Thomas Demeester

arXiv:1806.09439·cs.CL·June 26, 2018

Open Access

TL;DR

This paper introduces a method to control the style and attributes of sequence-to-sequence model outputs by generating attention matrices from a latent space, enabling targeted output characteristics like length and lexical simplification.

Contribution

It presents a novel approach combining variational auto-encoders with attention mechanisms to steer sequence generation towards desired attributes.

Findings

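01. On sentence simplification, the latent code vector gives control over output length and the degree of lexical simplification.

02. Sampling the code vector from specific regions of the latent space steers the output towards chosen attributes.

03. The same control allows the model to be tuned towards different evaluation metrics.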

Abstract

We extend sequence-to-sequence models with the possibility to control the characteristics or style of the generated output, via attention that is generated a priori (before decoding) from a latent code vector. After training an initial attention-based sequence-to-sequence model, we use a variational auto-encoder conditioned on representations of input sequences and a latent code vector space to generate attention matrices. By sampling the code vector from specific regions of this latent space during decoding and imposing prior attention generated from it in the seq2seq model, output can be steered towards having certain attributes. This is demonstrated for the task of sentence simplification, where the latent code vector allows control over output length and lexical simplification, and enables fine-tuning to optimize for different evaluation metrics.
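
The core idea in the abstract, attention generated before decoding from a latent code vector and then imposed on the seq2seq model, can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the function names (`prior_attention`, `context_vectors`), the projection matrices `W_z` and `W_h`, the random weights, and all dimensions are assumptions chosen for illustration, and the variational training procedure is omitted entirely.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def prior_attention(z, enc_states, W_z, W_h, n_out):
    """Hypothetical attention generator (an assumption, not the paper's model).

    z          : latent code vector, shape (k,)
    enc_states : encoder representations of the input, shape (n_in, h)
    W_z        : projection of z to n_out query vectors, shape (k, n_out * d)
    W_h        : projection of encoder states to keys, shape (h, d)
    Returns an (n_out, n_in) attention matrix whose rows each sum to 1.
    """
    queries = np.tanh(z @ W_z).reshape(n_out, -1)  # (n_out, d)
    keys = enc_states @ W_h                        # (n_in, d)
    return softmax(queries @ keys.T, axis=-1)      # (n_out, n_in)


def context_vectors(attn, enc_states):
    """Impose the pre-computed attention: one context vector per output step."""
    return attn @ enc_states                       # (n_out, h)


# Toy dimensions and random weights, purely illustrative.
rng = np.random.default_rng(0)
n_in, n_out, h, d, k = 6, 4, 8, 5, 2
enc = rng.normal(size=(n_in, h))
W_z = rng.normal(size=(k, n_out * d))
W_h = rng.normal(size=(h, d))

# Two code vectors taken from different regions of the latent space.
attn_a = prior_attention(np.array([2.0, 0.0]), enc, W_z, W_h, n_out)
attn_b = prior_attention(np.array([-2.0, 0.0]), enc, W_z, W_h, n_out)
ctx_a = context_vectors(attn_a, enc)
```

In this sketch, changing `z` changes the attention matrix while the encoder states stay fixed, which is the mechanism the abstract relies on to steer the output. A decoder would consume `ctx_a` row by row instead of computing attention on the fly.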

Taxonomy

Topics: Text Readability and Simplification · Topic Modeling · Natural Language Processing Techniques

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence