Focus-Constrained Attention Mechanism for CVAE-based Response Generation
Zhi Cui, Yanran Li, Jiayi Zhang, Jianwei Cui, Chen Wei, Bin Wang

TL;DR
This paper introduces a focus-constrained attention mechanism for CVAE-based response generation, transforming coarse discourse information into fine-grained signals to improve response diversity and informativeness.
Contribution
The paper proposes a novel focus-constrained attention mechanism that uses fine-grained, word-level focus signals to enhance response quality in CVAE-based models.
Findings
Generated responses are more diverse and informative.
The model outperforms several state-of-the-art baselines.
Fine-grained focus signals improve alignment and response quality.
Abstract
To model diverse responses for a given post, one promising approach is to introduce a latent variable into Seq2Seq models. The latent variable is supposed to capture discourse-level information and encourage informative target responses. However, such discourse-level information is often too coarse for the decoder to utilize. To address this, our idea is to transform the coarse-grained discourse-level information into fine-grained word-level information. Specifically, we first measure the semantic concentration of the corresponding target response on the post words by introducing a fine-grained focus signal. Then, we propose a focus-constrained attention mechanism that takes full advantage of the focus to align the input well with the target response. The experimental results demonstrate that by exploiting the fine-grained signal, our model can generate more diverse and informative…
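As a rough illustration of the idea in the abstract, the Python sketch below computes a per-word focus signal and a penalty that pulls attention toward it. The abstract does not give the formulas, so everything here is an assumption: the function names (`focus_signal`, `attention`, `focus_constraint_penalty`), the dot-product scoring, the sigmoid gate, and the squared-error penalty are hypothetical choices for illustration, not the authors' implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    # Subtract the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def normalize(values):
    total = sum(values)
    return [v / total for v in values]


def focus_signal(post_states, z):
    """One focus value in (0, 1) per post word.

    Assumption: focus is a sigmoid of the dot product between each
    post-word state and the latent variable z.
    """
    return [sigmoid(dot(h, z)) for h in post_states]


def attention(post_states, decoder_state):
    """Plain dot-product attention over post words for one decoding step."""
    return softmax([dot(h, decoder_state) for h in post_states])


def focus_constraint_penalty(attn_per_step, focus):
    """Squared gap between mean attention per post word and normalized focus.

    Assumption: the constraint is a squared-error term; adding it to the
    training loss would push attention toward the focused words.
    """
    n_steps = len(attn_per_step)
    mean_attn = [
        sum(step[i] for step in attn_per_step) / n_steps
        for i in range(len(focus))
    ]
    target = normalize(focus)
    return sum((a - t) ** 2 for a, t in zip(mean_attn, target))


# Toy example: three post words with 2-d states, a 2-d latent variable,
# and two decoding steps. All values are made up.
post_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
z = [2.0, -1.0]
focus = focus_signal(post_states, z)
steps = [attention(post_states, s) for s in ([1.0, 0.0], [0.0, 1.0])]
penalty = focus_constraint_penalty(steps, focus)
```

In this toy setup the penalty is zero only when the attention averaged over decoding steps matches the normalized focus, which is one simple way to read "focus-constrained"; the paper itself may define the constraint differently.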
Taxonomy
Topics: Topic Modeling · Software Engineering Research · Natural Language Processing Techniques
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence
