A Static and Dynamic Attention Framework for Multi Turn Dialogue Generation

Wei-Nan Zhang; Yiming Cui; Kaiyan Zhang; Yifa Wang; Qingfu Zhu; Lingzhi Li; Ting Liu

arXiv:2410.20766·cs.CL·October 29, 2024

TL;DR

This paper introduces a static and dynamic attention framework for multi-turn dialogue generation, effectively modeling dialogue history to improve response quality in open domain systems.

Contribution

It proposes a novel static and dynamic attention-based approach to better encode dialogue history, addressing the vanishing gradient problem that RNN-based models face in multi-turn dialogue modeling.
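
To make the static/dynamic distinction concrete, here is a minimal sketch of attention over utterance vectors. The summary above does not specify the scoring function or how the two attentions are combined, so everything in it is an assumption for illustration: dot-product scoring, the use of the last utterance as the static query, and the hypothetical names `attend`, `history`, and `decoder_states`. It is not the authors' implementation.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(utterance_vecs, query):
    """Return (context, weights): a weighted sum of utterance vectors.

    Assumption: dot-product scoring; the paper's scoring function may differ.
    """
    weights = softmax([dot(query, u) for u in utterance_vecs])
    dim = len(utterance_vecs[0])
    context = [sum(w * u[i] for w, u in zip(weights, utterance_vecs))
               for i in range(dim)]
    return context, weights

# Toy utterance-level representations of a three-turn dialogue history.
history = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Static attention (assumed form): weights are computed once, here with the
# last utterance as the query, and the same context is reused at every step.
static_ctx, static_w = attend(history, history[-1])

# Dynamic attention (assumed form): weights are recomputed at each decoding
# step from the current decoder state, so the context changes per step.
decoder_states = [[2.0, 0.0], [0.0, 2.0]]
dynamic_ctxs = [attend(history, s)[0] for s in decoder_states]
```

In this toy setup the static context stays fixed throughout decoding, while the two dynamic contexts differ because each decoder state weights the history differently.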

Findings

01

Outperforms previous models on the Ubuntu and OpenSubtitles datasets.

02

Improves automatic and human evaluation metrics.

03

Combining static and dynamic attention enhances response quality.

Abstract

Recently, research on open domain dialogue systems has attracted extensive interest from academic and industrial researchers. The goal of an open domain dialogue system is to imitate humans in conversation. Previous work on single turn conversation generation has greatly advanced research on open domain dialogue systems. However, understanding multiple single turn conversations is not the same as understanding a multi turn dialogue, because human dialogue is coherent and context dependent. Therefore, in open domain multi turn dialogue generation, it is essential to model the contextual semantics of the dialogue history rather than relying only on the last utterance. Previous research has verified the effectiveness of the hierarchical recurrent encoder-decoder framework for open domain multi turn dialogue generation. However, using RNN-based model to…
