Multi-View Feature Representation for Dialogue Generation with Bidirectional Distillation

Shaoxiong Feng; Xuancheng Ren; Kan Li; Xu Sun

arXiv:2102.10780·cs.CL·February 23, 2021

TL;DR

This paper introduces a novel bidirectional distillation framework for dialogue generation that promotes the learning of common, general knowledge among multiple students, improving response quality and model generalization.

Contribution

It proposes a multi-view feature representation method with bidirectional distillation, enabling students to learn shared knowledge and enhance generalization in dialogue models.

Findings

01. Improved response quality in dialogue generation tasks.

02. Enhanced model generalization without increased training cost.

03. Effective knowledge sharing among students through bidirectional distillation.

Abstract

Neural dialogue models suffer from low-quality responses when interacted with in practice, demonstrating difficulty in generalizing beyond training data. Recently, knowledge distillation has been used to successfully regularize the student by transferring knowledge from the teacher. However, the teacher and the student are trained on the same dataset and tend to learn similar feature representations, whereas the most general knowledge should be found through differences. The discovery of general knowledge is further hindered by unidirectional distillation, as the student must obey the teacher and may discard some knowledge that is truly general but refuted by the teacher. To this end, we propose a novel training framework, where the learning of general knowledge is more in line with the idea of reaching consensus, i.e., finding common knowledge that is beneficial to different yet all…
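The contrast the abstract draws between unidirectional and bidirectional distillation can be illustrated with a small sketch. The code below is not the paper's method: it is a generic mutual-distillation loss for two students, assuming each student combines a cross-entropy term on the gold token with a KL term that pulls it toward the other student's softened distribution. The function names and the `alpha` and `temperature` parameters are hypothetical choices made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def cross_entropy(probs, target_index, eps=1e-12):
    """Negative log-likelihood of the gold token."""
    return -math.log(probs[target_index] + eps)

def bidirectional_distillation_losses(logits_a, logits_b, target_index,
                                      alpha=0.5, temperature=2.0):
    """Return (loss_a, loss_b) for two students that distill from each other.

    Assumption (not from the paper): each student treats the other's
    softened distribution as a fixed target for a KL penalty.
    """
    hard_a = cross_entropy(softmax(logits_a), target_index)
    hard_b = cross_entropy(softmax(logits_b), target_index)
    soft_a = softmax(logits_a, temperature)
    soft_b = softmax(logits_b, temperature)
    # Knowledge flows in both directions: A learns from B and B learns from A.
    loss_a = hard_a + alpha * kl_divergence(soft_b, soft_a)
    loss_b = hard_b + alpha * kl_divergence(soft_a, soft_b)
    return loss_a, loss_b

# Hypothetical logits over a three-token vocabulary; the gold token is index 0.
loss_a, loss_b = bidirectional_distillation_losses(
    [2.0, 0.5, -1.0], [1.0, 1.5, -0.5], target_index=0)
```

In a unidirectional setup only the student's loss would carry the KL term. Here both losses do, so neither model acts as a fixed authority, and the penalty vanishes only when the two students agree.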

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Speech and dialogue systems

Methods: Knowledge Distillation