Towards Efficient Dialogue Pre-training with Transferable and Interpretable Latent Structure

Xueliang Zhao, Lemao Liu, Tingchen Fu, Shuming Shi, Dongyan Zhao, and Rui Yan

arXiv:2210.12461 · cs.CL · October 25, 2022

Open Access

TL;DR

This paper introduces a lightweight, interpretable dialogue generation model whose latent structure transfers from general-domain data to downstream tasks. On two benchmarks it produces better responses and runs faster than larger baseline models.

Contribution

It presents a novel, efficient, and interpretable latent structure for dialogue generation that transfers well across domains, unlike existing large, opaque models.

Findings

01. Outperforms four strong baselines in both automatic and human evaluations.

02. Achieves a 5x inference speedup with only about 22% of the parameters of the strongest baseline.

03. Provides explainability through interpretation of its discrete latent variables.

Abstract

With the availability of massive general-domain dialogue data, pre-trained dialogue generation appears to be super appealing to transfer knowledge from the general domain to downstream applications. In most existing work, such transferable ability is mainly obtained by fitting a large model with hundreds of millions of parameters on massive data in an exhaustive way, leading to inefficient running and poor interpretability. This paper proposes a novel dialogue generation model with a latent structure that is easily transferable from the general domain to downstream tasks in a lightweight and transparent way. Experiments on two benchmarks validate the effectiveness of the proposed model. Thanks to the transferable latent structure, our model is able to yield better dialogue responses than four strong baselines in terms of both automatic and human evaluations, and our model with about 22%…

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Multimodal Machine Learning Applications