Towards Efficient Dialogue Pre-training with Transferable and Interpretable Latent Structure
Xueliang Zhao, Lemao Liu, Tingchen Fu, Shuming Shi, Dongyan Zhao, and Rui Yan

TL;DR
This paper introduces a lightweight, interpretable dialogue generation model built on a transferable latent structure. On two benchmarks it produces better responses and runs inference faster than larger baseline models.
Contribution
It presents a novel, efficient, and interpretable latent structure for dialogue generation that transfers well across domains, unlike existing large, opaque models.
Findings
The model outperforms four strong baselines in both automatic and human evaluations.
It achieves a 5x speedup with only about 22% of the parameters of the strongest baseline.
Its discrete latent variables can be interpreted, which provides explainability.
Abstract
With the availability of massive general-domain dialogue data, pre-trained dialogue generation appears to be super appealing to transfer knowledge from the general domain to downstream applications. In most existing work, such transferable ability is mainly obtained by fitting a large model with hundreds of millions of parameters on massive data in an exhaustive way, leading to inefficient running and poor interpretability. This paper proposes a novel dialogue generation model with a latent structure that is easily transferable from the general domain to downstream tasks in a lightweight and transparent way. Experiments on two benchmarks validate the effectiveness of the proposed model. Thanks to the transferable latent structure, our model is able to yield better dialogue responses than four strong baselines in terms of both automatic and human evaluations, and our model with about 22%…
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Multimodal Machine Learning Applications
