Pretrained Language Models for Dialogue Generation with Multiple Input Sources
Yu Cao, Wei Bi, Meng Fang, Dacheng Tao

TL;DR
This paper investigates methods to adapt pretrained GPT-2 models for dialogue generation involving multiple input sources, focusing on effective fusion techniques to improve response relevance.
Contribution
It introduces novel fusion methods for multiple input sources in pretrained language models, enhancing dialogue response relevance over simple concatenation or averaging.
Findings
Proper fusion methods outperform simple baselines
Fusion improves relevance with dialogue history
Exploration of multiple attention fusion techniques
Abstract
Large-scale pretrained language models have achieved outstanding performance on natural language understanding tasks. However, how to apply them to dialogue generation tasks is still under investigation, especially for tasks where responses are conditioned on multiple sources. Previous work simply concatenates all input sources or averages information from different input sources. In this work, we study dialogue models with multiple input sources adapted from the pretrained language model GPT2. We explore various methods to fuse the separate attention information corresponding to different sources. Our experimental results show that proper fusion methods deliver higher relevance with dialogue history than simple fusion baselines.
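To make the contrast between averaging and other fusion choices concrete, here is a minimal sketch in plain Python. It assumes each input source has already produced one attention output vector of the same size. The function names, the toy vectors, and the weighted variant are illustrative assumptions, not the paper's actual implementation or its exact fusion methods.

```python
from typing import List, Sequence

Vector = List[float]


def mean_fusion(sources: Sequence[Vector]) -> Vector:
    """Simple baseline: average the per-source attention outputs element-wise."""
    n = len(sources)
    return [sum(vals) / n for vals in zip(*sources)]


def weighted_fusion(sources: Sequence[Vector], weights: Sequence[float]) -> Vector:
    """Hypothetical alternative: one scalar weight per source, normalized to sum to 1."""
    if len(sources) != len(weights):
        raise ValueError("need exactly one weight per source")
    total = sum(weights)
    norm = [w / total for w in weights]
    return [sum(w * v for w, v in zip(norm, vals)) for vals in zip(*sources)]


# Toy attention outputs (assumed values): one for the dialogue history,
# one for an additional input source.
history_attn = [1.0, 0.0, 2.0]
extra_source_attn = [3.0, 4.0, 0.0]

mean_out = mean_fusion([history_attn, extra_source_attn])
# Weights 3:1 favor the dialogue history over the extra source.
weighted_out = weighted_fusion([history_attn, extra_source_attn], [3.0, 1.0])
print(mean_out)
print(weighted_out)
```

With equal weights, `weighted_fusion` reduces to `mean_fusion`; unequal weights let one source, such as the dialogue history, dominate the fused representation. In a trained model such weights would typically be learned rather than fixed by hand.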
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
