Pretrained Language Models for Dialogue Generation with Multiple Input Sources

Yu Cao; Wei Bi; Meng Fang; Dacheng Tao

arXiv:2010.07576 · cs.CL · October 16, 2020 · 1 citation

TL;DR

This paper investigates methods to adapt pretrained GPT-2 models for dialogue generation involving multiple input sources, focusing on effective fusion techniques to improve response relevance.

Contribution

The paper introduces methods for fusing the attention over multiple input sources in a pretrained language model, improving the relevance of responses to the dialogue history compared with simple concatenation or averaging.

Findings

01 Proper fusion methods outperform simple baselines

02 Fusion improves relevance with dialogue history

03 Exploration of multiple attention fusion techniques

Abstract

Large-scale pretrained language models have achieved outstanding performance on natural language understanding tasks. However, how to apply them to dialogue generation tasks, especially those whose responses are conditioned on multiple sources, is still under investigation. Previous work simply concatenates all input sources or averages information from different input sources. In this work, we study dialogue models with multiple input sources adapted from the pretrained language model GPT2. We explore various methods for fusing the separate attention information corresponding to each source. Our experimental results show that proper fusion methods deliver higher relevance to the dialogue history than simple fusion baselines.
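
To make the contrast between averaging and a learned fusion concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy vectors, and the choice of a softmax-weighted sum as the "non-simple" fusion are all assumptions made for illustration. Each source is assumed to have already produced an attention output vector of the same dimension for the current decoding position.

```python
import math

def fuse_mean(sources):
    """Element-wise average of the per-source attention outputs (simple baseline)."""
    n = len(sources)
    return [sum(vals) / n for vals in zip(*sources)]

def fuse_max(sources):
    """Element-wise maximum over the per-source attention outputs."""
    return [max(vals) for vals in zip(*sources)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_weighted(sources, logits):
    """Weighted sum with one scalar weight per source.

    `logits` stands in for parameters that would be learned during training
    (hypothetical here); softmax turns them into weights that sum to 1.
    """
    weights = softmax(logits)
    dim = len(sources[0])
    return [sum(w * src[i] for w, src in zip(weights, sources)) for i in range(dim)]

# Toy attention outputs for three hypothetical sources (values are made up).
extra_source = [0.2, 0.4, 0.0]   # e.g. side information given to the model
history = [1.0, 0.0, 0.6]        # dialogue history
prefix = [0.0, 0.2, 0.3]         # tokens generated so far
sources = [extra_source, history, prefix]

mean_out = fuse_mean(sources)
max_out = fuse_max(sources)
# Larger logit for the history source, so it dominates the fused vector.
weighted_out = fuse_weighted(sources, logits=[0.0, 2.0, 0.0])
```

With equal logits, `fuse_weighted` reduces to `fuse_mean`, so averaging is a special case of the weighted form. Unequal weights let the fused vector lean toward one source, such as the dialogue history.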

Code & Models

Repositories

caoyu-noob/Multi-GPT2 · PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems