An Empirical Investigation of Pre-Trained Transformer Language Models for Open-Domain Dialogue Generation

Piji Li

arXiv:2003.04195·cs.CL·March 10, 2020·21 cites

TL;DR

This paper empirically evaluates pre-trained Transformer language models for open-domain dialogue generation, analyzing their performance across multiple datasets, languages, and decoding strategies to improve relevance and diversity.

Contribution

It introduces a joint prediction paradigm for context and response, and provides comprehensive experimental analysis of various models and decoding methods in dialogue generation.
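The weighted joint prediction idea can be illustrated with a minimal sketch. It assumes the objective is the response negative log-likelihood plus a weighted context negative log-likelihood; the exact form used in the paper is not given on this page, and the names `weighted_joint_loss`, `context_len`, and `context_weight` are hypothetical.

```python
import math


def weighted_joint_loss(token_log_probs, context_len, context_weight):
    """Combine context and response losses for one concatenated sequence.

    token_log_probs: log-probability the model assigned to each target token
                     of the concatenated [context; response] sequence.
    context_len:     number of leading tokens that belong to the context.
    context_weight:  weight on the context term (assumed form, not confirmed
                     by the source); 0.0 trains on the response only.
    """
    if not 0 <= context_len <= len(token_log_probs):
        raise ValueError("context_len must lie within the sequence")
    # Negative log-likelihood of the context tokens.
    context_nll = -sum(token_log_probs[:context_len])
    # Negative log-likelihood of the response tokens.
    response_nll = -sum(token_log_probs[context_len:])
    return context_weight * context_nll + response_nll


# Toy sequence: 2 context tokens and 2 response tokens, each with p = 0.5.
log_probs = [math.log(0.5)] * 4
response_only = weighted_joint_loss(log_probs, context_len=2, context_weight=0.0)
joint = weighted_joint_loss(log_probs, context_len=2, context_weight=1.0)
```

Setting `context_weight` to zero corresponds to the "without the loss term for context prediction" setting, while a positive weight also asks the model to predict the context tokens.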

Findings

01 Transformer models outperform baselines in relevance and diversity

02 Joint prediction improves response quality

03 Decoding strategies significantly affect generated responses

Abstract

We present an empirical investigation of pre-trained Transformer-based auto-regressive language models for the task of open-domain dialogue generation. The training paradigm of pre-training followed by fine-tuning is employed for parameter learning. News and Wikipedia corpora in Chinese and English are collected for the pre-training stage. Dialogue context and response are concatenated into a single sequence that serves as the model input during the fine-tuning stage. A weighted joint prediction paradigm for both context and response is designed to evaluate the performance of models with or without the loss term for context prediction. Various decoding strategies, such as greedy search, beam search, and top-k sampling, are employed for response generation. Extensive experiments are conducted on the typical single-turn and multi-turn dialogue…
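As a rough illustration of two of the decoding strategies named in the abstract, the sketch below picks the next token from a list of logits by greedy search and by top-k sampling. It is a standalone toy using only the Python standard library, not the paper's implementation; `greedy_pick` and `top_k_sample` are hypothetical names.

```python
import math
import random


def greedy_pick(logits):
    """Greedy search step: return the index of the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


def top_k_sample(logits, k, rng):
    """Top-k sampling step: sample among the k highest-scoring tokens.

    The k candidates are renormalised with a softmax over their logits,
    then one index is drawn from that restricted distribution.
    """
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and the vocabulary size")
    # Indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - peak) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]


# Toy vocabulary of four tokens.
logits = [0.1, 2.0, -1.0, 1.5]
rng = random.Random(0)
greedy_token = greedy_pick(logits)
sampled_tokens = [top_k_sample(logits, k=2, rng=rng) for _ in range(20)]
```

Greedy search always returns the same token for the same logits, whereas top-k sampling with `k=2` can return either of the two best candidates, which is the usual source of extra diversity in sampled responses.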

Code & Models

Repositories

lipiji/Guyu
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems