DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation
Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, Bill Dolan

TL;DR
This paper introduces DialoGPT, a large-scale pre-trained transformer model for conversational response generation. Trained on 147M conversation-like exchanges from Reddit comment chains, it approaches human performance in single-turn dialogue and produces more relevant and context-consistent responses than strong baseline systems.
Contribution
The paper presents DialoGPT, a large-scale pre-trained model designed specifically for open-domain dialogue, and publicly releases both the pre-trained model and the training pipeline.
Findings
DialoGPT achieves near-human performance in single-turn dialogue, in both automatic and human evaluation.
Systems built on DialoGPT generate more relevant, contentful, and context-consistent responses.
The model outperforms strong baseline systems in automatic and human evaluations.
Abstract
We present a large, tunable neural conversational response generation model, DialoGPT (dialogue generative pre-trained transformer). Trained on 147M conversation-like exchanges extracted from Reddit comment chains over a period spanning from 2005 through 2017, DialoGPT extends the Hugging Face PyTorch transformer to attain a performance close to human both in terms of automatic and human evaluation in single-turn dialogue settings. We show that conversational systems that leverage DialoGPT generate more relevant, contentful and context-consistent responses than strong baseline systems. The pre-trained model and training pipeline are publicly released to facilitate research into neural response generation and the development of more intelligent open-domain dialogue systems.
Code & Models
- 🤗 Leostronkest/DialoGPT (model) · 4 downloads
- 🤗 s-nlp/roberta_toxicity_classifier_v1 (model) · 666 downloads
- 🤗 s-nlp/t5-paranmt-detox (model) · 68 downloads · ♡ 5
- 🤗 s-nlp/t5-paraphrase-paws-msrp-opinosis-paranmt (model) · 627 downloads
- 🤗 microsoft/DialoGPT-large (model) · 3.9k downloads · ♡ 288
- 🤗 microsoft/DialoGPT-medium (model) · 331k downloads · ♡ 434
- 🤗 microsoft/DialoGPT-small (model) · 57k downloads · ♡ 144
- 🤗 ncoop57/DiGPTame-medium (model) · 10 downloads · ♡ 2
- 🤗 shalpin87/dialoGPT-homer-simpson (model) · 3 downloads
- 🤗 ScyKindness/Hatsune_Miku (model) · 9 downloads · ♡ 1
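As a rough illustration of how one of the released checkpoints might be used for single-turn response generation, the sketch below loads a model through the Hugging Face `transformers` library. It rests on a few assumptions that are not stated on this page: that `torch` and `transformers` are installed, that dialogue turns are joined with the GPT-2 end-of-text token (`<|endoftext|>`) before being fed to the model, and that greedy decoding with 40 new tokens is an acceptable default. The helper names `format_dialogue` and `generate_reply` are hypothetical and exist only for this example.

```python
from typing import List

# Assumption: turns are separated by the GPT-2 end-of-text token.
EOS = "<|endoftext|>"


def format_dialogue(turns: List[str], eos_token: str = EOS) -> str:
    """Join dialogue turns into one string, ending every turn with eos_token."""
    return "".join(turn.strip() + eos_token for turn in turns)


def generate_reply(
    turns: List[str],
    model_name: str = "microsoft/DialoGPT-medium",
    max_new_tokens: int = 40,
) -> str:
    """Generate one response to the dialogue history (downloads the checkpoint)."""
    # Imported lazily so format_dialogue works without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    prompt = format_dialogue(turns, tokenizer.eos_token)
    input_ids = tokenizer.encode(prompt, return_tensors="pt")

    with torch.no_grad():
        output_ids = model.generate(
            input_ids,
            max_new_tokens=max_new_tokens,
            # GPT-2 style tokenizers define no pad token, so reuse EOS.
            pad_token_id=tokenizer.eos_token_id,
        )

    # Drop the prompt tokens and keep only the newly generated reply.
    reply_ids = output_ids[0, input_ids.shape[-1]:]
    return tokenizer.decode(reply_ids, skip_special_tokens=True)


if __name__ == "__main__":
    print(generate_reply(["Does money buy happiness?"]))
```

For multi-turn use, the earlier turns would be passed in order, for example `generate_reply(["Hi", "Hello! How are you?", "Fine, thanks."])`. Swapping `model_name` for the small or large checkpoint trades response quality against memory and latency.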
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax
