TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents
Thomas Wolf, Victor Sanh, Julien Chaumond, Clement Delangue

TL;DR
This paper presents TransferTransfo, a transfer learning approach using Transformer models and multi-task fine-tuning that significantly improves neural conversational agents' performance on dialogue benchmarks.
Contribution
The paper introduces a transfer learning method with multi-task fine-tuning for Transformer-based dialogue systems and reports state-of-the-art results on PERSONA-CHAT.
Findings
Achieved a perplexity of 16.28 on PERSONA-CHAT
Attained a Hits@1 accuracy of 80.7%
Reached an F1 score of 19.5; all three metrics improve on previous models
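For readers unfamiliar with the three metrics listed above, the following is a minimal sketch of how they are commonly computed for a single example. It is not the benchmark's official scoring code: the function names are hypothetical, and whitespace tokenization with lowercasing is an assumption, since this page does not describe the normalization the benchmark applies. Reported figures such as 80.7 and 19.5 would correspond to averages over a test set, scaled to percentages.

```python
import math
from collections import Counter


def perplexity(token_log_probs):
    """Exponential of the mean negative log-likelihood per token (natural log)."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def hits_at_1(candidate_scores, gold_index):
    """1.0 if the highest-scored candidate reply is the gold reply, else 0.0."""
    best = max(range(len(candidate_scores)), key=candidate_scores.__getitem__)
    return 1.0 if best == gold_index else 0.0


def unigram_f1(predicted, reference):
    """Harmonic mean of word-level precision and recall.

    Assumption: tokens are lowercased and split on whitespace.
    """
    pred_tokens = predicted.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection counts each shared word at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Lower perplexity is better, while higher Hits@1 and F1 are better, which is why an improvement means a decrease for the first metric and an increase for the other two.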
Abstract
We introduce a new approach to generative data-driven dialogue systems (e.g. chatbots) called TransferTransfo which is a combination of a Transfer learning based training scheme and a high-capacity Transformer model. Fine-tuning is performed by using a multi-task objective which combines several unsupervised prediction tasks. The resulting fine-tuned model shows strong improvements over the current state-of-the-art end-to-end conversational models like memory augmented seq2seq and information-retrieval models. On the privately held PERSONA-CHAT dataset of the Conversational Intelligence Challenge 2, this approach obtains a new state-of-the-art, with respective perplexity, Hits@1 and F1 metrics of 16.28 (45 % absolute improvement), 80.7 (46 % absolute improvement) and 19.5 (20 % absolute improvement).
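The multi-task objective mentioned in the abstract can be pictured as a weighted sum of per-task losses. The sketch below assumes, purely for illustration, two tasks: next-token language modeling and picking the correct next utterance among distractors. The abstract does not name the tasks or their weights, so the task choice, the function names, and the `lm_weight` and `cls_weight` parameters are all assumptions rather than the authors' implementation.

```python
import math


def cross_entropy(logits, target):
    """Negative log-softmax of the target index, computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def multitask_loss(lm_logits, lm_targets, cls_logits, cls_target,
                   lm_weight=1.0, cls_weight=1.0):
    """Weighted sum of two losses (hypothetical task pairing and weights).

    lm_logits:  one list of vocabulary logits per token position
    lm_targets: index of the correct next token at each position
    cls_logits: one score per candidate next utterance
    cls_target: index of the correct candidate
    """
    # Language modeling loss: mean cross-entropy over token positions.
    lm_loss = sum(
        cross_entropy(logits, target)
        for logits, target in zip(lm_logits, lm_targets)
    ) / len(lm_targets)
    # Classification loss: cross-entropy over candidate utterances.
    cls_loss = cross_entropy(cls_logits, cls_target)
    return lm_weight * lm_loss + cls_weight * cls_loss
```

In a real training loop both losses would come from heads sharing one Transformer, so a single backward pass through the combined loss updates the shared weights for both tasks.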
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Sigmoid Activation · Tanh Activation · Residual Connection
