Towards a Human-like Open-Domain Chatbot
Daniel Adiwardana, Minh-Thang Luong, David R. So, Jamie Hall, Noah Fiedel, Romal Thoppilan, Zi Yang, Apoorv Kulshreshtha, Gaurav Nemade, Yifeng Lu, Quoc V. Le

TL;DR
This paper introduces Meena, a large neural chatbot trained on public social media conversations, and proposes SSA (Sensibleness and Specificity Average), a human evaluation metric for how human-like a multi-turn conversation is.
Contribution
The paper presents Meena, a 2.6B parameter open-domain chatbot trained end-to-end, and introduces SSA, a novel human evaluation metric for multi-turn conversations.
Findings
Strong correlation between perplexity and SSA scores.
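The end-to-end trained Meena with the best perplexity achieves 72% SSA on multi-turn evaluation, against a human-level SSA of 86%.
The full version of Meena (with a filtering mechanism and tuned decoding) scores 79% SSA, 23 percentage points higher than the existing chatbots evaluated.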
Abstract
We present Meena, a multi-turn open-domain chatbot trained end-to-end on data mined and filtered from public domain social media conversations. This 2.6B parameter neural network is simply trained to minimize perplexity of the next token. We also propose a human evaluation metric called Sensibleness and Specificity Average (SSA), which captures key elements of a human-like multi-turn conversation. Our experiments show strong correlation between perplexity and SSA. The fact that the best perplexity end-to-end trained Meena scores high on SSA (72% on multi-turn evaluation) suggests that a human-level SSA of 86% is potentially within reach if we can better optimize perplexity. Additionally, the full version of Meena (with a filtering mechanism and tuned decoding) scores 79% SSA, 23% higher in absolute SSA than the existing chatbots we evaluated.
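The two quantities the abstract relates, SSA and perplexity, can be sketched in a few lines. This is a minimal illustration, not the authors' evaluation code: the function names, the `(sensible, specific)` label format, and the sample data are hypothetical. It also assumes that a response counts as specific only if it was also judged sensible, and that SSA is the plain average of the two per-response rates.

```python
import math
from typing import Sequence, Tuple


def ssa(labels: Sequence[Tuple[bool, bool]]) -> float:
    """Average of sensibleness and specificity over labeled responses.

    Each label is a hypothetical (sensible, specific) pair from a human rater.
    """
    if not labels:
        raise ValueError("need at least one labeled response")
    n = len(labels)
    sensibleness = sum(1 for sensible, _ in labels if sensible) / n
    # Assumption: a response is only counted as specific if it is also sensible.
    specificity = sum(1 for sensible, specific in labels if sensible and specific) / n
    return (sensibleness + specificity) / 2


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Exponential of the mean negative log-likelihood (natural log) per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Made-up ratings for four responses: 3 of 4 sensible, 2 of 4 specific.
example_labels = [(True, True), (True, False), (False, False), (True, True)]
example_ssa = ssa(example_labels)

# A model that assigns probability 0.25 to every token has perplexity 4.
example_ppl = perplexity([math.log(0.25)] * 4)
```

Read against the abstract's own figures, the gap between the end-to-end model and the human-level score is 86% - 72% = 14 percentage points of SSA, which is the gap the authors suggest better perplexity might close.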
Code & Models
- 🤗 t-bank-ai/ruDialoGPT-small · model · 1.9k dl · ♡ 10
- 🤗 t-bank-ai/ruDialoGPT-medium · model · 912 dl · ♡ 36
- 🤗 cmarkea/bloomz-7b1-mt-sft-chat · model · 773 dl · ♡ 16
- 🤗 cmarkea/bloomz-3b-sft-chat · model · 805 dl · ♡ 12
- 🤗 cmarkea/bloomz-560m-sft-chat · model · 923 dl · ♡ 10
- 🤗 RichardErkhov/cmarkea_-_bloomz-560m-sft-chat-4bits · model · 1 dl
- 🤗 RichardErkhov/cmarkea_-_bloomz-560m-sft-chat-8bits · model · 1 dl
- 🤗 RichardErkhov/cmarkea_-_bloomz-7b1-mt-sft-chat-gguf · model · 24 dl
- 🤗 RichardErkhov/cmarkea_-_bloomz-3b-sft-chat-gguf · model · 27 dl
- 🤗 RichardErkhov/cmarkea_-_bloomz-3b-sft-chat-4bits · model · 1 dl
Videos
Google’s Chatbot: Almost Perfect 🤖 · YouTube
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Meena
