Wizard of Wikipedia: Knowledge-Powered Conversational Agents
Emily Dinan, Stephen Roller, Kurt Shuster, Angela Fan, Michael Auli, Jason Weston

TL;DR
This paper introduces a large dataset and new models for open-domain dialogue that effectively incorporate Wikipedia knowledge, enabling more informed and natural conversations.
Contribution
It provides a new dataset, benchmark, and architectures for knowledge-grounded dialogue, advancing the development of more knowledgeable conversational agents.
Findings
Models can conduct knowledgeable discussions, as measured by automatic metrics.
The dataset enables measuring improvements in knowledge integration.
The best-performing models generate more natural and better-informed responses.
Abstract
In open-domain dialogue intelligent agents should exhibit the use of knowledge, however there are few convincing demonstrations of this to date. The most popular sequence to sequence models typically "generate and hope" generic utterances that can be memorized in the weights of the model when mapping from input utterance(s) to output, rather than employing recalled knowledge as context. Use of knowledge has so far proved difficult, in part because of the lack of a supervised learning benchmark task which exhibits knowledgeable open dialogue with clear grounding. To that end we collect and release a large dataset with conversations directly grounded with knowledge retrieved from Wikipedia. We then design architectures capable of retrieving knowledge, reading and conditioning on it, and finally generating natural responses. Our best performing dialogue models are able to conduct…
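The retrieve, read, and generate steps named in the abstract can be illustrated with a minimal sketch. This is not the paper's architecture: the word-overlap retriever and the template "generator" below are toy stand-ins for learned models, and every name (`tokenize`, `cosine`, `retrieve`, `respond`) and the sample knowledge sentences are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(dialogue_context, knowledge, k=1):
    """Step 1 (assumed stand-in): rank knowledge sentences by word overlap."""
    query = Counter(tokenize(dialogue_context))
    ranked = sorted(
        knowledge,
        key=lambda s: cosine(query, Counter(tokenize(s))),
        reverse=True,
    )
    return ranked[:k]


def respond(dialogue_context, knowledge):
    """Steps 2-3 (assumed stand-in): condition on the top sentence and reply.

    A real system would feed the context plus selected knowledge into a
    trained generator; a fixed template is used here instead.
    """
    selected = retrieve(dialogue_context, knowledge, k=1)
    if not selected:
        return "Tell me more about that."
    return "Here is something I know: " + selected[0]


if __name__ == "__main__":
    knowledge = [
        "Gouda is a Dutch cheese made from cow's milk.",
        "The Eiffel Tower is a wrought-iron tower in Paris.",
        "Blue whales are the largest animals known to have existed.",
    ]
    print(respond("I love cheese, especially gouda.", knowledge))
```

The point of the sketch is only the separation of concerns: knowledge is selected explicitly from an external source and passed to the response step, rather than being left implicit in model weights.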
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
