Multi-Sentence Knowledge Selection in Open-Domain Dialogue
Mihail Eric, Nicole Chartier, Behnam Hedayatnia, Karthik Gopalakrishnan, Pankaj Rajan, Yang Liu, Dilek Hakkani-Tur

TL;DR
This paper evaluates current open-domain dialogue knowledge selection methods, identifies flaws, and introduces WOW++, an augmented dataset with multiple relevant knowledge sentences, to improve knowledge ranking algorithms.
Contribution
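The paper proposes a new framework for collecting relevant knowledge and uses it to create WOW++, an augmentation of the Wizard of Wikipedia (WOW) corpus with multiple relevant knowledge sentences per dialogue context, which supports both evaluation and training of knowledge selection models.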
Findings
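Neural rerankers that use WOW++ can outperform rankers trained on standard datasets.
Existing knowledge selection methodologies are flawed in both data and evaluation, for example in assuming a single relevant knowledge sentence per context.
WOW++ averages 8 relevant knowledge sentences per dialogue context, reflecting the inherent ambiguity of open-domain knowledge selection.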
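To illustrate why the single-relevant-sentence assumption matters for evaluation, here is a minimal sketch comparing a single-gold metric with a multi-label one. The metric choices (hit@1 and average precision), the function names, and the sentence ids are assumptions made for illustration; this summary does not state which metrics the paper uses.

```python
from typing import Sequence, Set


def hit_at_1(ranked: Sequence[str], gold: str) -> float:
    """Single-gold view: 1.0 only if the top-ranked sentence is the one gold id."""
    return 1.0 if ranked and ranked[0] == gold else 0.0


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Multi-label view: average precision against a set of relevant ids."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, sent_id in enumerate(ranked, start=1):
        if sent_id in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / len(relevant)


# Hypothetical ranker output for one dialogue context.
ranked = ["s3", "s1", "s4", "s2"]
single_gold = "s1"              # one annotated sentence (single-label setup)
relevant = {"s1", "s2", "s3"}   # several annotated sentences (multi-label setup)

single_score = hit_at_1(ranked, single_gold)
multi_score = average_precision(ranked, relevant)
```

In this toy case the ranker places a relevant sentence first, yet the single-gold metric scores it as a miss because that sentence is not the one annotated id, while the multi-label metric credits it.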
Abstract
Incorporating external knowledge sources effectively in conversations is a longstanding problem in open-domain dialogue research. The existing literature on open-domain knowledge selection is limited and makes certain brittle assumptions on knowledge sources to simplify the overall task (Dinan et al., 2019), such as the existence of a single relevant knowledge sentence per context. In this work, we evaluate the existing state of open-domain conversation knowledge selection, showing where the existing methodologies regarding data and evaluation are flawed. We then improve on them by proposing a new framework for collecting relevant knowledge, and create an augmented dataset based on the Wizard of Wikipedia (WOW) corpus, which we call WOW++. WOW++ averages 8 relevant knowledge sentences per dialogue context, embracing the inherent ambiguity of open-domain dialogue knowledge selection. We…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
