Looking for Confirmations: An Effective and Human-Like Visual Dialogue Strategy
Alberto Testoni, Raffaella Bernardi

TL;DR
This paper introduces Confirm-it, a visual dialogue model that re-ranks beam search candidates so that the question asked confirms the model's current conjecture about the referent. In the GuessWhat?! game, the resulting dialogues are more natural and effective than those from beam search decoding without re-ranking.
Contribution
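The paper presents Confirm-it, a question generation approach inspired by the cognitive literature on information search and cross-situational word learning. It re-ranks beam search candidates to favor questions that confirm the model's conjecture about the referent.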
Findings
Dialogues generated by Confirm-it are more natural and human-like than those from beam search decoding without re-ranking.
Questions are more effective in identifying the referent in the GuessWhat?! game than those from the same baseline.
Abstract
Generating goal-oriented questions in Visual Dialogue tasks is a challenging and long-standing problem. State-of-the-art systems are shown to generate questions that, although grammatically correct, often lack an effective strategy and sound unnatural to humans. Inspired by the cognitive literature on information search and cross-situational word learning, we design Confirm-it, a model based on a beam search re-ranking algorithm that guides an effective goal-oriented strategy by asking questions that confirm the model's conjecture about the referent. We take the GuessWhat?! game as a case study. We show that dialogues generated by Confirm-it are more natural and effective than beam search decoding without re-ranking.
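The re-ranking idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `conjecture_prob` callback, the toy questions, and all scores below are hypothetical, and the abstract does not specify how confirmation is actually scored. The sketch only assumes that each beam candidate carries a log-probability and that some scorer can estimate how strongly the model would believe its current conjecture after asking a given question.

```python
from typing import Callable, Sequence, Tuple

# A beam candidate: (question text, log-probability from the decoder).
Candidate = Tuple[str, float]


def top_of_beam(candidates: Sequence[Candidate]) -> str:
    """Baseline: pick the most likely question, with no re-ranking."""
    if not candidates:
        raise ValueError("beam is empty")
    return max(candidates, key=lambda c: c[1])[0]


def rerank_by_confirmation(
    candidates: Sequence[Candidate],
    conjecture_prob: Callable[[str], float],
) -> str:
    """Pick the candidate that best confirms the current conjecture.

    `conjecture_prob(question)` is a hypothetical scorer: the probability
    the model would assign to its conjectured referent after asking
    `question`. Ties are broken by the decoder log-probability.
    """
    if not candidates:
        raise ValueError("beam is empty")
    return max(candidates, key=lambda c: (conjecture_prob(c[0]), c[1]))[0]


# Toy data (made up for illustration only).
beam = [
    ("is it a person?", -0.4),
    ("is it on the left?", -0.9),
    ("is it red?", -1.3),
]
toy_scores = {
    "is it a person?": 0.35,
    "is it on the left?": 0.62,
    "is it red?": 0.48,
}

baseline = top_of_beam(beam)
chosen = rerank_by_confirmation(beam, toy_scores.__getitem__)
print("baseline:", baseline)
print("re-ranked:", chosen)
```

In this toy setup the baseline keeps the most probable question under the decoder, while the re-ranked choice trades some decoder likelihood for a question that raises confidence in the conjectured referent.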
Taxonomy
Topics: Multimodal Machine Learning Applications · Speech and dialogue systems · Topic Modeling
