Looking for Confirmations: An Effective and Human-Like Visual Dialogue Strategy

Alberto Testoni; Raffaella Bernardi

arXiv:2109.05312·cs.CL·September 14, 2021

TL;DR

This paper introduces Confirm-it, a visual dialogue model that re-ranks beam search candidates so that the questions it asks confirm its current conjecture about the referent. In the GuessWhat?! game, the resulting dialogues are more natural and more effective than those produced by beam search decoding without re-ranking.

Contribution

The paper presents Confirm-it, a novel approach that incorporates a re-ranking strategy to produce more human-like and effective visual dialogue questions.

Findings

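01. Generated dialogues are more natural and human-like than those produced by beam search decoding without re-ranking.

02. The generated questions are more effective at identifying the referent.

03. Re-ranking improves on plain beam search decoding in the GuessWhat?! game.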

Abstract

Generating goal-oriented questions in Visual Dialogue tasks is a challenging and long-standing problem. State-of-the-art systems are shown to generate questions that, although grammatically correct, often lack an effective strategy and sound unnatural to humans. Inspired by the cognitive literature on information search and cross-situational word learning, we design Confirm-it, a model based on a beam search re-ranking algorithm that guides an effective goal-oriented strategy by asking questions that confirm the model's conjecture about the referent. We take the GuessWhat?! game as a case study. We show that dialogues generated by Confirm-it are more natural and effective than beam search decoding without re-ranking.
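The re-ranking idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`rerank_for_confirmation`, `toy_answer`, `toy_guess`), the toy objects, and the scoring rule are all assumptions made for illustration. The assumed rule is to pick the candidate question whose answer, given as if the current conjecture were the target, most raises the guesser's probability for that conjecture. In a real system the candidates would come from the beam of a trained question generator, and the answerer and guesser would be trained models rather than lookup tables.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Dialogue = List[Tuple[str, str]]  # (question, answer) pairs asked so far
Beliefs = Dict[str, float]        # candidate object -> probability


def rerank_for_confirmation(
    candidates: Sequence[str],
    history: Dialogue,
    answer_fn: Callable[[str, str], str],
    guess_fn: Callable[[Dialogue], Beliefs],
) -> str:
    """Pick the candidate question that best confirms the current conjecture.

    Hypothetical helper: `candidates` stands in for the questions kept in
    the beam; `answer_fn(question, target)` and `guess_fn(history)` stand
    in for an internal answerer and guesser.
    """
    beliefs = guess_fn(history)
    # The conjecture is the object the guesser currently finds most likely.
    conjecture = max(beliefs, key=beliefs.get)

    def confirmation_score(question: str) -> float:
        # Answer the question as if the conjecture were the true target,
        # then measure how probable the conjecture becomes afterwards.
        answer = answer_fn(question, conjecture)
        updated = guess_fn(history + [(question, answer)])
        return updated[conjecture]

    return max(candidates, key=confirmation_score)


# --- Toy stand-ins (assumptions, for illustration only) -------------------
OBJECTS = {
    "dog": {"animal", "left"},
    "cat": {"animal", "right"},
    "car": {"vehicle", "right"},
}


def toy_answer(question: str, target: str) -> str:
    # A question is just an attribute name; answer "yes" if the target has it.
    return "yes" if question in OBJECTS[target] else "no"


def toy_guess(history: Dialogue) -> Beliefs:
    # Down-weight every object that contradicts an answer in the history,
    # then normalise the weights into probabilities.
    weights = {}
    for name, attrs in OBJECTS.items():
        weight = 1.0
        for question, answer in history:
            consistent = (question in attrs) == (answer == "yes")
            weight *= 1.0 if consistent else 0.1
        weights[name] = weight
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


best = rerank_for_confirmation(
    candidates=["vehicle", "animal", "left"],
    history=[("animal", "yes")],
    answer_fn=toy_answer,
    guess_fn=toy_guess,
)
print(best)  # left
```

In this toy run the guesser's conjecture after the first turn is "dog", and "left" is selected because it is the only candidate whose answer separates "dog" from "cat"; the other two candidates leave the conjecture's probability essentially unchanged.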

Code & Models

Repositories

albertotestoni/confirm_it
PyTorch · Official

Taxonomy

Topics: Multimodal Machine Learning Applications · Speech and dialogue systems · Topic Modeling