Asking More Informative Questions for Grounded Retrieval

Sedrick Keh; Justin T. Chiu; Daniel Fried

arXiv:2311.08584 · cs.CL · November 16, 2023 · 1 citation

TL;DR

This paper introduces a method for grounded image identification that asks more informative, open-ended questions and handles the presupposition errors that VQA models make when answering them, improving both accuracy and efficiency.

Contribution

The paper proposes an approach for formulating open-ended questions in grounded retrieval tasks and for incorporating presupposition handling into both question selection and belief updates.

Findings

01. Increased accuracy by 14% over the previous state of the art.

02. Produced 48% more efficient games in human evaluations.

03. Handles presupposition errors made by off-the-shelf VQA models in both question selection and belief updates.

Abstract

When a model is trying to gather information in an interactive setting, it benefits from asking informative questions. However, in the case of a grounded multi-turn image identification task, previous studies have been constrained to polar yes/no questions, limiting how much information the model can gain in a single turn. We present an approach that formulates more informative, open-ended questions. In doing so, we discover that off-the-shelf visual question answering (VQA) models often make presupposition errors, which standard information gain question selection methods fail to account for. To address this issue, we propose a method that can incorporate presupposition handling into both question selection and belief updates. Specifically, we use a two-stage process, where the model first filters out images which are irrelevant to a given question, then updates its beliefs about which…
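
The two-stage update described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the hard relevance threshold, and the per-image probabilities (which in practice would come from a VQA model) are all hypothetical and chosen only for this example.

```python
import math


def entropy(dist):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def two_stage_update(belief, relevance, answer_likelihood, threshold=0.5):
    """Update a belief over candidate images after one answered question.

    belief[i]            -- prior probability that image i is the target
    relevance[i]         -- assumed probability that the question's
                            presupposition holds for image i
    answer_likelihood[i] -- assumed probability that the observed answer
                            would be given if image i were the target
    threshold            -- hypothetical cutoff for the relevance filter
    """
    if not (len(belief) == len(relevance) == len(answer_likelihood)):
        raise ValueError("all inputs must have one entry per image")

    # Stage 1: filter out images for which the question is irrelevant.
    filtered = [b if r >= threshold else 0.0
                for b, r in zip(belief, relevance)]

    # Stage 2: Bayesian update over the images that survived the filter.
    weighted = [f * a for f, a in zip(filtered, answer_likelihood)]
    total = sum(weighted)
    if total == 0:
        raise ValueError("every candidate image was ruled out")
    return [w / total for w in weighted]


if __name__ == "__main__":
    prior = [0.25, 0.25, 0.25, 0.25]
    # Image 2 does not contain the object the question asks about, so any
    # answer a VQA model produces for it rests on a false presupposition.
    relevance = [0.9, 0.8, 0.1, 0.7]
    likelihood = [0.6, 0.2, 0.9, 0.2]

    posterior = two_stage_update(prior, relevance, likelihood)
    print("posterior:", [round(p, 3) for p in posterior])
    print("entropy before:", round(entropy(prior), 3))
    print("entropy after: ", round(entropy(posterior), 3))
```

In this toy example, image 2 has the highest answer likelihood, but it is removed in the first stage because the question does not apply to it. Without that filter, a plain Bayesian update would shift most of the belief onto an image whose answer came from a presupposition error.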

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques