Asking Multimodal Clarifying Questions in Mixed-Initiative Conversational Search

Yifei Yuan; Clemencia Siro; Mohammad Aliannejadi; Maarten de Rijke; Wai Lam

arXiv:2402.07742 · cs.CL · February 13, 2024 · 2 citations

Open Access · 1 repository

TL;DR

This paper proposes adding images to clarifying questions in conversational search, reports substantial retrieval gains from doing so, and releases a new dataset for research on the task.

Contribution

It proposes the task of asking multimodal clarifying questions, introduces the Melon dataset, and develops the Marto model with a prompt-based training strategy.

Findings

01 Adding images improves retrieval performance by up to 90%.

02 Marto outperforms discriminative baselines in both effectiveness and efficiency.

03 The Melon dataset contains over 4,000 multimodal clarifying questions, enriched with over 14,000 images.

Abstract

In mixed-initiative conversational search systems, clarifying questions are used to help users who struggle to express their intentions in a single query. These questions aim to uncover users' information needs and resolve query ambiguities. We hypothesize that in scenarios where multimodal information is pertinent, the clarification process can be improved by using non-textual information. Therefore, we propose to add images to clarifying questions and formulate the novel task of asking multimodal clarifying questions in open-domain, mixed-initiative conversational search systems. To facilitate research into this task, we collect a dataset named Melon that contains over 4k multimodal clarifying questions, enriched with over 14k images. We also propose a multimodal query clarification model named Marto and adopt a prompt-based, generative fine-tuning strategy to perform the training of…

Code & Models

Repositories

yfyuan01/mqc
PyTorch · Official

Taxonomy

Topics: Speech and dialogue systems