# Asking Clarifying Questions in Open-Domain Information-Seeking Conversations

**Authors:** Mohammad Aliannejadi, Hamed Zamani, Fabio Crestani, W. Bruce Croft

arXiv: 1907.06554 · 2019-07-16

## TL;DR

This paper introduces the task of asking clarifying questions in open-domain conversational information retrieval, releases the crowdsourced Qulac dataset, and proposes a three-component retrieval framework (question retrieval, question selection, document retrieval) that significantly outperforms competitive baselines.

## Contribution

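The paper formulates the task of asking clarifying questions in open-domain information-seeking conversations, proposes an offline evaluation methodology for it, collects the Qulac dataset through crowdsourcing, and develops a retrieval framework that outperforms competitive baselines.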

## Key findings

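- In experiments with an oracle model, asking only one good question improves retrieval performance by over 170% in terms of P@1.
- The proposed retrieval framework, whose question selection model uses the original query and previous question-answer interactions, significantly outperforms competitive baselines.
- The Qulac dataset is built on the TREC Web Track 2009-2012 data and contains over 10K question-answer pairs for 198 TREC topics with 762 facets.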
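
To make the P@1 figure concrete, the following is a minimal sketch of how precision at rank 1 and a relative improvement could be computed. The function names and the toy rankings are assumptions made for illustration; they are not taken from the paper or from Qulac, and the numbers do not reproduce the reported results.

```python
def precision_at_k(ranked_doc_ids, relevant_doc_ids, k=1):
    """Fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_doc_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_doc_ids)
    return hits / k


def mean_precision_at_1(runs):
    """Average P@1 over (ranking, relevant set) pairs, one pair per topic."""
    return sum(precision_at_k(r, rel, k=1) for r, rel in runs) / len(runs)


def relative_improvement(baseline, improved):
    """Relative change in percent; +170% means 2.7 times the baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (improved - baseline) / baseline * 100.0


# Hypothetical rankings for four topics (illustrative only).
relevant = {"d1"}
baseline_runs = [
    (["d1", "d2"], relevant),
    (["d2", "d1"], relevant),
    (["d3", "d1"], relevant),
    (["d2", "d3"], relevant),
]
clarified_runs = [
    (["d1", "d2"], relevant),
    (["d1", "d3"], relevant),
    (["d1", "d2"], relevant),
    (["d2", "d1"], relevant),
]

base = mean_precision_at_1(baseline_runs)
clar = mean_precision_at_1(clarified_runs)
print(f"P@1 baseline={base:.2f} clarified={clar:.2f} "
      f"change={relative_improvement(base, clar):.0f}%")
```

Read this way, an improvement of over 170% means that P@1 after one good clarifying question is more than 2.7 times the P@1 obtained from the original query alone.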

## Abstract

Users often fail to formulate their complex information needs in a single query. As a consequence, they may need to scan multiple result pages or reformulate their queries, which may be a frustrating experience. Alternatively, systems can improve user satisfaction by proactively asking questions of the users to clarify their information needs. Asking clarifying questions is especially important in conversational systems since they can only return a limited number of (often only one) result(s). In this paper, we formulate the task of asking clarifying questions in open-domain information-seeking conversational systems. To this end, we propose an offline evaluation methodology for the task and collect a dataset, called Qulac, through crowdsourcing. Our dataset is built on top of the TREC Web Track 2009-2012 data and consists of over 10K question-answer pairs for 198 TREC topics with 762 facets. Our experiments on an oracle model demonstrate that asking only one good question leads to over 170% retrieval performance improvement in terms of P@1, which clearly demonstrates the potential impact of the task. We further propose a retrieval framework consisting of three components: question retrieval, question selection, and document retrieval. In particular, our question selection model takes into account the original query and previous question-answer interactions while selecting the next question. Our model significantly outperforms competitive baselines. To foster research in this area, we have made Qulac publicly available.
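
The three components named in the abstract can be sketched as a small pipeline. This is only an illustration of the data flow under stated assumptions: the token-overlap scoring, the function names (`retrieve_questions`, `select_question`, `retrieve_documents`), and the toy data are all hypothetical stand-ins, not the models described in the paper.

```python
from collections import Counter


def overlap(a, b):
    """Toy similarity: number of shared lowercase tokens (with multiplicity)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    return sum((ca & cb).values())


def retrieve_questions(query, question_bank, k=3):
    """Stage 1 (question retrieval): rank candidate questions for the query."""
    ranked = sorted(question_bank, key=lambda q: overlap(query, q), reverse=True)
    return ranked[:k]


def select_question(query, history, candidates):
    """Stage 2 (question selection): choose the next question using the
    original query and the previous (question, answer) turns."""
    asked = {q for q, _ in history}
    remaining = [q for q in candidates if q not in asked]
    if not remaining:
        return None
    context = " ".join([query] + [answer for _, answer in history])
    return max(remaining, key=lambda q: overlap(context, q))


def retrieve_documents(query, history, documents, k=1):
    """Stage 3 (document retrieval): rank documents against the query
    expanded with all question-answer turns so far."""
    expanded = " ".join([query] + [f"{q} {a}" for q, a in history])
    ranked = sorted(documents, key=lambda d: overlap(expanded, documents[d]),
                    reverse=True)
    return ranked[:k]


# Toy data (assumed for illustration).
query = "dinosaurs"
question_bank = [
    "are you looking for dinosaurs pictures",
    "do you want dinosaurs games for kids",
    "would you like a recipe",
]
documents = {
    "d1": "dinosaurs pictures gallery",
    "d2": "dinosaurs games for kids online",
}

candidates = retrieve_questions(query, question_bank, k=2)
question = select_question(query, [], candidates)
history = [(question, "no I want games for kids")]  # simulated user answer
print(retrieve_documents(query, [], documents))       # before clarification
print(retrieve_documents(query, history, documents))  # after one question
```

The point of the sketch is the interface: question selection sees the query plus the conversation history, and document retrieval is rerun on the clarified need after each answer.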

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.06554/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1907.06554/full.md

## References

54 references — full list in the complete paper: https://tomesphere.com/paper/1907.06554/full.md

---
Source: https://tomesphere.com/paper/1907.06554