TopiOCQA: Open-domain Conversational Question Answering with Topic Switching
Vaibhav Adlakha, Shehzaad Dhuliawala, Kaheer Suleman, Harm de Vries, Siva Reddy

TL;DR
TopiOCQA is an open-domain conversational question answering dataset with topic switches. It requires models to retrieve evidence across multiple turns and generate free-form answers.
Contribution
The paper presents TopiOCQA, a new dataset featuring topic switches in open-domain conversations, and evaluates baseline models, highlighting the task's complexity.
Findings
Best model achieves F1 of 55.8
Models lag 14.2 points behind humans
Dataset enables research on multi-turn retrieval and response generation
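The F1 figures above are not defined on this page. The sketch below assumes they refer to the token-overlap F1 commonly used in question answering evaluation, computed between a predicted answer and a reference answer. The function name `token_f1` and the simple lowercase, whitespace tokenization are illustrative assumptions, not the paper's evaluation script.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between two answer strings (illustrative sketch)."""
    # Assumption: lowercase + whitespace split; real evaluation scripts
    # usually also strip punctuation and articles.
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()

    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under this definition, a free-form answer that contains the reference plus extra words is penalized on precision but not recall, so a dataset-level score would be the average of such per-answer values scaled to 0-100.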
Abstract
In a conversational question answering scenario, a questioner seeks to extract information about a topic through a series of interdependent questions and answers. As the conversation progresses, they may switch to related topics, a phenomenon commonly observed in information-seeking search sessions. However, current datasets for conversational question answering are limiting in two ways: 1) they do not contain topic switches; and 2) they assume the reference text for the conversation is given, i.e., the setting is not open-domain. We introduce TopiOCQA (pronounced Tapioca), an open-domain conversational dataset with topic switches on Wikipedia. TopiOCQA contains 3,920 conversations with information-seeking questions and free-form answers. On average, a conversation in our dataset spans 13 question-answer turns and involves four topics (documents). TopiOCQA poses a challenging test-bed…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
