Multi-Type Conversational Question-Answer Generation with Closed-ended and Unanswerable Questions

Seonjeong Hwang; Yunsu Kim; Gary Geunbae Lee

arXiv:2210.12979 · cs.CL · October 25, 2022 · 1 citation

TL;DR

This paper presents a novel data synthesis framework for conversational question answering (CQA) that generates open-ended, closed-ended, and unanswerable questions, improving system performance across four domains.

Contribution

The paper introduces a unified framework for synthesizing multi-type CQA data with hierarchical answerability classification, enhancing data quality and system robustness.

Findings

01

Synthetic data closely resembles human conversations.

02

CQA systems trained on synthetic data perform comparably to those trained on real data.

03

The framework effectively generates open-ended, closed-ended, and unanswerable questions.

Abstract

Conversational question answering (CQA) facilitates an incremental and interactive understanding of a given context, but building a CQA system is difficult for many domains due to the problem of data scarcity. In this paper, we introduce a novel method to synthesize data for CQA with various question types, including open-ended, closed-ended, and unanswerable questions. We design a different generation flow for each question type and effectively combine them in a single, shared framework. Moreover, we devise a hierarchical answerability classification (hierarchical AC) module that improves quality of the synthetic data while acquiring unanswerable questions. Manual inspections show that synthetic data generated with our framework have characteristics very similar to those of human-generated conversations. Across four domains, CQA systems trained on our synthetic data indeed show good…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems