An Annotation Scheme of A Large-scale Multi-party Dialogues Dataset for Discourse Parsing and Machine Comprehension

Jiaqi Li; Ming Liu; Bing Qin; Zihao Zheng; Ting Liu

arXiv:1911.03514 · cs.CL · November 12, 2019

TL;DR

This paper proposes an annotation scheme for a large-scale multi-party dialogue dataset, built on the Ubuntu Chat Corpus, that covers discourse parsing and machine comprehension, with the goal of helping to understand multi-party conversations.

Contribution

The paper presents what the authors describe as the first large-scale corpus for multi-party dialogue discourse parsing, and it proposes machine reading comprehension over multi-party dialogues as a new task.

Findings

01. First large-scale multi-party dialogue discourse parsing corpus

02. Annotated discourse structures and question-answer pairs

03. Introduced a novel multi-party dialogue comprehension task

Abstract

In this paper, we propose a scheme for annotating large-scale multi-party chat dialogues for discourse parsing and machine comprehension. The main goal of this project is to help understand multi-party dialogues. Our dataset is based on the Ubuntu Chat Corpus. For each multi-party dialogue, we annotate the discourse structure and question-answer pairs. To the best of our knowledge, this is the first large-scale corpus for multi-party dialogue discourse parsing, and we are the first to propose the task of machine reading comprehension over multi-party dialogues.
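
To make the two annotation layers named in the abstract concrete, the following is a minimal sketch of how one annotated dialogue might be represented. It is not the authors' released format: every class name, field name, and the relation label are hypothetical, and the three-utterance dialogue is invented for illustration rather than taken from the Ubuntu Chat Corpus.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Utterance:
    speaker: str
    text: str


@dataclass
class DiscourseRelation:
    # Indices into AnnotatedDialogue.utterances (hypothetical encoding).
    head: int
    dependent: int
    label: str  # placeholder label, not the paper's relation inventory


@dataclass
class QAPair:
    question: str
    answer: str


@dataclass
class AnnotatedDialogue:
    utterances: List[Utterance]
    relations: List[DiscourseRelation] = field(default_factory=list)
    qa_pairs: List[QAPair] = field(default_factory=list)

    def is_well_formed(self) -> bool:
        """Check that every relation links two distinct, existing utterances."""
        n = len(self.utterances)
        return all(
            0 <= r.head < n and 0 <= r.dependent < n and r.head != r.dependent
            for r in self.relations
        )


# Invented example dialogue with three speakers.
example = AnnotatedDialogue(
    utterances=[
        Utterance("alice", "How do I list installed packages?"),
        Utterance("bob", "Try dpkg -l"),
        Utterance("carol", "Or use apt list --installed"),
    ],
    relations=[
        DiscourseRelation(head=0, dependent=1, label="question_answer"),
        DiscourseRelation(head=0, dependent=2, label="question_answer"),
    ],
    qa_pairs=[
        QAPair(question="What command does bob suggest?", answer="dpkg -l"),
    ],
)
```

Keeping the discourse relations as index pairs over the utterance list lets both replies attach to the same question, which is the situation that makes multi-party dialogue differ from two-party dialogue; the question-answer pairs sit alongside as a separate layer over the same utterances.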


Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems