An Annotation Scheme of A Large-scale Multi-party Dialogues Dataset for Discourse Parsing and Machine Comprehension
Jiaqi Li, Ming Liu, Bing Qin, Zihao Zheng, Ting Liu

TL;DR
This paper introduces a large-scale annotated dataset for discourse parsing and machine comprehension in multi-party dialogues, based on the Ubuntu Chat Corpus, enabling better understanding of complex multi-party conversations.
Contribution
It presents the first large-scale corpus for multi-party dialogue discourse parsing and proposes a new task for machine reading comprehension in this context.
Findings
First large-scale multi-party dialogue discourse parsing corpus
Annotated discourse structures and question-answer pairs
Introduced a novel multi-party dialogue comprehension task
Abstract
In this paper, we propose a scheme for annotating large-scale multi-party chat dialogues for discourse parsing and machine comprehension. The main goal of this project is to help understand multi-party dialogues. Our dataset is based on the Ubuntu Chat Corpus. For each multi-party dialogue, we annotate the discourse structure and question-answer pairs. To our knowledge, this is the first large-scale corpus for multi-party dialogue discourse parsing, and we are the first to propose the task of machine reading comprehension over multi-party dialogues.
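To make the two annotation layers concrete, here is a minimal sketch of how one annotated dialogue might be represented. The abstract does not specify a file format or a relation inventory, so every class name, field name, and relation label below is a hypothetical placeholder, and the three-utterance dialogue is invented for illustration rather than taken from the corpus.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Utterance:
    speaker: str
    text: str


@dataclass
class DiscourseEdge:
    head: int       # index of the utterance being responded to
    dependent: int  # index of the responding utterance
    relation: str   # placeholder label, not the paper's relation set


@dataclass
class QAPair:
    question: str
    answer: str


@dataclass
class AnnotatedDialogue:
    utterances: List[Utterance]
    edges: List[DiscourseEdge] = field(default_factory=list)
    qa_pairs: List[QAPair] = field(default_factory=list)

    def validate(self) -> bool:
        """Check that every edge links two distinct, existing utterances."""
        n = len(self.utterances)
        return all(
            0 <= e.head < n and 0 <= e.dependent < n and e.head != e.dependent
            for e in self.edges
        )


def build_example() -> AnnotatedDialogue:
    """Build an invented three-speaker dialogue with both annotation layers."""
    utterances = [
        Utterance("alice", "How do I list installed packages?"),
        Utterance("bob", "Try dpkg -l"),
        Utterance("carol", "Which release are you on?"),
    ]
    edges = [
        DiscourseEdge(head=0, dependent=1, relation="question-answer"),
        DiscourseEdge(head=0, dependent=2, relation="clarification"),
    ]
    qa_pairs = [QAPair("What command does bob suggest?", "dpkg -l")]
    return AnnotatedDialogue(utterances, edges, qa_pairs)
```

Storing the discourse structure as index pairs rather than nested trees keeps the sketch usable for multi-party threads, where one utterance can receive replies from several speakers.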
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
