Predicting Above-Sentence Discourse Structure using Distant Supervision from Topic Segmentation

Patrick Huber; Linzi Xing; Giuseppe Carenini

arXiv:2112.06196·cs.CL·December 14, 2021


TL;DR

This paper predicts discourse structure above the sentence level by using distant supervision from topic segmentation. On two human-annotated discourse treebanks, the approach outperforms previous distantly supervised models.

Contribution

It extends distant supervision techniques to high-level discourse structure prediction using topic segmentation, addressing data scarcity in discourse parsing.

Findings

01 Outperforms previous distantly supervised models on sentence-to-document tasks

02 Achieves higher scores at the sentence-to-paragraph level in experiments

03 Generates accurate discourse tree structures at the sentence and paragraph levels

Abstract

RST-style discourse parsing plays a vital role in many NLP tasks, revealing the underlying semantic/pragmatic structure of potentially complex and diverse documents. Despite its importance, one of the most prevailing limitations in modern-day discourse parsing is the lack of large-scale datasets. To overcome the data sparsity issue, distantly supervised approaches from tasks like sentiment analysis and summarization have been recently proposed. Here, we extend this line of research by exploiting distant supervision from topic segmentation, which can arguably provide a strong and oftentimes complementary signal for high-level discourse structures. Experiments on two human-annotated discourse treebanks confirm that our proposal generates accurate tree structures on the sentence and paragraph level, consistently outperforming previous distantly supervised models on the sentence-to-document…
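
To make the idea of using topic segmentation as a signal for above-sentence structure more concrete, here is a minimal sketch. It assumes a topic segmenter has already produced one break score per gap between adjacent sentences, and it builds a binary tree by recursively splitting each span at its strongest break. The function name `build_tree`, the score format, and the greedy top-down splitting rule are illustrative assumptions, not details stated in the abstract.

```python
from typing import List, Optional, Tuple, Union

# A tree is either a sentence index (leaf) or a pair of subtrees.
Tree = Union[int, Tuple["Tree", "Tree"]]


def build_tree(
    break_scores: List[float], start: int = 0, end: Optional[int] = None
) -> Tree:
    """Greedy top-down tree over sentences start..end (inclusive).

    break_scores[i] is a hypothetical topic-break score for the gap
    between sentence i and sentence i + 1, so n sentences have n - 1 scores.
    """
    if end is None:
        end = len(break_scores)  # index of the last sentence
    if start == end:
        return start  # a single sentence is a leaf
    # Split the span at its highest-scoring gap (first one wins on ties).
    split = max(range(start, end), key=lambda i: break_scores[i])
    left = build_tree(break_scores, start, split)
    right = build_tree(break_scores, split + 1, end)
    return (left, right)


# Five sentences, four gaps; the strongest break is between sentences 1 and 2.
tree = build_tree([0.1, 0.9, 0.2, 0.4])
print(tree)
```

In this toy input the highest score sits between sentences 1 and 2, so the root splits there and the weaker breaks shape the lower levels of the tree.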
