Predicting Discourse Structure using Distant Supervision from Sentiment
Patrick Huber, Giuseppe Carenini

TL;DR
This paper introduces a novel method for discourse parsing that leverages distant supervision from sentiment analysis to generate training data, enabling better cross-domain performance in predicting discourse structures.
Contribution
It proposes a neural multiple-instance learning approach combined with CKY-style tree generation for discourse parsing using automatically labeled data from sentiment analysis.
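To make the CKY-style tree generation step more concrete, here is a minimal sketch of a generic CKY-style dynamic program that picks the highest-scoring binary tree over a sequence of text units. It is an illustration only, not the authors' implementation: the function name `best_binary_tree`, the `span_score` callback, and the toy score table are all hypothetical, and the page does not say how the paper actually scores candidate spans. In a real system the span scores would presumably come from a learned model (for example, the multiple-instance learning component); here they are hard-coded.

```python
from typing import Callable, Dict, Optional, Tuple, Union

# A tree is either a leaf index or a pair of subtrees.
Tree = Union[int, Tuple["Tree", "Tree"]]


def best_binary_tree(
    n: int, span_score: Callable[[int, int], float]
) -> Tuple[float, Tree]:
    """Return the best-scoring binary tree over leaves 0..n-1.

    span_score(i, j) is a hypothetical scorer for the span [i, j);
    a tree's score is the sum of the scores of its internal spans.
    """
    # best[(i, j)] = (best total score for span [i, j), chosen split point)
    best: Dict[Tuple[int, int], Tuple[float, Optional[int]]] = {}
    for i in range(n):
        best[(i, i + 1)] = (0.0, None)  # single units carry no span score

    # Fill the chart bottom-up, from short spans to the full sequence.
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            # Try every split point k and keep the best pair of children.
            children_score, split = max(
                (best[(i, k)][0] + best[(k, j)][0], k) for k in range(i + 1, j)
            )
            best[(i, j)] = (children_score + span_score(i, j), split)

    def build(i: int, j: int) -> Tree:
        # Follow the stored split points back down to recover the tree.
        if j - i == 1:
            return i
        k = best[(i, j)][1]
        return (build(i, k), build(k, j))

    return best[(0, n)][0], build(0, n)


# Toy, made-up span scores for three units (assumption, not paper data).
toy_scores = {(0, 2): 1.0, (1, 3): 0.2, (0, 3): 0.5}
score, tree = best_binary_tree(3, lambda i, j: toy_scores.get((i, j), 0.0))
print(score, tree)
```

With these toy scores, grouping units 0 and 1 first wins because the span covering them has the highest score. The chart has O(n^2) cells and each cell considers O(n) split points, so the search is cubic in the number of units.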
Findings
A parser trained on automatically generated data performs well in cross-domain settings.
The approach outperforms traditional methods in inter-domain discourse structure prediction.
It demonstrates the potential of distant supervision for reducing reliance on annotated datasets.
Abstract
Discourse parsing could not yet take full advantage of the neural NLP revolution, mostly due to the lack of annotated datasets. We propose a novel approach that uses distant supervision on an auxiliary task (sentiment classification), to generate abundant data for RST-style discourse structure prediction. Our approach combines a neural variant of multiple-instance learning, using document-level supervision, with an optimal CKY-style tree generation algorithm. In a series of experiments, we train a discourse parser (for only structure prediction) on our automatically generated dataset and compare it with parsers trained on human-annotated corpora (news domain RST-DT and Instructional domain). Results indicate that while our parser does not yet match the performance of a parser trained and tested on the same dataset (intra-domain), it does perform remarkably well on the much more…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
