Unsupervised Learning of Discourse Structures using a Tree Autoencoder

Patrick Huber; Giuseppe Carenini

arXiv:2012.09446·cs.CL·December 18, 2020

TL;DR

This paper introduces an unsupervised, autoencoder-based method for generating discourse trees, aiming to improve the robustness and diversity of discourse structures across domains for better NLP task performance.

Contribution

It presents a novel unsupervised approach to discourse tree induction that is task-agnostic and capable of producing larger, more diverse discourse treebanks.

Findings

01 Effective inference of discourse trees across multiple domains

02 Improved quality and diversity of discourse structures

03 Potential enhancement for downstream NLP tasks

Abstract

Discourse information, as postulated by popular discourse theories, such as RST and PDTB, has been shown to improve an increasing number of downstream NLP tasks, showing positive effects and synergies of discourse with important real-world applications. While methods for incorporating discourse become more and more sophisticated, the growing need for robust and general discourse structures has not been sufficiently met by current discourse parsers, usually trained on small-scale datasets in a strictly limited number of domains. This makes the prediction for arbitrary tasks noisy and unreliable. The overall resulting lack of high-quality, high-quantity discourse trees poses a severe limitation to further progress. In order to alleviate this shortcoming, we propose a new strategy to generate tree structures in a task-agnostic, unsupervised fashion by extending a latent tree induction…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems