Cross-lingual RST Discourse Parsing
Chloé Braud, Maximin Coavoux, Anders Søgaard

TL;DR
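This paper introduces a new, simpler discourse parser that is competitive with the state of the art for English (significantly better on 2 of 3 metrics), harmonizes discourse treebanks across languages, and reports what the authors believe to be the first cross-lingual discourse parsing experiments.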
Contribution
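The paper makes three contributions: a simpler yet competitive RST discourse parser, a harmonization of discourse treebanks that share the same underlying theory but differ slightly in annotation, and cross-lingual discourse parsing experiments built on that harmonization.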
Findings
The new parser is significantly better than the state of the art on 2 of 3 metrics for English.
Harmonization enables cross-lingual discourse parsing.
First experiments demonstrate feasibility of cross-lingual discourse analysis.
Abstract
Discourse parsing is an integral part of understanding information flow and argumentative structure in documents. Most previous research has focused on inducing and evaluating models from the English RST Discourse Treebank. However, discourse treebanks for other languages exist, including Spanish, German, Basque, Dutch and Brazilian Portuguese. The treebanks share the same underlying linguistic theory, but differ slightly in the way documents are annotated. In this paper, we present (a) a new discourse parser which is simpler, yet competitive (significantly better on 2/3 metrics) to state of the art for English, (b) a harmonization of discourse treebanks across languages, enabling us to present (c) what to the best of our knowledge are the first experiments on cross-lingual discourse parsing.
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies
