Learning Recursive Segments for Discourse Parsing
Stergos Afantenos, Pascal Denis, Philippe Muller, Laurence Danlos

TL;DR
This paper introduces a novel discourse segmentation method capable of detecting nested elementary discourse units, addressing limitations of previous linear-sequence assumptions, and demonstrates promising results on French discourse data.
Contribution
It presents a simple approach to nested discourse segmentation that combines standard multi-class classification with a repair heuristic enforcing global coherence.
Findings
Achieved 73% F-score on EDU detection
First to handle nested EDUs in discourse segmentation
Validated on French discourse annotations
Abstract
Automatically detecting discourse segments is an important preliminary step towards full discourse parsing. Previous research on discourse segmentation has relied on the assumption that elementary discourse units (EDUs) in a document always form a linear sequence (i.e., they can never be nested). Unfortunately, this assumption turns out to be too strong, for some theories of discourse, like SDRT, allow for nested discourse units. In this paper, we present a simple approach to discourse segmentation that is able to produce nested EDUs. Our approach builds on standard multi-class classification techniques combined with a simple repairing heuristic that enforces global coherence. Our system was developed and evaluated on the first round of annotations provided by the French Annodis project (an ongoing effort to create a discourse bank for French). Cross-validated on only 47 documents…
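To make the idea of "per-token classification plus a repair step" concrete, here is a minimal sketch in Python. It is not the authors' implementation: the four-way label set (`left`, `right`, `both`, `nothing`), the function name `labels_to_spans`, and the specific repair rules (drop a closing boundary that has no open EDU, close any EDU still open at the end of the text) are all assumptions made for illustration. The classifier itself is omitted; the sketch starts from labels it would have predicted.

```python
from typing import List, Tuple

# Hypothetical label set: a token may open an EDU (bracket before it),
# close an EDU (bracket after it), do both, or do neither.
LEFT, RIGHT, BOTH, NOTHING = "left", "right", "both", "nothing"


def labels_to_spans(labels: List[str]) -> List[Tuple[int, int]]:
    """Turn per-token boundary labels into well-nested (start, end) spans.

    Indices are inclusive token positions. The repair rules are an
    illustrative assumption, not the heuristic from the paper:
      * a closing boundary with no open EDU is dropped;
      * EDUs still open at the end are closed at the last token.
    """
    open_starts: List[int] = []          # stack of start indices of open EDUs
    spans: List[Tuple[int, int]] = []

    for i, label in enumerate(labels):
        if label in (LEFT, BOTH):
            open_starts.append(i)
        if label in (RIGHT, BOTH):
            if open_starts:
                # Close the most recently opened EDU, which keeps spans nested.
                spans.append((open_starts.pop(), i))
            # Otherwise the closing boundary is unmatched and is ignored.

    last = len(labels) - 1
    while open_starts:
        spans.append((open_starts.pop(), last))

    return sorted(spans)


# "[Mary, [who was late,] missed the train.]" as ten tokens (0..9);
# the relative clause is nested inside the main EDU.
predicted = [LEFT, NOTHING, LEFT, NOTHING, RIGHT,
             NOTHING, NOTHING, NOTHING, NOTHING, RIGHT]
print(labels_to_spans(predicted))  # [(0, 9), (2, 4)]
```

Because each closing boundary is matched with the most recently opened EDU, the output spans can nest but never cross, which is one simple way to read "global coherence" for a bracketing.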
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · Topic Modeling
