ESURF: Simple and Effective EDU Segmentation
Mohammadreza Sediqin, Shlomo Engelson Argamon

TL;DR
This paper introduces a simple, effective method for EDU segmentation using lexical and character n-gram features with random forest classification. It outperforms existing methods both on segmentation itself and when used within a state-of-the-art discourse parser.
Contribution
The paper presents a novel, straightforward approach for EDU segmentation that leverages lexical and character n-grams, demonstrating superior performance over previous techniques.
Findings
Outperforms existing segmentation methods
Improves results when used within a state-of-the-art discourse parser
Highlights importance of lexical features
Abstract
Segmenting text into Elemental Discourse Units (EDUs) is a fundamental task in discourse parsing. We present a new, simple method for identifying EDU boundaries, and hence segmenting them, based on lexical and character n-gram features, using random forest classification. We show that the method, despite its simplicity, outperforms other methods both for segmentation and within a state-of-the-art discourse parser. This indicates the importance of such features for identifying basic discourse elements, pointing towards potentially more training-efficient methods for discourse analysis.
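To make the feature types named in the abstract concrete, the following is a minimal sketch of how lexical and character n-gram features might be extracted for each candidate boundary position. It is not the authors' implementation: the function names `char_ngrams` and `boundary_features`, the one-token context window, the 2-to-3 character n-gram range, and the feature-name format are all assumptions chosen for illustration.

```python
from typing import Dict, List


def char_ngrams(token: str, n_min: int = 2, n_max: int = 3) -> List[str]:
    """Return character n-grams of a token, with < and > marking its edges."""
    padded = f"<{token.lower()}>"
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def boundary_features(tokens: List[str], i: int, window: int = 1) -> Dict[str, int]:
    """Binary features for deciding whether an EDU boundary follows tokens[i].

    The window size and feature naming are hypothetical, not taken from the paper.
    """
    feats: Dict[str, int] = {}
    for offset in range(-window, window + 1):
        j = i + offset
        if 0 <= j < len(tokens):
            word = tokens[j].lower()
            # Lexical feature: the word itself at this relative position.
            feats[f"word[{offset}]={word}"] = 1
            # Character n-gram features for the same word.
            for gram in char_ngrams(word):
                feats[f"char[{offset}]={gram}"] = 1
        else:
            # Position falls outside the sentence.
            feats[f"word[{offset}]=<pad>"] = 1
    return feats


tokens = "He left because it was late .".split()
features = [boundary_features(tokens, i) for i in range(len(tokens))]
```

Each dictionary in `features` describes one token position. Under the same assumptions, these dictionaries could be converted to a matrix with scikit-learn's `DictVectorizer` and passed, together with per-token boundary labels, to a `RandomForestClassifier`.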
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning
