Thai Rhetorical Structure Analysis
Somnuk Sinthupoun, Ohm Sornil

TL;DR
This paper presents an approach to Thai rhetorical structure analysis that segments text into elementary discourse units (EDUs), clusters them into a rhetorical structure tree, and classifies discourse relations with a decision tree, supporting tasks such as text understanding, summarization, and question answering.
Contribution
It introduces a method tailored to the Thai language that combines HMM-based EDU segmentation, clustering driven by a similarity measure derived from Thai semantic rules, and decision-tree classification of discourse relations.
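To make the segmentation step concrete, the following is a minimal sketch of Viterbi decoding over a toy HMM that labels each token as beginning an EDU ("B") or continuing one ("I"). Everything in it is an assumption for illustration: the paper uses two HMMs derived from Thai syntactic rules, whereas this sketch uses a single model, and the state names, coarse token classes (CONJ, NOUN, VERB), and probabilities are hypothetical rather than taken from the paper.

```python
import math

# Hypothetical toy model, not the paper's: "B" begins an EDU, "I" continues it.
STATES = ("B", "I")
START = {"B": 0.9, "I": 0.1}
TRANS = {"B": {"B": 0.1, "I": 0.9}, "I": {"B": 0.3, "I": 0.7}}
# Observations are coarse, made-up token classes, not real Thai tags.
EMIT = {
    "B": {"CONJ": 0.60, "NOUN": 0.30, "VERB": 0.10},
    "I": {"CONJ": 0.05, "NOUN": 0.45, "VERB": 0.50},
}


def viterbi(obs):
    """Return the most probable state sequence for a non-empty tag sequence."""
    # Work in log space to avoid underflow on long inputs.
    score = [{s: math.log(START[s]) + math.log(EMIT[s][obs[0]]) for s in STATES}]
    back = []
    for o in obs[1:]:
        prev, cur, ptr = score[-1], {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            cur[s] = prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][o])
            ptr[s] = best
        score.append(cur)
        back.append(ptr)
    # Trace the best path backwards from the best final state.
    path = [max(STATES, key=lambda s: score[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


def segment(tokens, tags):
    """Group tokens into EDUs, opening a new EDU at every "B" label."""
    edus = []
    for tok, lab in zip(tokens, viterbi(tags)):
        if lab == "B" or not edus:
            edus.append([tok])
        else:
            edus[-1].append(tok)
    return edus
```

With these toy parameters, `segment(["w1", "w2", "w3", "w4", "w5"], ["NOUN", "VERB", "CONJ", "NOUN", "VERB"])` splits the input at the conjunction, giving two EDUs: `[["w1", "w2"], ["w3", "w4", "w5"]]`.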
Findings
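Effective segmentation of Thai EDUs using two HMMs derived from syntactic rules
Successful construction of rhetorical structure trees by clustering with a similarity measure derived from Thai semantic rules
Accurate classification of discourse relations with a decision tree whose features are derived from the semantic rules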
Abstract
Rhetorical structure analysis (RSA) explores discourse relations among elementary discourse units (EDUs) in a text. It is useful in many text processing tasks that rely on relationships among EDUs, such as text understanding, summarization, and question answering. The Thai language, with its distinctive linguistic characteristics, requires a unique technique. This article proposes an approach for Thai rhetorical structure analysis. First, EDUs are segmented by two hidden Markov models derived from syntactic rules. Next, a rhetorical structure tree is constructed by a clustering technique whose similarity measure is derived from Thai semantic rules. Finally, a decision tree whose features are derived from the semantic rules is used to determine discourse relations.
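The tree-construction step can be pictured with a small sketch of bottom-up clustering over EDUs. This is an illustration under stated assumptions, not the paper's method: the similarity here is plain Jaccard word overlap standing in for the measure derived from Thai semantic rules, merging is restricted to adjacent units, and the names `jaccard` and `build_tree` are hypothetical.

```python
def jaccard(a, b):
    """Word-overlap similarity; a stand-in for the paper's semantic measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_tree(edus):
    """Greedily merge the most similar adjacent units into a binary tree.

    Leaves are EDU indices; internal nodes are (left, right) pairs.
    """
    if not edus:
        raise ValueError("need at least one EDU")
    # Each entry pairs a subtree with the set of words it covers.
    nodes = [(i, set(edu)) for i, edu in enumerate(edus)]
    while len(nodes) > 1:
        # max() returns the first maximum, so ties go to the leftmost pair.
        best = max(
            range(len(nodes) - 1),
            key=lambda i: jaccard(nodes[i][1], nodes[i + 1][1]),
        )
        (left, left_words), (right, right_words) = nodes[best], nodes[best + 1]
        nodes[best:best + 2] = [((left, right), left_words | right_words)]
    return nodes[0][0]
```

For four toy EDUs such as `[["rain", "fell"], ["rain", "stopped"], ["we", "walked"], ["we", "walked", "home"]]`, the last two units share the most words and are merged first, then the first two, yielding the tree `((0, 1), (2, 3))`. A decision tree over features of each merged pair would then label the relation at each internal node.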
