Bunsetsu Identification Using Category-Exclusive Rules

Masaki Murata; Kiyotaka Uchimoto; Qing Ma; Hitoshi Isahara

arXiv:cs/0008031 · cs.CL · May 23, 2007

Open Access

TL;DR

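This paper introduces two new supervised learning methods for bunsetsu identification in Japanese. In experiments against four existing machine-learning methods (decision tree, maximum entropy, example-based approach, and decision list), the new method that uses the category-exclusive rules with the highest similarity performed best.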

Contribution

The paper presents two new bunsetsu identification methods based on category-exclusive rules and compares them experimentally with four existing machine-learning methods.

Findings

01 Category-exclusive rule methods outperform existing models

02 The category-exclusive rules with the highest similarity yield the best performance

03 Experimental results favor the new rule-based approach

Abstract

This paper describes two new bunsetsu identification methods using supervised learning. Since Japanese syntactic analysis is usually done after bunsetsu identification, bunsetsu identification is important for analyzing Japanese sentences. In experiments comparing the four previously available machine-learning methods (decision tree, maximum-entropy method, example-based approach and decision list) and two new methods using category-exclusive rules, the new method using the category-exclusive rules with the highest similarity performed best.

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Handwritten Text Recognition Techniques