Tagging and Morphological Disambiguation of Turkish Text

Kemal Oflazer (Bilkent University, Ankara, Turkey); Ilker Kuruoz (Bilkent University, Ankara, Turkey)

arXiv:cmp-lg/9407026 · cmp-lg · February 3, 2008 · 6 cites

Open Access

TL;DR

This paper presents a Turkish POS tagger built on a full-scale two-level morphological specification. It reports 98-99% tagging accuracy, and parsing of text disambiguated by the tagger shows 50% less ambiguity and runs 2.5 times faster. The approach may also apply to other languages.

Contribution

It introduces a Turkish POS tagger that combines a full-scale two-level morphological specification with a morphological disambiguator, improving tagging accuracy and parsing efficiency.

Findings

01 Achieves 98-99% tagging accuracy

02 Reduces parsing ambiguity by 50%

03 Speeds up parsing by 2.5 times

Abstract

Automatic text tagging is an important component in higher-level analysis of text corpora, and its output can be used in many natural language processing applications. In languages with agglutinative morphology, such as Turkish or Finnish, morphological disambiguation is a crucial step in tagging, because the structures of many lexical forms are morphologically ambiguous. This paper describes a POS tagger for Turkish text built on a full-scale two-level specification of Turkish morphology with a lexicon of about 24,000 root words. This is augmented with a recognizer for multi-word and idiomatic constructs and, most importantly, a morphological disambiguator based on local neighborhood constraints, heuristics, and a limited amount of statistical information. The tagger also has functionality for statistics compilation and fine-tuning of the morphological analyzer, such as logging…
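To make the idea of disambiguation by local neighborhood constraints concrete, here is a minimal Python sketch. It is not the paper's rule formalism: the `Analysis` and `Rule` types, the coarse tags, and the single rule are all hypothetical, chosen only for illustration. The example uses the Turkish form "yüz", which can be read as a noun ("face"), a numeral ("hundred"), or a verb ("swim"), followed by the noun "kişi" ("person").

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Analysis:
    """One candidate morphological reading of a token (hypothetical type)."""
    root: str
    tag: str  # coarse POS tag such as "NOUN", "NUM", "VERB"


@dataclass(frozen=True)
class Rule:
    """Hypothetical constraint: if the neighbors match, keep only `keep` readings."""
    left: Optional[str]   # required unambiguous tag of the previous token, or None
    right: Optional[str]  # required unambiguous tag of the next token, or None
    keep: str             # tag of the readings to retain


def unambiguous_tag(candidates: List[Analysis]) -> Optional[str]:
    """Return the tag if all candidates share one tag, else None."""
    tags = {a.tag for a in candidates}
    return next(iter(tags)) if len(tags) == 1 else None


def apply_rules(sentence: List[List[Analysis]], rules: List[Rule]) -> List[List[Analysis]]:
    """Repeatedly prune readings until no rule removes anything further."""
    result = [list(cands) for cands in sentence]  # leave the input untouched
    changed = True
    while changed:
        changed = False
        for i, cands in enumerate(result):
            if len(cands) < 2:
                continue  # nothing to disambiguate
            left = unambiguous_tag(result[i - 1]) if i > 0 else None
            right = unambiguous_tag(result[i + 1]) if i + 1 < len(result) else None
            for rule in rules:
                if rule.left is not None and rule.left != left:
                    continue
                if rule.right is not None and rule.right != right:
                    continue
                kept = [a for a in cands if a.tag == rule.keep]
                # Only act if the rule removes something but never everything.
                if kept and len(kept) < len(cands):
                    result[i] = kept
                    changed = True
                    break
    return result


# Toy input: "yüz kişi" ("a hundred people"); the analyses are hand-written here,
# whereas a real system would obtain them from a morphological analyzer.
sentence = [
    [Analysis("yüz", "NOUN"), Analysis("yüz", "NUM"), Analysis("yüz", "VERB")],
    [Analysis("kişi", "NOUN")],
]
# Illustrative rule (an assumption): before an unambiguous noun, prefer the numeral reading.
rules = [Rule(left=None, right="NOUN", keep="NUM")]

resolved = apply_rules(sentence, rules)
print([a.tag for a in resolved[0]])  # ['NUM']
```

The sketch only prunes readings when a rule leaves at least one candidate, so a badly written rule cannot delete every analysis of a token. The heuristics and statistical information mentioned in the abstract are not modeled here.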

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text and Document Classification Technologies