Part of Speech Tagging of Marathi Text Using Trigram Method

Jyoti Singh; Nisheeth Joshi; Iti Mathur

arXiv:1307.4299·cs.CL·July 17, 2013

TL;DR

This paper presents a statistical, trigram-based part-of-speech (POS) tagger for Marathi, a morphologically rich language, and describes both its development and its evaluation.

Contribution

It introduces the first Marathi POS tagger using trigram methodology, tailored for the language's morphological complexity.

Findings

01. The tagger predicts each token's POS tag based on the previous two tags.

02. Evaluation results show promising accuracy in Marathi POS tagging.

03. The trigram approach outperforms simpler models in this context.
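
The prediction described in the findings above is commonly written as follows. This is the standard second-order (trigram) hidden Markov formulation, given here only as a sketch: this page does not state the paper's exact equations, and the symbols $w_i$ (the $i$-th token), $t_i$ (its tag), and $C(\cdot)$ (a count over the tagged training corpus) are notation assumed for illustration.

```latex
\hat{t}_{1:n}
  = \operatorname*{arg\,max}_{t_1,\dots,t_n}
    \prod_{i=1}^{n} P(t_i \mid t_{i-2}, t_{i-1}) \, P(w_i \mid t_i),
\qquad
P(t_i \mid t_{i-2}, t_{i-1}) \approx
  \frac{C(t_{i-2}, t_{i-1}, t_i)}{C(t_{i-2}, t_{i-1})}
```

In practice the raw count ratio is usually smoothed so that unseen tag trigrams do not receive zero probability.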

Abstract

In this paper we present a part-of-speech (POS) tagger for Marathi, a morphologically rich language spoken by the native people of Maharashtra. The tagger follows a statistical approach based on the trigram method. The main idea of the trigram method is to find the most likely POS tag for a token given the previous two tags, calculating probabilities to determine the best tag sequence. We describe the development of the tagger and report its evaluation.
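
To make the trigram idea in the abstract concrete, here is a minimal Python sketch of a count-based trigram tagger with Viterbi decoding. It is not the authors' implementation: the function names (`train`, `tag`), the add-one smoothing, the tag labels, and the tiny transliterated toy corpus are all assumptions made purely for illustration.

```python
import math
from collections import defaultdict

START = "<s>"  # hypothetical padding tag for the two positions before a sentence


def train(tagged_sentences):
    """Collect tag-trigram, tag-bigram (context), and word/tag counts."""
    tri = defaultdict(int)        # (t_{i-2}, t_{i-1}, t_i) -> count
    ctx = defaultdict(int)        # (t_{i-2}, t_{i-1}) -> count
    emit = defaultdict(int)       # (tag, word) -> count
    tag_count = defaultdict(int)  # tag -> count
    vocab = set()
    for sent in tagged_sentences:
        prev2, prev1 = START, START
        for word, t in sent:
            tri[(prev2, prev1, t)] += 1
            ctx[(prev2, prev1)] += 1
            emit[(t, word)] += 1
            tag_count[t] += 1
            vocab.add(word)
            prev2, prev1 = prev1, t
    return {"tri": tri, "ctx": ctx, "emit": emit,
            "tag_count": tag_count, "tags": sorted(tag_count), "vocab": vocab}


def log_trans(m, t2, t1, t):
    """log P(t | t2, t1) with add-one smoothing over the tag set (an assumption)."""
    num = m["tri"].get((t2, t1, t), 0) + 1
    den = m["ctx"].get((t2, t1), 0) + len(m["tags"])
    return math.log(num / den)


def log_emit(m, t, word):
    """log P(word | t) with add-one smoothing and one extra slot for unseen words."""
    num = m["emit"].get((t, word), 0) + 1
    den = m["tag_count"][t] + len(m["vocab"]) + 1
    return math.log(num / den)


def tag(m, words):
    """Viterbi search where each state is the pair (previous tag, current tag)."""
    if not words:
        return []
    best = {(START, START): (0.0, [])}  # state -> (log score, best tag path)
    for word in words:
        nxt = {}
        for (t2, t1), (score, path) in best.items():
            for t in m["tags"]:
                s = score + log_trans(m, t2, t1, t) + log_emit(m, t, word)
                if (t1, t) not in nxt or s > nxt[(t1, t)][0]:
                    nxt[(t1, t)] = (s, path + [t])
        best = nxt
    return max(best.values(), key=lambda item: item[0])[1]


# Toy, hand-made corpus (transliterated; tags are illustrative, not the paper's tagset).
toy_corpus = [
    [("mi", "PRP"), ("ghari", "NN"), ("jato", "VB")],
    [("to", "PRP"), ("shalet", "NN"), ("jato", "VB")],
    [("mi", "PRP"), ("pustak", "NN"), ("vachato", "VB")],
]
model = train(toy_corpus)
print(tag(model, ["to", "ghari", "jato"]))
```

Storing the whole path in each Viterbi state keeps the sketch short; a tagger meant for real corpora would more likely keep backpointers and use a stronger smoothing scheme than add-one.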
