Context Aware Lemmatization and Morphological Tagging Method in Turkish
Cagri Sayallar

TL;DR
This paper introduces a Turkish lemmatization and morphological tagging model that takes word meaning into account. It combines a bidirectional LSTM with the Turkish BERT model and reports better results than those of SIGMORPHON 2019.
Contribution
Presents what the author describes as the first Turkish lemmatization study that is sensitive to word meaning, together with a morphological tagging model; both models use a bidirectional LSTM and Turkish BERT.
Findings
The proposed models outperform the SIGMORPHON 2019 results.
Incorporating meaning improves lemmatization accuracy.
Bidirectional LSTM and Turkish BERT effectively model Turkish morphology.
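The findings above are stated in terms of lemmatization accuracy. As a rough illustration of what such a score measures, the sketch below computes exact-match lemma accuracy and the mean Levenshtein distance between gold and predicted lemmas. This page does not say which metrics or data the paper uses, so the function names (`levenshtein`, `lemma_scores`) and the four toy lemmas are assumptions made only for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def lemma_scores(gold, predicted):
    """Return (exact-match accuracy, mean edit distance) over aligned lemmas."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and aligned")
    n = len(gold)
    exact = sum(g == p for g, p in zip(gold, predicted))
    distance = sum(levenshtein(g, p) for g, p in zip(gold, predicted))
    return exact / n, distance / n


# Toy data (hypothetical, not from the paper): the last prediction is wrong.
gold = ["koyun", "gel", "ev", "koy"]
pred = ["koyun", "gel", "ev", "koyun"]
accuracy, mean_distance = lemma_scores(gold, pred)
print(accuracy, mean_distance)
```

The edit-distance score gives partial credit to near misses, whereas exact-match accuracy counts any wrong lemma as a full error.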
Abstract
The smallest part of a word that defines the word is called a word root. Word roots are used to increase success in many applications since they simplify the word. In this study, the lemmatization model, which is a word root finding method, and the morphological tagging model, which predicts the grammatical knowledge of the word, are presented. The presented model was developed for Turkish, and both models make predictions by taking the meaning of the word into account. In the literature, there is no lemmatization study that is sensitive to word meaning in Turkish. For this reason, the present study shares the model and the results obtained from the model on Turkish lemmatization for the first time in the literature. In the present study, in the lemmatization and morphological tagging models, bidirectional LSTM is used for the spelling of words, and the Turkish BERT model is used for…
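To see why sensitivity to word meaning matters for lemmatization, consider a lemmatizer that ignores context: it must map each surface form to a single lemma, so it cannot be right on every occurrence of an ambiguous form. The sketch below computes that ceiling for a tiny hand-made sample. The data is a toy assumption rather than anything from the paper; it relies on the Turkish form "koyun", which can be the noun "koyun" (sheep) or an inflected form of "koy". The helper name `context_free_upper_bound` is hypothetical.

```python
from collections import Counter, defaultdict


def context_free_upper_bound(pairs):
    """Best accuracy reachable when each surface form gets exactly one lemma.

    pairs: iterable of (surface_form, gold_lemma) token annotations.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("pairs must not be empty")
    lemma_counts = defaultdict(Counter)
    for surface, lemma in pairs:
        lemma_counts[surface][lemma] += 1
    # For each surface form, the best single choice is its most frequent lemma.
    best_hits = sum(max(counts.values()) for counts in lemma_counts.values())
    return best_hits / len(pairs)


# Toy annotations (hypothetical): "koyun" appears once with each lemma.
toy_pairs = [
    ("koyun", "koyun"),  # sheep
    ("koyun", "koy"),    # inflected form of "koy"
    ("geldi", "gel"),
    ("evler", "ev"),
]
bound = context_free_upper_bound(toy_pairs)
print(bound)
```

Any model scoring above this bound on such data must be using information beyond the spelling of the word itself, which is the role the abstract assigns to word meaning.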
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · Robotics and Automated Systems
Methods: Attention Is All You Need · Tanh Activation · Layer Normalization · Sigmoid Activation · Dense Connections · Attention Dropout · WordPiece · Dropout · Linear Layer
