Under-resourced studies of under-resourced languages: lemmatization and POS-tagging with LLM annotators for historical Armenian, Georgian, Greek and Syriac

Chahan Vidal-Gorène (CJM; LIPN); Bastien Kindt (UCL); Florian Cafiero (PSL; CJM)

arXiv:2602.15753·cs.CL·February 18, 2026

TL;DR

This study evaluates large language models, including GPT-4 variants and open-weight Mistral models, on lemmatization and POS-tagging for four under-resourced historical languages (Ancient Greek, Classical Armenian, Old Georgian, and Syriac), and highlights their potential as annotation aids in low-data scenarios.

Contribution

The paper introduces a novel benchmark of aligned training and out-of-domain test corpora, and shows that LLMs can perform lemmatization and POS-tagging for low-resource historical languages without fine-tuning.

Findings

01 LLMs achieve competitive performance in POS-tagging and lemmatization.

02 Few-shot LLM performance surpasses traditional RNN baselines in most cases.

03 Challenges remain for complex morphology and non-Latin scripts.

Abstract

Low-resource languages pose persistent challenges for Natural Language Processing tasks such as lemmatization and part-of-speech (POS) tagging. This paper investigates the capacity of recent large language models (LLMs), including GPT-4 variants and open-weight Mistral models, to address these tasks in few-shot and zero-shot settings for four historically and linguistically diverse under-resourced languages: Ancient Greek, Classical Armenian, Old Georgian, and Syriac. Using a novel benchmark comprising aligned training and out-of-domain test corpora, we evaluate the performance of foundation models across lemmatization and POS-tagging, and compare them with PIE, a task-specific RNN baseline. Our results demonstrate that LLMs, even without fine-tuning, achieve competitive or superior performance in POS-tagging and lemmatization across most languages in few-shot settings. Significant…
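The few-shot annotation setup described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the prompt layout, the tab-separated output format, the UD-style tags, the helper names (`build_few_shot_prompt`, `parse_response`, `accuracy`), and the use of token-level accuracy as the score. None of it is taken from the paper, and no model is called; the response string stands in for whatever an LLM would return.

```python
from typing import List, Tuple

# Hypothetical annotation triple: (token, lemma, POS tag). UD-style tags are an assumption.
Annotation = Tuple[str, str, str]


def build_few_shot_prompt(examples: List[Annotation], tokens: List[str]) -> str:
    """Format annotated examples followed by the tokens to annotate (assumed layout)."""
    lines = ["Annotate each token as: token<TAB>lemma<TAB>POS", "", "Examples:"]
    lines += ["\t".join(example) for example in examples]
    lines += ["", "Tokens to annotate:"]
    lines += tokens
    return "\n".join(lines)


def parse_response(text: str) -> List[Annotation]:
    """Parse tab-separated model output, skipping lines without exactly three fields."""
    parsed: List[Annotation] = []
    for line in text.strip().splitlines():
        parts = line.split("\t")
        if len(parts) == 3:
            parsed.append((parts[0], parts[1], parts[2]))
    return parsed


def accuracy(gold: List[Annotation], pred: List[Annotation], field: int) -> float:
    """Token-level accuracy on one field (1 = lemma, 2 = POS), aligned by position.

    Gold tokens with no prediction at the same position count as errors.
    """
    if not gold:
        return 0.0
    correct = sum(
        1
        for i, gold_item in enumerate(gold)
        if i < len(pred) and pred[i][field] == gold_item[field]
    )
    return correct / len(gold)


# Two annotated Ancient Greek tokens serve as the few-shot examples.
examples = [("ἐν", "ἐν", "ADP"), ("ἀρχῇ", "ἀρχή", "NOUN")]
gold = [("ἦν", "εἰμί", "VERB"), ("ὁ", "ὁ", "DET"), ("λόγος", "λόγος", "NOUN")]

prompt = build_few_shot_prompt(examples, [token for token, _, _ in gold])

# Stand-in for a model reply, with one deliberate POS error on the article.
response = "ἦν\tεἰμί\tVERB\nὁ\tὁ\tPRON\nλόγος\tλόγος\tNOUN"
pred = parse_response(response)

lemma_acc = accuracy(gold, pred, field=1)
pos_acc = accuracy(gold, pred, field=2)
print(f"lemma accuracy: {lemma_acc:.2f}, POS accuracy: {pos_acc:.2f}")
```

Scoring by position keeps the sketch short, but it assumes the model returns one line per input token in the original order; a real evaluation would need to handle dropped or re-tokenized output before comparing against gold annotations.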