LLM Probe: Evaluating LLMs for Low-Resource Languages

Hailay Kidu Teklehaymanot; Gebrearegawi Gebremariam; Wolfgang Nejdl

arXiv:2603.29517 · cs.CL · April 1, 2026

TL;DR

This paper introduces LLM Probe, a systematic evaluation framework for assessing the linguistic capabilities of large language models in low-resource languages, using a new annotated benchmark dataset.

Contribution

The paper presents a novel lexicon-based assessment framework and a benchmark dataset for evaluating LLMs in low-resource, morphologically rich languages.

Findings

01

Sequence-to-sequence models outperform causal models on morphosyntactic tasks and translation.

02

Causal models excel at lexical alignment but are weaker at translation.

03

High inter-annotator agreement validates the reliability of the dataset.
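
As a rough illustration of how agreement between two annotators can be quantified, the sketch below computes Cohen's kappa over part-of-speech labels. The summary above does not say which agreement statistic the authors report, so the choice of kappa, the function name `cohen_kappa`, and the toy labels are all assumptions made for illustration only.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: the paper may use a different agreement statistic;
    kappa is shown here only as a common choice.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    all_labels = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[lab] * freq_b[lab] for lab in all_labels) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label everywhere.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy labels (hypothetical, not taken from the benchmark).
annotator_a = ["NOUN", "VERB", "NOUN", "ADJ", "NOUN", "VERB"]
annotator_b = ["NOUN", "VERB", "NOUN", "NOUN", "NOUN", "VERB"]
kappa = cohen_kappa(annotator_a, annotator_b)
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when one tag (here `NOUN`) dominates the label distribution.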

Abstract

Despite rapid advances in large language models (LLMs), their linguistic abilities in low-resource and morphologically rich languages are still not well understood due to limited annotated resources and the absence of standardized evaluation frameworks. This paper presents LLM Probe, a lexicon-based assessment framework designed to systematically evaluate the linguistic skills of LLMs in low-resource language environments. The framework analyzes models across four areas of language understanding: lexical alignment, part-of-speech recognition, morphosyntactic probing, and translation accuracy. To illustrate the framework, we create a manually annotated benchmark dataset using a low-resource Semitic language as a case study. The dataset comprises bilingual lexicons with linguistic annotations, including part-of-speech tags, grammatical gender, and morphosyntactic features, which…
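
A lexicon-based probe of this kind can be sketched as exact-match scoring of model answers against annotated lexicon entries. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `LexiconEntry` fields, the `score_probes` function, the `predict(word, field)` callable, and the placeholder words `w1` and `w2` are hypothetical. It covers three annotation types named in the abstract (gloss for lexical alignment, part-of-speech tag, grammatical gender) and leaves sentence-level translation scoring out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

FIELDS = ("gloss", "pos", "gender")  # hypothetical annotation fields


@dataclass(frozen=True)
class LexiconEntry:
    word: str    # source-language word (placeholder tokens below)
    gloss: str   # English equivalent, used for lexical alignment
    pos: str     # part-of-speech tag
    gender: str  # grammatical gender


def score_probes(
    entries: List[LexiconEntry],
    predict: Callable[[str, str], str],
) -> Dict[str, float]:
    """Return exact-match accuracy per annotation field.

    `predict(word, field)` stands in for a call to the model under
    test; how prompts are built is outside this sketch.
    """
    if not entries:
        raise ValueError("entries must not be empty")
    correct = {field: 0 for field in FIELDS}
    for entry in entries:
        for field in FIELDS:
            gold = getattr(entry, field).strip().lower()
            guess = predict(entry.word, field).strip().lower()
            if guess == gold:
                correct[field] += 1
    return {field: correct[field] / len(entries) for field in FIELDS}


# Placeholder lexicon and a stub "model" backed by a lookup table.
lexicon = [
    LexiconEntry("w1", "house", "NOUN", "masc"),
    LexiconEntry("w2", "sun", "NOUN", "fem"),
]
stub_answers = {
    ("w1", "gloss"): "house", ("w1", "pos"): "NOUN", ("w1", "gender"): "fem",
    ("w2", "gloss"): "moon", ("w2", "pos"): "noun", ("w2", "gender"): "fem",
}
scores = score_probes(lexicon, lambda word, field: stub_answers[(word, field)])
```

Exact match after case folding is the simplest scoring rule; a real evaluation would likely need answer normalization or a more tolerant metric for free-form model output.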
