Maistros: A Greek Large Language Model Adapted Through Knowledge Distillation From Large Reasoning Models

Nikolaos Giarelis; Charalampos Mastrokostas; Nikos Karacapilidis

arXiv:2605.01870·cs.CL·May 5, 2026

2 Models, 1 Dataset

TL;DR

Maistros is a Greek large language model adapted through knowledge distillation from large reasoning models, introduced alongside a new dataset and an evaluation framework for Greek NLP tasks.

Contribution

The paper introduces CulturaQA, a Greek QA dataset, and Maistros 8B, a new Greek LLM created via knowledge distillation, addressing resource limitations.

Findings

01. Maistros 8B outperforms existing Greek LLMs on QA tasks.
02. The evaluation framework is adaptable to various languages and tasks.
03. CulturaQA enhances Greek NLP model training and assessment.

Abstract

Large Language Models (LLMs) have substantially advanced the field of Natural Language Processing (NLP), achieving state-of-the-art performance across a wide range of tasks. These improvements have been attributed, in part, to their emerging reasoning capabilities, which are enabled by large-scale training and increased model capacity. However, existing LLMs can generate erroneous responses when addressing complex queries that fall outside their training distribution, due to limited internal knowledge or the need for multi-step reasoning. To address these limitations, recent work has introduced large reasoning models (LRMs), which incorporate explicit internal reasoning processes to improve response accuracy. Yet state-of-the-art LRMs often comprise hundreds of billions of parameters and require several seconds per inference, even on advanced multi-GPU systems. These…

Code & Models

Models

Datasets

IMISLab/CulturaQA
dataset · 178 downloads