Maistros: A Greek Large Language Model Adapted Through Knowledge Distillation From Large Reasoning Models
Nikolaos Giarelis, Charalampos Mastrokostas, Nikos Karacapilidis

TL;DR
Maistros is a Greek large language model built through knowledge distillation from large reasoning models. The paper also introduces a new Greek QA dataset and an evaluation framework for Greek NLP tasks.
Contribution
The paper introduces CulturaQA, a Greek QA dataset, and Maistros 8B, a new Greek LLM created via knowledge distillation, addressing resource limitations.
Findings
Maistros 8B outperforms existing Greek LLMs on QA tasks.
The evaluation framework is adaptable to various languages and tasks.
CulturaQA enhances Greek NLP model training and assessment.
Abstract
Large Language Models (LLMs) have substantially advanced the field of Natural Language Processing (NLP), achieving state-of-the-art performance across a wide range of tasks. These improvements have been attributed, in part, to their emerging reasoning capabilities, which are enabled by large-scale training and increased model capacity. However, existing LLMs can generate erroneous responses when addressing complex queries that fall outside their training distribution, due to limited internal knowledge or the need for multi-step reasoning. To address these limitations, recent work has introduced large reasoning models (LRMs), which incorporate explicit internal reasoning processes to improve response accuracy. However, state-of-the-art LRMs often comprise hundreds of billions of parameters and require several seconds per inference, even on advanced multi-GPU systems. These…
