There Are No Silly Questions: Evaluation of Offline LLM Capabilities from a Turkish Perspective
Edibe Yilmaz, Kahraman Kostas

TL;DR
This paper evaluates the robustness and safety of offline large language models in Turkish heritage language education, showing that both anomaly resistance and sycophancy bias bear on pedagogical safety.
Contribution
It introduces the Turkish Anomaly Suite (TAS) for systematic evaluation of offline LLMs in Turkish education contexts, revealing insights into model robustness and biases.
Findings
Anomaly resistance is not solely dependent on model size.
Sycophancy bias can threaten pedagogical safety.
8B–14B parameter models offer the best cost-safety balance.
Abstract
The integration of large language models (LLMs) into educational processes introduces significant constraints regarding data privacy and reliability, particularly in pedagogically vulnerable contexts such as Turkish heritage language education. This study aims to systematically evaluate the robustness and pedagogical safety of locally deployable offline LLMs within the context of Turkish heritage language education. To this end, a Turkish Anomaly Suite (TAS) consisting of 10 original edge-case scenarios was developed to assess the models' capacities for epistemic resistance, logical consistency, and pedagogical safety. Experiments conducted on 14 different models ranging from 270M to 32B parameters reveal that anomaly resistance is not solely dependent on model scale and that sycophancy bias can pose pedagogical risks even in large-scale models. The findings indicate that…
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Educational Assessment and Improvement · Text Readability and Simplification
