Polishing Every Facet of the GEM: Testing Linguistic Competence of LLMs and Humans in Korean

SungHo Kim; Nayeon Kim; Taehee Jeon; SangKeun Lee

arXiv:2506.01237·cs.CL·June 3, 2025

TL;DR

This paper introduces KoGEM, a Korean grammar benchmark for evaluating both LLMs and humans. Zero-shot results show that LLMs are strong on tasks that need mainly definitional knowledge and weak on tasks that require integrating real-world experiential knowledge.

Contribution

The paper presents KoGEM, a new benchmark of Korean linguistic competence, and reports a zero-shot evaluation of 27 LLMs of various sizes and types, highlighting where their linguistic competence falls short and where it could improve.

Findings

01 LLMs perform well on definitional tasks

02 LLMs struggle with phonological and pronunciation tasks

03 Incorporating experiential knowledge could improve LLMs

Abstract

We introduce the Korean Grammar Evaluation BenchMark (KoGEM), designed to assess the linguistic competence of LLMs and humans in Korean. KoGEM consists of 1.5k multiple-choice QA pairs covering five main categories and 16 subcategories. The zero-shot evaluation of 27 LLMs of various sizes and types reveals that while LLMs perform remarkably well on straightforward tasks requiring primarily definitional knowledge, they struggle with tasks that demand the integration of real-world experiential knowledge, such as phonological rules and pronunciation. Furthermore, our in-depth analysis suggests that incorporating such experiential knowledge could enhance the linguistic competence of LLMs. With KoGEM, we not only highlight the limitations of current LLMs in linguistic competence but also uncover hidden facets of LLMs in linguistic competence,…
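To make the evaluation setup in the abstract concrete, here is a minimal sketch of zero-shot multiple-choice scoring with per-category accuracy. It is not the authors' code. The item fields (`category`, `question`, `choices`, `answer`), the prompt wording, the four-option toy items, the category labels, and the `predict` callable are all assumptions made for illustration; a real run would replace `predict` with a call to an actual model.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def build_prompt(item: dict) -> str:
    """Format one item as a zero-shot prompt: the question and its options, no solved examples."""
    options = "\n".join(
        f"{i}. {choice}" for i, choice in enumerate(item["choices"], start=1)
    )
    return f"{item['question']}\n{options}\nAnswer with the option number only."


def accuracy_by_category(
    items: List[dict], predict: Callable[[str], int]
) -> Dict[str, float]:
    """Return the fraction of correctly answered items for each category."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for item in items:
        total[item["category"]] += 1
        if predict(build_prompt(item)) == item["answer"]:
            correct[item["category"]] += 1
    return {category: correct[category] / total[category] for category in total}


# Hypothetical toy items; the schema and labels are assumptions, not the KoGEM format.
toy_items = [
    {"category": "phonology", "question": "Q1", "choices": ["a", "b", "c", "d"], "answer": 2},
    {"category": "phonology", "question": "Q2", "choices": ["a", "b", "c", "d"], "answer": 1},
    {"category": "other", "question": "Q3", "choices": ["a", "b", "c", "d"], "answer": 1},
]


def always_first(prompt: str) -> int:
    """Stand-in for a model: always picks option 1."""
    return 1


scores = accuracy_by_category(toy_items, always_first)
```

Reporting accuracy per category rather than as one pooled number is what lets a gap such as the one between definitional and phonological tasks show up at all.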

Code & Models

Repositories

sungho3268/kogem
PyTorch · Official

Taxonomy

Topics: Border Security and International Relations · Technology and Data Analysis · Educational Reforms and Innovations