Polishing Every Facet of the GEM: Testing Linguistic Competence of LLMs and Humans in Korean
SungHo Kim, Nayeon Kim, Taehee Jeon, SangKeun Lee

TL;DR
This paper introduces KoGEM, a comprehensive Korean language benchmark for evaluating LLMs and humans. It shows that LLMs are strong on definitional tasks but weak on tasks that require integrating experiential knowledge.
Contribution
The paper presents KoGEM, a new Korean linguistic benchmark, and provides an extensive evaluation of LLMs, highlighting their limitations and potential areas for improvement in linguistic competence.
Findings
LLMs perform well on definitional tasks
LLMs struggle with phonological and pronunciation tasks
Incorporating experiential knowledge could improve LLMs
Abstract
We introduce KoGEM, designed to assess the linguistic competence of LLMs and humans in Korean. KoGEM consists of 1.5k multiple-choice QA pairs covering five main categories and 16 subcategories. The zero-shot evaluation of 27 LLMs of various sizes and types reveals that while LLMs perform remarkably well on straightforward tasks requiring primarily definitional knowledge, they struggle with tasks that demand the integration of real-world experiential knowledge, such as phonological rules and pronunciation. Furthermore, our in-depth analysis suggests that incorporating such experiential knowledge could enhance the linguistic competence of LLMs. With KoGEM, we not only highlight the limitations of current LLMs in linguistic competence but also uncover hidden facets of LLMs in linguistic competence,…
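The evaluation described in the abstract (multiple-choice QA pairs grouped into categories, scored zero-shot) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the item fields (`category`, `question`, `choices`, `answer`), the placeholder category names, and the `predict` callable standing in for a model are all assumptions made for illustration.

```python
from collections import defaultdict


def accuracy_by_category(items, predict):
    """Return {category: accuracy} over multiple-choice items.

    `items` is a list of dicts with hypothetical keys:
    category, question, choices, answer (index of the gold choice).
    `predict(question, choices)` stands in for a zero-shot model call
    and must return the index of the chosen option.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for item in items:
        total[item["category"]] += 1
        if predict(item["question"], item["choices"]) == item["answer"]:
            correct[item["category"]] += 1
    return {cat: correct[cat] / total[cat] for cat in total}


# Toy data with placeholder category names (not the benchmark's real ones).
items = [
    {"category": "cat_a", "question": "q1", "choices": ["w", "x", "y", "z"], "answer": 0},
    {"category": "cat_a", "question": "q2", "choices": ["w", "x", "y", "z"], "answer": 2},
    {"category": "cat_b", "question": "q3", "choices": ["w", "x", "y", "z"], "answer": 0},
]


def always_first(question, choices):
    """Trivial baseline that always picks the first option."""
    return 0


scores = accuracy_by_category(items, always_first)
```

Reporting accuracy per category rather than as a single number is what makes a gap such as the one between definitional and pronunciation-related tasks visible.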
Taxonomy
Topics: Border Security and International Relations · Technology and Data Analysis · Educational Reforms and Innovations
