KULTURE Bench: A Benchmark for Assessing Language Model in Korean Cultural Context
Xiaonan Wang, Jinyoung Yeo, Joon-Ho Lim, Hansaem Kim

TL;DR
KULTURE Bench is a specialized evaluation framework for Korean cultural understanding in language models, addressing limitations of existing benchmarks by focusing on culturally relevant content like news, idioms, and poetry.
Contribution
It introduces a culturally specific benchmark for Korean, enabling more accurate assessment of models' cultural comprehension and reasoning capabilities.
Findings
Models show room for improvement in Korean cultural understanding.
The benchmark reveals gaps in models' grasp of Korean idioms and poetry.
Evaluating models trained on different language corpora highlights disparities in cultural knowledge.
Abstract
Large language models have improved significantly across a wide range of tasks. However, evaluating them becomes more complex as they generate more fluent and coherent content. Current multilingual benchmarks often rely on translated versions of English benchmarks, which may carry Western cultural biases and therefore fail to assess other languages and cultures accurately. To address this research gap, we introduce KULTURE Bench, an evaluation framework specifically designed for Korean culture that features datasets of cultural news, idioms, and poetry. It is designed to assess language models' cultural comprehension and reasoning capabilities at the word, sentence, and paragraph levels. Using KULTURE Bench, we assessed the capabilities of models trained with different language corpora and analyzed the results comprehensively. The results show that there is still…
Taxonomy
Topics: Educational Systems and Policies · Computational and Text Analysis Methods
