KoALa-Bench: Evaluating Large Audio Language Models on Korean Speech Understanding and Faithfulness

Jinyoung Kim; Hyeongsoo Lim; Eunseo Seo; Minho Jang; Keunwoo Choi; Seungyoun Shin; Ji Won Yoon

arXiv:2604.19782·cs.CL·April 23, 2026

TL;DR

KoALa-Bench is a comprehensive benchmark designed to evaluate Korean speech understanding and faithfulness of large audio language models, addressing the scarcity of non-English benchmarks.

Contribution

The paper introduces KoALa-Bench, a six-task benchmark that incorporates Korea-specific content to evaluate large audio language models (LALMs) on Korean speech understanding and speech faithfulness.

Findings

01. Extensive experiments were conducted on six models.

02. The benchmark, code, and leaderboard are publicly available.

03. The benchmark includes Korean cultural and academic content.

Abstract

Recent advances in large audio language models (LALMs) have enabled multilingual speech understanding. However, benchmarks for evaluating LALMs remain scarce for non-English languages, with Korean being one such underexplored case. In this paper, we introduce KoALa-Bench, a comprehensive benchmark for evaluating Korean speech understanding and speech faithfulness of LALMs. In particular, KoALa-Bench comprises six tasks. Four tasks evaluate fundamental speech understanding capabilities, including automatic speech recognition, speech translation, speech question answering, and speech instruction following, while the remaining two tasks evaluate speech faithfulness, motivated by our observation that several LALMs often fail to fully leverage the speech modality. Furthermore, to reflect Korea-specific knowledge, our benchmark incorporates listening questions from the Korean college…

Code & Models

Repositories

https://ksbench.github.io/Korean-Benchmark
github
