Optimizing Korean-Centric LLMs via Token Pruning

Hoyeol Kim; Hyeonwoo Kim

arXiv:2604.16235·cs.CL·April 20, 2026


TL;DR

This paper systematically benchmarks multilingual LLMs with token pruning for Korean NLP, showing improved stability, translation performance, and memory efficiency, with architecture-dependent effects on instruction following.

Contribution

It demonstrates that token pruning effectively optimizes multilingual LLMs for Korean tasks, enhancing stability and efficiency while maintaining performance.

Findings

01

Token pruning improves generation stability by reducing language confusion.

02

Machine translation performance on Korean tasks is often enhanced by token pruning.

03

Vocabulary reduction leads to significant memory savings with modest latency gains.
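To make the memory finding concrete, here is a back-of-the-envelope sketch of how embedding storage scales with vocabulary size. The vocabulary sizes, hidden dimension, and 2-byte parameter width below are illustrative assumptions, not figures reported in the paper, and `embedding_bytes` is a hypothetical helper.

```python
def embedding_bytes(vocab_size: int, hidden_dim: int,
                    bytes_per_param: int = 2, tied: bool = True) -> int:
    """Bytes used by the token embedding table(s).

    With tied weights the input embedding and output projection share one
    matrix; otherwise the model stores two matrices of the same shape.
    """
    matrices = 1 if tied else 2
    return vocab_size * hidden_dim * bytes_per_param * matrices


# Illustrative numbers only (assumed, not taken from the paper).
original = embedding_bytes(vocab_size=150_000, hidden_dim=1024)
pruned = embedding_bytes(vocab_size=40_000, hidden_dim=1024)
saved_mib = (original - pruned) / 2**20

print(f"original: {original} bytes, pruned: {pruned} bytes")
print(f"saved: {saved_mib:.2f} MiB")
```

One plausible reading of the "modest latency" part: the input embedding lookup is a gather whose cost does not depend on vocabulary size, so only the output projection and softmax get cheaper, while the transformer layers are unchanged.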

Abstract

This paper presents a systematic benchmark of state-of-the-art multilingual large language models (LLMs) adapted via token pruning, a compression technique that eliminates tokens and embedding parameters corresponding to languages irrelevant to the target application. Focusing on Korean-centric natural language processing (NLP) tasks, we evaluate architectures including Qwen3, Gemma-3, Llama-3, and Aya across three vocabulary configurations: Original, English-Korean (EnKo), and English-Korean-Chinese (EnKoZh). Performance is assessed using established benchmarks for general aptitude, cultural literacy, instruction following, and machine translation. Our findings indicate that token pruning significantly improves generation stability by eliminating language confusion and, in the case of machine translation, frequently enhances performance on Korean-specific tasks. While…
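As a rough illustration of the token pruning idea described in the abstract, the sketch below keeps only tokens written in ASCII or Hangul, remaps their IDs to a dense range, and drops the embedding rows of everything else. This is a minimal sketch under stated assumptions, not the authors' procedure: the script-based filter, the toy vocabulary, and the names `is_en_ko_token` and `prune_vocab` are all hypothetical, and real tokenizers are usually byte-level, so a practical pipeline would need to decode tokens before filtering.

```python
from typing import Dict, FrozenSet, List, Tuple

# Unicode blocks for Hangul Jamo, Hangul Compatibility Jamo, Hangul Syllables.
HANGUL_RANGES = ((0x1100, 0x11FF), (0x3130, 0x318F), (0xAC00, 0xD7A3))


def is_en_ko_token(token: str) -> bool:
    """True if every character is ASCII or Hangul (assumed filter rule)."""
    for ch in token:
        cp = ord(ch)
        if cp < 128:
            continue
        if any(lo <= cp <= hi for lo, hi in HANGUL_RANGES):
            continue
        return False
    return True


def prune_vocab(
    vocab: Dict[str, int],
    embeddings: List[List[float]],
    special_tokens: FrozenSet[str] = frozenset(),
) -> Tuple[Dict[str, int], List[List[float]], Dict[int, int]]:
    """Drop tokens outside the target scripts and their embedding rows."""
    # Keep surviving tokens in their original ID order.
    kept = sorted(
        (idx, tok) for tok, idx in vocab.items()
        if tok in special_tokens or is_en_ko_token(tok)
    )
    old_to_new = {old: new for new, (old, _) in enumerate(kept)}
    new_vocab = {tok: old_to_new[old] for old, tok in kept}
    new_embeddings = [embeddings[old] for old, _ in kept]
    return new_vocab, new_embeddings, old_to_new


# Toy vocabulary with one embedding row per token (values are placeholders).
vocab = {"<s>": 0, "hello": 1, "안녕": 2, "你好": 3, "bonjour": 4, "привет": 5}
embeddings = [[float(i), float(i)] for i in range(len(vocab))]

new_vocab, new_embeddings, old_to_new = prune_vocab(
    vocab, embeddings, special_tokens=frozenset({"<s>"})
)
print(new_vocab)
print(len(embeddings), "->", len(new_embeddings), "rows")
```

Note that `bonjour` survives because a script-based filter cannot tell English from other Latin-script languages; a frequency-based filter over a target-language corpus would be one alternative. The `old_to_new` map is what one would use to rewrite tokenizer merges and any tied output projection so that IDs stay consistent.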
