Do LLMs Need Inherent Reasoning Before Reinforcement Learning? A Study in Korean Self-Correction

Hongjin Kim; Jaewook Lee; Kiyoung Lee; Jong-hun Shin; Soojong Lim; Oh-Woog Kwon

arXiv:2601.05459·cs.CL·January 12, 2026

Open Access

TL;DR

This paper investigates whether reinforcement learning can improve Korean reasoning in LLMs. It finds that RL alone helps little when the model lacks inherent Korean reasoning capabilities, and that neuron-level tuning to align the model's internal reasoning with Korean inputs is crucial.

Contribution

The paper shows that aligning the model's internal reasoning processes with Korean inputs through neuron-specific tuning significantly boosts RL effectiveness for a low-resource language.

Findings

01 RL alone yields limited improvements in Korean reasoning when the model lacks inherent Korean reasoning capabilities

02 Tuning Korean-specific neurons in early layers enhances Korean reasoning abilities

03 A self-correction code-switching dataset facilitates internal reasoning alignment

Abstract

Large Language Models (LLMs) demonstrate strong reasoning and self-correction abilities in high-resource languages like English, but their performance remains limited in low-resource languages such as Korean. In this study, we investigate whether reinforcement learning (RL) can enhance Korean reasoning abilities to a degree comparable to English. Our findings reveal that RL alone yields limited improvements when applied to models lacking inherent Korean reasoning capabilities. To address this, we explore several fine-tuning strategies and show that aligning the model's internal reasoning processes with Korean inputs, particularly by tuning Korean-specific neurons in early layers, is key to unlocking RL's effectiveness. We introduce a self-correction code-switching dataset to facilitate this alignment and observe significant performance gains in both mathematical reasoning and…
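The abstract's idea of tuning only selected neurons can be pictured as a masked update, in which a chosen subset of rows in a weight matrix receives gradient steps while every other row stays frozen. The sketch below is a toy illustration under that assumption, not the authors' procedure. The function name `masked_sgd_step`, the row indices, the learning rate, and all numeric values are hypothetical.

```python
# Hypothetical sketch: update only selected neurons (rows) of one weight matrix.
# The neuron indices, learning rate, and gradients are made-up illustration
# values; they are not taken from the paper.

def masked_sgd_step(weights, grads, tunable_rows, lr=0.1):
    """Return new weights where only rows in `tunable_rows` get an SGD update."""
    tunable = set(tunable_rows)
    return [
        # Tunable rows take one plain SGD step; all other rows are copied unchanged.
        [w - lr * g for w, g in zip(w_row, g_row)] if i in tunable else list(w_row)
        for i, (w_row, g_row) in enumerate(zip(weights, grads))
    ]

# Three "neurons" with two incoming weights each; only neuron 1 is tunable.
weights = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
grads = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
updated = masked_sgd_step(weights, grads, tunable_rows=[1], lr=0.5)
```

In a real model the same effect is usually obtained by masking gradients before the optimizer step. How the Korean-specific neurons are identified is not described in the text above, so the selection is left as an input here.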

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications