Learning Dynamics in RL Post-Training for Language Models

Akiyoshi Tomihari

arXiv:2601.04670 · cs.LG · January 9, 2026

TL;DR

This paper analyzes the learning dynamics of reinforcement learning (RL) post-training for language models using an empirical neural tangent kernel (NTK) framework. It shows that limited variability in feature representations can cause RL updates to increase model confidence, which reduces output diversity, and it proposes a classifier-first training strategy.

Contribution

The paper introduces an NTK-based analysis of RL post-training dynamics that explains the increase in model confidence and the reduction in output diversity, and it proposes the classifier-first method CF-RL to improve training efficiency.

Findings

01. RL updates increase model confidence due to limited feature variability
02. CF-RL accelerates training and enhances model confidence
03. The mechanism of CF-RL differs from supervised linear probing
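
To make the classifier-first idea concrete, here is a minimal sketch on a toy contextual bandit. This page does not describe CF-RL beyond the name, so the sketch assumes that "classifier-first" means a stage that updates only the final linear layer with a policy-gradient signal, followed by a stage that updates the whole model. The two-layer policy, the exact-expectation gradient, and every name below (`pg_step`, `train_features`, `U`, `W`, `X`, `R`) are hypothetical and are not taken from the paper.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def forward(U, W, x):
    # Feature extractor phi = tanh(U x), then a linear classifier head W.
    phi = [math.tanh(h) for h in matvec(U, x)]
    return phi, softmax(matvec(W, phi))

def expected_reward(U, W, X, R):
    total = 0.0
    for x, r in zip(X, R):
        _, p = forward(U, W, x)
        total += sum(pk * rk for pk, rk in zip(p, r))
    return total / len(X)

def pg_step(U, W, X, R, lr, train_features):
    """One exact policy-gradient ascent step, applied in place.

    With train_features=False only the head W is updated (assumed stage 1).
    With train_features=True both U and W are updated (assumed stage 2).
    """
    grad_W = [[0.0] * len(W[0]) for _ in W]
    grad_U = [[0.0] * len(U[0]) for _ in U]
    for x, r in zip(X, R):
        phi, p = forward(U, W, x)
        value = sum(pk * rk for pk, rk in zip(p, r))
        # d(expected reward)/d(logit_k) = pi_k * (r_k - expected reward)
        g_logits = [pk * (rk - value) for pk, rk in zip(p, r)]
        for k, gk in enumerate(g_logits):
            for j, fj in enumerate(phi):
                grad_W[k][j] += gk * fj
        if train_features:
            for j, fj in enumerate(phi):
                g_phi = sum(W[k][j] * gk for k, gk in enumerate(g_logits))
                g_pre = g_phi * (1.0 - fj * fj)  # derivative of tanh
                for i, xi in enumerate(x):
                    grad_U[j][i] += g_pre * xi
    n = len(X)
    for k in range(len(W)):
        for j in range(len(W[0])):
            W[k][j] += lr * grad_W[k][j] / n
    if train_features:
        for j in range(len(U)):
            for i in range(len(U[0])):
                U[j][i] += lr * grad_U[j][i] / n

# Toy problem (all values are illustrative): two similar contexts, three actions.
U = [[0.5, 0.1], [-0.2, 0.4]]
W = [[0.1, -0.1], [0.0, 0.1], [-0.1, 0.0]]
X = [[1.0, 0.5], [0.8, 0.6]]
R = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]

reward_before = expected_reward(U, W, X, R)
confidence_before = max(forward(U, W, X[0])[1])
U_initial = [row[:] for row in U]

for _ in range(50):  # assumed stage 1: classifier head only
    pg_step(U, W, X, R, lr=0.5, train_features=False)
features_frozen = (U == U_initial)
reward_after_head = expected_reward(U, W, X, R)

for _ in range(50):  # assumed stage 2: all parameters
    pg_step(U, W, X, R, lr=0.5, train_features=True)
reward_after = expected_reward(U, W, X, R)
confidence_after = max(forward(U, W, X[0])[1])
```

In this toy setup stage 1 is driven by the reward signal rather than by labels, which is the one visible difference from a supervised linear probe here. The sketch does not reproduce the mechanism the paper identifies, and it says nothing about the speed-up reported in the findings.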

Abstract

Reinforcement learning (RL) post-training is a critical stage in modern language model development, playing a key role in improving alignment and reasoning ability. However, several phenomena remain poorly understood, including the reduction in output diversity. To gain a broader understanding of RL post-training, we analyze the learning dynamics of RL post-training from a perspective that has been studied in supervised learning but remains underexplored in RL. We adopt an empirical neural tangent kernel (NTK) framework and decompose the NTK into two components to characterize how RL updates propagate across training samples. Our analysis reveals that limited variability in feature representations can cause RL updates to systematically increase model confidence, providing an explanation for the commonly observed reduction in output diversity after RL post-training. Furthermore, we show…
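
As an illustration of how an empirical NTK describes the way an update on one sample propagates to another, here is a minimal sketch for a simplified case. The abstract does not say which two components the paper uses, so the sketch assumes that only a final linear softmax layer is trained on fixed features. In that case the kernel between two log-probabilities factorizes exactly into a feature inner product and an output-space term. The function names and toy values are hypothetical.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def probs(W, phi):
    return softmax([dot(row, phi) for row in W])

def score(p, y):
    # Gradient of log pi(y|x) with respect to the logits: e_y - pi(.|x).
    return [(1.0 if k == y else 0.0) - pk for k, pk in enumerate(p)]

def head_ntk(W, phi_a, y_a, phi_b, y_b):
    """Empirical NTK entry between log pi(y_a|a) and log pi(y_b|b), head only.

    grad_W log pi(y|x) = (e_y - pi) phi^T, so the inner product of two such
    gradients is (phi_a . phi_b) * ((e_ya - pi_a) . (e_yb - pi_b)).
    """
    feature_term = dot(phi_a, phi_b)
    output_term = dot(score(probs(W, phi_a), y_a), score(probs(W, phi_b), y_b))
    return feature_term, output_term, feature_term * output_term

def observed_change(W, phi_a, y_a, phi_b, y_b, eta):
    # Take one gradient-ascent step on log pi(y_b|b) and measure the
    # resulting change in log pi(y_a|a).
    before = math.log(probs(W, phi_a)[y_a])
    g_b = score(probs(W, phi_b), y_b)
    W_new = [[w + eta * g * f for w, f in zip(row, phi_b)]
             for row, g in zip(W, g_b)]
    return math.log(probs(W_new, phi_a)[y_a]) - before

# Toy values (illustrative): three actions, two-dimensional features.
W = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
phi_a = [1.0, 0.0]
phi_similar = [0.9, 0.1]      # nearly parallel to phi_a
phi_orthogonal = [0.0, 1.0]   # shares no feature direction with phi_a
eta = 1e-3

_, _, k_similar = head_ntk(W, phi_a, 0, phi_similar, 0)
_, _, k_orthogonal = head_ntk(W, phi_a, 0, phi_orthogonal, 0)
delta_similar = observed_change(W, phi_a, 0, phi_similar, 0, eta)
delta_orthogonal = observed_change(W, phi_a, 0, phi_orthogonal, 0, eta)
```

To first order in `eta`, `delta_similar` matches `eta * k_similar`, while the orthogonal sample leaves `log pi(0|a)` unchanged. When features vary little across samples, the feature term is large and positive for most pairs, so an update that raises the probability of an action on one sample tends to raise it on the others as well. This is consistent with the confidence increase the abstract describes, within this simplified setting.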

Taxonomy

Topics: Topic Modeling · Text Readability and Simplification · Natural Language Processing Techniques