On Predicting the Post-training Potential of Pre-trained LLMs

Xiaoyuan Li; Yubo Ma; Kexin Yang; Moxin Li; Keqin Bao; Wenie Wang; Fuli Feng; Dayiheng Liu

arXiv:2605.11978 · cs.CL · May 13, 2026

TL;DR

This paper introduces RuDE, a framework for predicting a base LLM's post-training performance using response discrimination, enabling efficient model selection without extensive training.

Contribution

The paper presents RuDE, a unified contrastive evaluation framework, guided by the new 4C Taxonomy, that forecasts the post-training potential of base LLMs.

Findings

01. RuDE achieves over 90% correlation with actual post-training performance.

02. Validation via RL shows RuDE can identify high-potential small models.

03. RuDE enables compute-efficient foundation model development.

Abstract

The performance of Large Language Models (LLMs) on downstream tasks is fundamentally constrained by the capabilities acquired during pre-training. However, traditional benchmarks like MMLU often fail to reflect a base model's plasticity in complex open-ended scenarios, leading to inefficient model selection. We address this by introducing a new task, predicting post-training potential: forecasting a base model's performance before post-training. We propose RuDE (Rubric-based Discriminative Evaluation), a unified framework that bypasses the generation gap of base models by leveraging response discrimination. Guided by our systematic 4C Taxonomy, RuDE constructs controlled contrastive pairs across diverse domains through fine-grained rubric violations. Extensive experiments demonstrate a correlation greater than 90% with post-training performance. Crucially, validation via Reinforcement…
