HE-SNR: Uncovering Latent Logic via Entropy for Guiding Mid-Training on SWE-bench
Yueyang Wang, Jiawei Fu, Baolong Bi, Xili Wang, Xiaoqing Liu

TL;DR
This paper introduces HE-SNR, a new entropy-based metric to guide mid-training of large language models on software engineering tasks, addressing limitations of traditional metrics like perplexity.
Contribution
It proposes the Entropy Compression Hypothesis and develops HE-SNR, a metric that correlates with downstream SWE performance more strongly than standard metrics such as perplexity.
Findings
HE-SNR correlates strongly with downstream SWE performance.
Validated on models up to 560B parameters with different context windows.
Provides theoretical and practical tools for optimizing LLMs in complex domains.
Abstract
SWE-bench has emerged as the premier benchmark for evaluating Large Language Models on complex software engineering tasks. While these capabilities are fundamentally acquired during the mid-training phase and subsequently elicited during Supervised Fine-Tuning (SFT), there remains a critical deficit in metrics capable of guiding mid-training effectively. Standard metrics such as Perplexity (PPL) are compromised by the "Long-Context Tax" and exhibit weak correlation with downstream SWE performance. In this paper, we bridge this gap by first introducing a rigorous data filtering strategy. Crucially, we propose the Entropy Compression Hypothesis, redefining intelligence not by scalar Top-1 compression, but by the capacity to structure uncertainty into Entropy-Compressed States of low orders ("reasonable hesitation"). Grounded in this fine-grained entropy analysis, we formulate a novel…
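The contrast the abstract draws between scalar Top-1 compression and structured uncertainty can be illustrated with textbook definitions. The sketch below is not the paper's HE-SNR formula, which is not given in the text above. It only computes standard perplexity and Shannon entropy for two hypothetical next-token distributions (`concentrated` and `diffuse` are made-up values) that share the same top-1 probability.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a single next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def perplexity(target_probs):
    """Perplexity: exp of the mean negative log-likelihood of observed tokens."""
    nll = -sum(math.log(p) for p in target_probs) / len(target_probs)
    return math.exp(nll)


# Hypothetical distributions, both with top-1 probability 0.5 (assumed values).
concentrated = [0.5, 0.5]          # remaining mass on one alternative
diffuse = [0.5] + [0.05] * 10      # remaining mass spread over ten alternatives

h_concentrated = shannon_entropy(concentrated)
h_diffuse = shannon_entropy(diffuse)

# If the observed token is the top-1 candidate in both cases,
# perplexity cannot tell the two states apart.
ppl_concentrated = perplexity([concentrated[0]])
ppl_diffuse = perplexity([diffuse[0]])

print(f"PPL: {ppl_concentrated:.3f} vs {ppl_diffuse:.3f}")
print(f"Entropy (nats): {h_concentrated:.3f} vs {h_diffuse:.3f}")
```

Under these assumptions both positions yield the same perplexity, while the entropy separates hesitation between a few candidates from uncertainty spread over many. That is the kind of distinction a Top-1 metric alone does not capture.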
