Weak-Driven Learning: How Weak Agents make Strong Agents Stronger
Zehao Chen, Gongxun Li, Tianxiang Ai, Yifei Li, Zixuan Huang, Wang Zhou, Fuzhen Zhuang, Xianglong Liu, Jianxin Li, Deqing Wang, Yikun Ban

TL;DR
This paper introduces WMSS, a post-training method that uses weak model checkpoints to identify and recover learning gaps, enabling large language models to surpass saturation limits and improve performance without extra inference costs.
Contribution
The paper presents WMSS, a novel post-training paradigm that leverages weak checkpoints to enhance model performance beyond traditional saturation points.
Findings
Improves mathematical reasoning and code generation performance.
Achieves performance gains without additional inference costs.
Identifies recoverable learning gaps via entropy dynamics and reinforces them through compensatory learning.
Abstract
As post-training optimization becomes central to improving large language models, we observe a persistent saturation bottleneck: once models grow highly confident, further training yields diminishing returns. While existing methods continue to reinforce target predictions, we find that informative supervision signals remain latent in models' own historical weak states. Motivated by this observation, we propose WMSS (Weak Agents Can Make Strong Agents Stronger), a post-training paradigm that leverages weak checkpoints to guide continued optimization. By identifying recoverable learning gaps via entropy dynamics and reinforcing them through compensatory learning, WMSS enables strong agents to improve beyond conventional post-training saturation. Experiments on mathematical reasoning and code generation datasets show that agents trained with our approach achieve effective performance…
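The abstract does not spell out how entropy dynamics are used to locate learning gaps, so the following is only an illustrative sketch, not the authors' algorithm. It assumes that per-token predictive distributions are available from a weak (earlier) checkpoint and a strong (later) checkpoint, and it uses a hypothetical criterion: a token is flagged when its entropy fell by less than `min_drop` nats between the two checkpoints. The function names, the threshold, and the criterion itself are assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_deltas(weak_dists, strong_dists):
    """Per-token entropy change going from the weak to the strong checkpoint."""
    return [entropy(s) - entropy(w) for w, s in zip(weak_dists, strong_dists)]


def flag_gaps(weak_dists, strong_dists, min_drop=0.1):
    """Hypothetical criterion (an assumption, not taken from the paper):
    flag token positions whose entropy fell by less than `min_drop` nats."""
    deltas = entropy_deltas(weak_dists, strong_dists)
    return [i for i, d in enumerate(deltas) if d > -min_drop]


# Toy example with two token positions over a two-word vocabulary.
weak = [[0.5, 0.5], [0.5, 0.5]]        # weak checkpoint: uncertain everywhere
strong = [[0.99, 0.01], [0.6, 0.4]]    # strong checkpoint: confident only at position 0
gaps = flag_gaps(weak, strong)         # position 1 barely changed, so it is flagged
```

In this toy setup, flagged positions would be candidates for extra training emphasis. How WMSS actually weights or reinforces such positions is not stated in the text above.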
Taxonomy
Topics: Machine Learning in Materials Science · Machine Learning and Data Classification · Topic Modeling
