WST: Weak-to-Strong Knowledge Transfer via Reinforcement Learning

Haosen Ge; Shuo Li; Lianghuan Huang

arXiv:2508.16741 · cs.LG · August 26, 2025

TL;DR

WST introduces an efficient reinforcement learning-based framework where small 'Teacher' models generate prompts to significantly improve the performance of larger 'Student' models across reasoning and alignment tasks, without needing large models to be open-source or fine-tuned.

Contribution

The paper proposes WST, a novel weak-to-strong knowledge transfer method that uses reinforcement learning to automatically generate prompts, enabling small models to effectively enhance larger models' performance.

Findings

01 Yields a 98% gain on the MATH-500 benchmark

02 Yields a 134% gain on the HH-RLHF benchmark

03 Surpasses baselines such as GPT-4o-mini and Llama-70B

Abstract

Effective prompt engineering remains a challenging task for many applications. We introduce Weak-to-Strong Transfer (WST), an automatic prompt engineering framework where a small "Teacher" model generates instructions that enhance the performance of a much larger "Student" model. Unlike prior work, WST requires only a weak teacher, making it efficient and broadly applicable in settings where large models are closed-source or difficult to fine-tune. Using reinforcement learning, the Teacher Model's instructions are iteratively improved based on the Student Model's outcomes, yielding substantial gains across reasoning (MATH-500, GSM8K) and alignment (HH-RLHF) benchmarks - 98% on MATH-500 and 134% on HH-RLHF - and surpassing baselines such as GPT-4o-mini and Llama-70B. These results demonstrate that small models can reliably scaffold larger ones, unlocking latent capabilities while…
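
The loop described in the abstract (the Teacher proposes an instruction, the Student's outcome is scored, and the score updates the Teacher) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the Teacher is reduced to a softmax policy over three fixed candidate instructions, the Student is a stub with made-up success rates, and the update is a plain REINFORCE step with a running baseline. The abstract does not specify the paper's RL algorithm, models, or reward, so this is not the authors' implementation.

```python
import math
import random

# Hypothetical candidate instructions the Teacher can emit (illustrative only).
INSTRUCTIONS = [
    "Answer directly.",
    "Think step by step, then state the final answer.",
    "Restate the problem, solve it, and verify the result.",
]

# Assumed success probability of the stubbed Student under each instruction.
# These numbers are invented for the toy and are not results from the paper.
STUDENT_SUCCESS = [0.1, 0.5, 0.9]


def softmax(logits):
    """Convert logits to probabilities, shifted by the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def student_reward(action, rng):
    """Stub for the black-box Student: 1.0 if its answer is correct, else 0.0."""
    return 1.0 if rng.random() < STUDENT_SUCCESS[action] else 0.0


def train_teacher(steps=3000, lr=0.1, seed=0):
    """Train the toy Teacher policy from Student outcomes only."""
    rng = random.Random(seed)
    logits = [0.0] * len(INSTRUCTIONS)  # Teacher parameters
    baseline = 0.0                      # running mean reward, reduces variance
    for _ in range(steps):
        probs = softmax(logits)
        # The Teacher samples an instruction; the Student is queried with it.
        action = rng.choices(range(len(INSTRUCTIONS)), weights=probs)[0]
        reward = student_reward(action, rng)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # REINFORCE: d log pi(action) / d logit_i = 1[i == action] - probs[i]
        for i in range(len(logits)):
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad
    return softmax(logits)


probs = train_teacher()
best = max(range(len(probs)), key=probs.__getitem__)
print("Teacher policy:", [round(p, 3) for p in probs])
print("Preferred instruction:", INSTRUCTIONS[best])
```

Only the Teacher's parameters are updated. The Student is treated as a black box that returns an outcome, which mirrors the setting the abstract describes, where the large model may be closed-source or difficult to fine-tune.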
