TL;DR
HeavySkill treats heavy thinking as a reasoning skill internalized in the model's parameters and realized as a two-stage pipeline: parallel reasoning, then summarization. Across diverse domains it consistently outperforms traditional Best-of-N (BoN) strategies.
Contribution
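The paper proposes HeavySkill, a perspective that treats heavy thinking as a skill internalized in the model's parameters rather than only as an execution unit of an orchestration framework. The skill is a two-stage pipeline, parallel reasoning followed by summarization, that can operate beneath any agentic harness.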
Findings
HeavySkill outperforms traditional Best-of-N strategies.
Stronger LLMs approach Pass@N performance with HeavySkill.
Reinforcement learning can scale HeavySkill's reasoning depth and width.
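The two reference points named in these findings can be made concrete with a small sketch. It assumes the usual definitions: Best-of-N selects one of N sampled answers using some scorer, while Pass@N counts a problem as solved if any of the N samples is correct, which makes it a ceiling for any method that only selects among the samples. The function names, the frequency-based scorer, and the toy answers are hypothetical illustrations, not taken from the paper.

```python
from typing import Callable, Sequence


def best_of_n(candidates: Sequence[str], score: Callable[[str], float]) -> str:
    """Best-of-N: return the single candidate the scorer ranks highest."""
    return max(candidates, key=score)


def pass_at_n(candidates: Sequence[str], reference: str) -> bool:
    """Pass@N: solved if at least one candidate matches the reference."""
    return any(c == reference for c in candidates)


# Toy data (hypothetical): four sampled final answers, correct answer is "42".
candidates = ["41", "42", "41", "40"]
reference = "42"


def frequency_score(answer: str) -> float:
    # Assumed scorer: how often the answer occurs among the samples.
    return float(candidates.count(answer))


bon_answer = best_of_n(candidates, frequency_score)
bon_correct = bon_answer == reference
oracle_correct = pass_at_n(candidates, reference)

print(bon_answer, bon_correct, oracle_correct)
```

In this toy case the scorer prefers the more frequent but wrong answer, so Best-of-N fails even though a correct sample exists. That gap between selection and Pass@N is what the findings above refer to.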
Abstract
Recent advances in agentic harnesses, whose orchestration frameworks coordinate multiple agents with memory, skills, and tool use, have achieved remarkable success in complex reasoning tasks. However, the underlying mechanism that truly drives performance remains obscured behind intricate system designs. In this paper, we propose HeavySkill, a perspective that views heavy thinking not only as a minimal execution unit in an orchestration harness but also as an inner skill internalized within the model's parameters that drives the orchestrator to solve complex tasks. We identify this skill as a two-stage pipeline, i.e., parallel reasoning then summarization, which can operate beneath any agentic harness. We present a systematic empirical study of HeavySkill across diverse domains. Our results show that this inner skill consistently outperforms traditional Best-of-N (BoN) strategies; notably,…
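The two-stage pipeline described in the abstract (parallel reasoning, then summarization) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `heavy_think`, the `generate` callable, the `width` parameter, and the prompt wording are all hypothetical, and a stub stands in for a real model call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical model interface: takes a prompt, returns generated text.
Generate = Callable[[str], str]


def heavy_think(question: str, generate: Generate, width: int = 4) -> str:
    """Sketch of parallel reasoning followed by summarization."""
    # Stage 1: sample `width` independent reasoning traces in parallel.
    prompts = [
        f"[attempt {i}] Solve step by step: {question}" for i in range(width)
    ]
    with ThreadPoolExecutor(max_workers=width) as pool:
        traces = list(pool.map(generate, prompts))

    # Stage 2: ask the same model to read all traces and give one answer.
    joined = "\n\n".join(f"Trace {i}:\n{t}" for i, t in enumerate(traces))
    summary_prompt = (
        f"Question: {question}\n\n{joined}\n\n"
        "Summarize the traces and give one final answer."
    )
    return generate(summary_prompt)


def fake_generate(prompt: str) -> str:
    # Stub model (assumption): lets the sketch run without an LLM backend.
    if prompt.startswith("Question:"):
        return f"final answer from {prompt.count('Trace ')} traces"
    return "some reasoning"


result = heavy_think("What is 6 * 7?", fake_generate, width=4)
print(result)
```

Unlike Best-of-N, the second stage here does not merely pick one trace. The summarizer reads all of them and writes a new answer, so it can in principle combine partial results from several traces.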
