SAGE-32B: Agentic Reasoning via Iterative Distillation

Basab Jha; Firoj Paudel; Ujjwal Puri; Ethan Henkel; Zhang Yuting; Mateusz Kowalczyk; Mei Huang; Choi Donghyuk; Wang Junhao

arXiv:2601.04237 · cs.AI · April 22, 2026

TL;DR

SAGE-32B is a 32-billion-parameter language model designed for agentic reasoning. It uses iterative distillation and inverse reasoning to improve task decomposition, tool use, and error recovery in long-range planning tasks.

Contribution

The paper introduces SAGE-32B, a large language model trained with iterative distillation and inverse reasoning, specifically optimized for agentic reasoning and planning tasks.

Findings

01 SAGE-32B outperforms baseline models on agentic reasoning benchmarks.

02 The model demonstrates improved multi-tool usage success rates.

03 It maintains competitive performance on standard reasoning tasks.

Abstract

We present SAGE-32B, a 32-billion-parameter language model that focuses on agentic reasoning and long-range planning tasks. Unlike chat models that aim for general conversational fluency, SAGE-32B is designed to operate in an agentic loop, emphasizing task decomposition, tool usage, and error recovery. The model is initialized from the Qwen2.5-32B pretrained model and fine-tuned using Iterative Distillation, a two-stage training process that improves reasoning performance through rigorously tested feedback loops. SAGE-32B also introduces an inverse reasoning approach, which uses a metacognition head to forecast potential failures in the planning process before execution. On agentic reasoning benchmarks including MMLU-Pro, AgentBench, and MATH-500, SAGE-32B achieves higher success rates in multi-tool usage scenarios compared to similarly sized baseline models, while remaining…
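To make the agentic loop in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It walks a decomposed plan, asks a failure forecaster about each step before running it, and retries a tool call that raises. Every name here (`Step`, `forecast_failure`, `run_plan`, the toy tools, and the `risk_threshold` and `max_retries` parameters) is hypothetical, and the forecaster is a simple rule standing in for the learned metacognition head, whose actual design is not given in this summary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Tool = Callable[[str], str]


@dataclass(frozen=True)
class Step:
    """One unit of a decomposed task: which tool to call, and with what input."""
    tool: str
    arg: str


def forecast_failure(step: Step, tools: Dict[str, Tool]) -> float:
    # Hypothetical rule standing in for a learned failure predictor:
    # a step that names an unregistered tool is certain to fail.
    return 1.0 if step.tool not in tools else 0.0


def run_plan(plan: List[Step], tools: Dict[str, Tool], fallback: str,
             risk_threshold: float = 0.5, max_retries: int = 1) -> List[str]:
    log: List[str] = []
    for step in plan:
        # Forecast before execution and revise the step if it looks risky.
        if forecast_failure(step, tools) > risk_threshold:
            log.append(f"revise {step.tool} -> {fallback}")
            step = Step(fallback, step.arg)
        # Error recovery: retry a failing tool call a bounded number of times.
        for _ in range(max_retries + 1):
            try:
                log.append(f"ok {step.tool}: {tools[step.tool](step.arg)}")
                break
            except Exception as exc:
                log.append(f"error {step.tool}: {exc}")
    return log


def make_flaky_tool() -> Tool:
    """Toy tool that fails on its first call, then reverses its input."""
    calls = {"n": 0}

    def flaky(arg: str) -> str:
        calls["n"] += 1
        if calls["n"] == 1:
            raise RuntimeError("timeout")
        return arg[::-1]

    return flaky


tools: Dict[str, Tool] = {
    "upper": str.upper,
    "echo": lambda s: s,
    "reverse": make_flaky_tool(),
}
plan = [Step("upper", "sage"), Step("search", "agents"), Step("reverse", "abc")]
log = run_plan(plan, tools, fallback="echo")
for line in log:
    print(line)
```

In this sketch the unregistered `search` step is rewritten to the fallback tool before anything runs, while the `reverse` step fails once at execution time and succeeds on retry. The two paths illustrate the difference between forecasting a failure ahead of execution and recovering from one afterwards.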

Code & Models

Repositories

https://huggingface.co/sagea-ai/sage-reasoning-32b