Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Sen Xu; Yi Zhou; Wei Wang; Jixin Min; Zhibin Yin; Yingwei Dai; Shixi Liu; Lianyu Pang; Yirong Chen; Junlin Zhang

arXiv:2511.06221·cs.AI·November 11, 2025

TL;DR

This paper introduces VibeThinker-1.5B, a 1.5B-parameter dense model that reaches reasoning performance usually associated with much larger models through diversity-driven optimization, challenging the view that small models inherently lack robust reasoning.

Contribution

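The paper presents the Spectrum-to-Signal Principle (SSP), a training framework in which a Two-Stage Diversity-Exploring Distillation (SFT) phase first generates a broad spectrum of solutions and a MaxEnt-Guided Policy Optimization (RL) phase then amplifies the correct signal, so that a small model can attain reasoning capabilities typical of large models.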

Findings

01. VibeThinker-1.5B surpasses the 400x larger DeepSeek R1 on three math benchmarks.

02. It performs on par with open-source models such as GPT OSS-20B Medium at a total training cost of only $7,800.

03. By matching large-model performance at this cost, small models can make AI research more widely accessible.

Abstract

Challenging the prevailing consensus that small models inherently lack robust reasoning, this report introduces VibeThinker-1.5B, a 1.5B-parameter dense model developed via our Spectrum-to-Signal Principle (SSP). This challenges the prevailing approach of scaling model parameters to enhance capabilities, as seen in models like DeepSeek R1 (671B) and Kimi k2 (>1T). The SSP framework first employs a Two-Stage Diversity-Exploring Distillation (SFT) to generate a broad spectrum of solutions, followed by MaxEnt-Guided Policy Optimization (RL) to amplify the correct signal. With a total training cost of only $7,800, VibeThinker-1.5B demonstrates superior reasoning capabilities compared to closed-source models like Magistral Medium and Claude Opus 4, and performs on par with open-source models like GPT OSS-20B Medium. Remarkably, it surpasses the 400x larger DeepSeek R1 on three math…
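The scale comparison in the abstract can be checked with simple arithmetic. The sketch below is illustrative only: it uses the parameter counts quoted above (1.5B, 671B, and a 1T floor for Kimi k2, since the abstract only says ">1T"), and the function name `size_ratio` is a hypothetical helper, not something from the paper.

```python
# Parameter counts as quoted in the abstract, in billions of parameters.
VIBETHINKER_B = 1.5
DEEPSEEK_R1_B = 671.0
KIMI_K2_FLOOR_B = 1000.0  # assumption: the abstract gives only ">1T", so this is a lower bound


def size_ratio(larger_b: float, smaller_b: float) -> float:
    """Return how many times larger one parameter count is than another."""
    if smaller_b <= 0:
        raise ValueError("smaller_b must be positive")
    return larger_b / smaller_b


r1_ratio = size_ratio(DEEPSEEK_R1_B, VIBETHINKER_B)
k2_ratio = size_ratio(KIMI_K2_FLOOR_B, VIBETHINKER_B)

print(f"DeepSeek R1 vs VibeThinker-1.5B: about {r1_ratio:.0f}x")
print(f"Kimi k2 vs VibeThinker-1.5B: more than {k2_ratio:.0f}x")
```

With these figures the DeepSeek R1 ratio comes out near 447, so the "400x larger" phrasing in the abstract reads as a conservative round number rather than an exact multiple.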

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education · Machine Learning and Data Classification