KAT-V1: Kwai-AutoThink Technical Report
Zizheng Zhan, Ken Deng, Huaixi Tang, Wen Xiang, Kun Wu, Weihao Li, Wenqiang Zhu, Jingxuan Xu, Lecheng Huang, Zongxian Feng, Shaojie Wang, Shangpeng Yan, Xuxing Chen, Jiaheng Liu, Zhongyuan Peng, Zuchen Gao, Haoyang Huang, Xiaojiang Zhang, Jinghui Wang, Zheng Lin, Mengtong Li

TL;DR
KAT-V1 is an open-source 40B large language model that targets overthinking in reasoning-intensive tasks. It switches dynamically between reasoning and non-reasoning modes according to task complexity, is trained with several new strategies, and is reported to improve both accuracy and token efficiency across benchmarks and real-world applications.
Contribution
The paper introduces the AutoThink paradigm, dual-regime dataset construction, Multi-Token Prediction knowledge distillation, and Step-SRPO reinforcement learning, advancing reasoning efficiency and effectiveness in large language models.
Findings
KAT-V1 outperforms state-of-the-art models on reasoning benchmarks.
KAT reduces token usage while maintaining high accuracy.
Deployment in Kwaipilot enhances real-world coding workflows.
Abstract
We present Kwaipilot-AutoThink (KAT), an open-source 40B large language model developed to address the overthinking problem in reasoning-intensive tasks. We propose an automatic thinking training paradigm that dynamically switches between reasoning and non-reasoning modes based on task complexity. First, we construct a dual-regime dataset using a novel tagging pipeline and a multi-agent synthesis strategy. We then apply Multi-Token Prediction (MTP)-enhanced knowledge distillation, enabling efficient and fine-grained reasoning transfer with minimal pretraining cost. In addition, we implement a cold-start initialization strategy that introduces mode-selection priors using majority-vote signals and intent-aware prompting. Finally, we propose Step-SRPO, a reinforcement learning algorithm that incorporates intermediate supervision into the GRPO framework, offering…
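The abstract mentions mode-selection priors derived from majority-vote signals but does not spell out the procedure. The sketch below is only an illustration of the general idea, not the authors' pipeline: it assumes several independent judges each label a query as needing reasoning or not, and the most common label becomes that query's cold-start mode. The label strings, the function names, and the tie-break toward reasoning are all hypothetical choices made for this example.

```python
from collections import Counter

# Hypothetical label names; the source does not specify how modes are encoded.
THINK_ON = "think-on"
THINK_OFF = "think-off"


def majority_vote_mode(votes, tie_break=THINK_ON):
    """Return the mode chosen by most voters.

    Votes other than THINK_ON / THINK_OFF are ignored. Ties fall back to
    `tie_break` (an assumption of this sketch, not something the paper states).
    """
    if not votes:
        raise ValueError("need at least one vote")
    counts = Counter(votes)
    on, off = counts[THINK_ON], counts[THINK_OFF]
    if on == off:
        return tie_break
    return THINK_ON if on > off else THINK_OFF


def label_queries(votes_by_query):
    """Map each query to its majority-vote mode label."""
    return {query: majority_vote_mode(votes) for query, votes in votes_by_query.items()}


# Toy input: three hypothetical judges vote on each query.
example_votes = {
    "What is the capital of France?": [THINK_OFF, THINK_OFF, THINK_ON],
    "Prove that sqrt(2) is irrational.": [THINK_ON, THINK_ON, THINK_OFF],
}
labels = label_queries(example_votes)
```

In this toy input the factual lookup is labeled for the non-reasoning mode and the proof request for the reasoning mode. Labels of this kind could then serve as supervision targets when initializing mode selection.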
