KAT-Coder Technical Report

Zizheng Zhan; Ken Deng; Jinghui Wang; Xiaojiang Zhang; Huaixi Tang; Minglei Zhang; Zhiyi Lai; Haoyang Huang; Wen Xiang; Kun Wu; Wenhao Zhuang; Shaojie Wang; Shangpeng Yan; Kepeng Lei; Zongxian Feng; Huiming Wang; Zheng Lin; Mengtong Li; Mengfei Xie; Yinghan Cui; Xuxing Chen; Chao Wang; Weihao Li; Wenqiang Zhu; Jiarong Zhang; Jingxuan Xu; Songwei Yu; Yifan Yao; Xinping Lei; C. Zhang; Han Li; Junqi Xiong; Zuchen Gao; Dailin Li; Haimo Li; Jiaheng Liu; Yuqun Zhang; Junyi Peng; Haotian Zhang; Bin Chen

arXiv:2510.18779 · cs.CL · November 3, 2025

Open Access

TL;DR

KAT-Coder is a large-scale agentic code model trained through a multi-stage curriculum that targets robust reasoning, planning, and deployment in real-world coding environments; the KAT-Dev model is open-sourced for community use.

Contribution

The paper introduces KAT-Coder, an agentic code model, together with the multi-stage training process used to build it, which is designed to improve reasoning, tool use, and deployment capabilities.

Findings

01. Achieves robust tool-use reliability in coding tasks

02. Demonstrates instruction alignment and long-context reasoning

03. Open-sources the KAT-Dev model for community use

Abstract

Recent advances in large language models (LLMs) have enabled progress in agentic coding, where models autonomously reason, plan, and act within interactive software development workflows. However, bridging the gap between static text-based training and dynamic real-world agentic execution remains a core challenge. In this technical report, we present KAT-Coder, a large-scale agentic code model trained through a multi-stage curriculum encompassing Mid-Term Training, Supervised Fine-Tuning (SFT), Reinforcement Fine-Tuning (RFT), and Reinforcement-to-Deployment Adaptation. The Mid-Term stage enhances reasoning, planning, and reflection capabilities through a corpus of real software engineering data and synthetic agentic interactions. The SFT stage constructs a million-sample dataset balancing twenty programming languages, ten development contexts, and ten task archetypes. The RFT stage…

Taxonomy

Topics: Multi-Agent Systems and Negotiation · Reinforcement Learning in Robotics · Software Engineering Techniques and Practices