Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models

Junru Lu; Jiarui Qin; Lingfeng Qiao; Yinghui Li; Xinyi Dai; Bo Ke; Jianfeng He; Ruizhi Qiao; Di Yin; Xing Sun; Yunsheng Wu; Yinsong Liu; Shuangyin Liu; Mingkong Tang; Haodong Lin; Jiayi Kuang; Fanxu Meng; Xiaojuan Tang; Yunjia Xi; Junjie Huang; Haotong Yang; Zhenyi Shen; Yangning Li; Qianwen Zhang; Yifei Yu; Siyu An; Junnan Dong; Qiufeng Wang; Jie Wang; Keyu Chen; Wei Wen; Taian Guo; Zhifeng Shen; Daohai Yu; Jiahao Li; Ke Li; Zongyi Li; Xiaoyu Tan

arXiv:2512.24618 · cs.CL · January 6, 2026

TL;DR

Youtu-LLM is a lightweight 1.96B-parameter language model, pre-trained from scratch with a compact architecture and a multi-stage training curriculum, that targets reasoning, planning, and agentic abilities comparable to those of larger models.

Contribution

It introduces a compact Multi-Latent Attention architecture with long-context support and a multi-stage curriculum for training lightweight LLMs with agentic capabilities.

Findings

01 Sets a new state of the art for sub-2B LLMs.

02 Achieves competitive performance on general benchmarks.

03 Surpasses the existing state of the art on agent-specific tasks.

Abstract

We introduce Youtu-LLM, a lightweight yet powerful language model that harmonizes high computational efficiency with native agentic intelligence. Unlike typical small models that rely on distillation, Youtu-LLM (1.96B) is pre-trained from scratch to systematically cultivate reasoning and planning capabilities. The key technical advancements are as follows: (1) Compact Architecture with Long-Context Support: Built on a dense Multi-Latent Attention (MLA) architecture with a novel STEM-oriented vocabulary, Youtu-LLM supports a 128k context window. This design enables robust long-context reasoning and state tracking within a minimal memory footprint, making it ideal for long-horizon agent and reasoning tasks. (2) Principled "Commonsense-STEM-Agent" Curriculum: We curated a massive corpus of approximately 11T tokens and implemented a multi-stage training strategy. By progressively shifting…
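
To make the "minimal memory footprint" claim concrete, the following back-of-the-envelope sketch compares the key-value cache of standard multi-head attention with a latent-compressed cache at a 128k context. It assumes that MLA here follows the commonly described latent-attention design, in which each layer caches one compressed latent vector plus a small positional key per token instead of full per-head keys and values. Every dimension below (layer count, head count, head size, latent size, positional key size) is a hypothetical placeholder and not the published Youtu-LLM configuration; 128k is taken as 128 × 1024 tokens, which is also an assumption.

```python
# Rough KV-cache sizing: standard multi-head attention vs. a latent-compressed cache.
# All model dimensions are HYPOTHETICAL placeholders, not Youtu-LLM's real config.

def mha_kv_bytes(n_layers, n_heads, head_dim, seq_len, bytes_per_elem=2):
    """Full keys and values are cached for every head in every layer."""
    return 2 * n_layers * n_heads * head_dim * seq_len * bytes_per_elem


def latent_kv_bytes(n_layers, latent_dim, rope_dim, seq_len, bytes_per_elem=2):
    """One compressed latent plus one shared positional key is cached per token per layer."""
    return n_layers * (latent_dim + rope_dim) * seq_len * bytes_per_elem


SEQ_LEN = 128 * 1024   # "128k" context, assumed to mean 131,072 tokens
N_LAYERS = 32          # hypothetical
N_HEADS = 16           # hypothetical
HEAD_DIM = 128         # hypothetical
LATENT_DIM = 512       # hypothetical compressed KV width
ROPE_DIM = 64          # hypothetical positional key width

mha = mha_kv_bytes(N_LAYERS, N_HEADS, HEAD_DIM, SEQ_LEN)
mla = latent_kv_bytes(N_LAYERS, LATENT_DIM, ROPE_DIM, SEQ_LEN)

GIB = 1024 ** 3
print(f"MHA cache:    {mha / GIB:.2f} GiB")
print(f"Latent cache: {mla / GIB:.2f} GiB")
print(f"Reduction:    {mha / mla:.2f}x")
```

Under these placeholder sizes the ratio depends only on `2 * N_HEADS * HEAD_DIM` versus `LATENT_DIM + ROPE_DIM`, so the saving grows with the number of heads and is independent of sequence length; the absolute figures would change with the model's actual dimensions and cache precision.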

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Explainable Artificial Intelligence (XAI)