InternBootcamp Technical Report: Boosting LLM Reasoning with Verifiable Task Scaling

Peiji Li; Jiasheng Ye; Yongkang Chen; Yichuan Ma; Zijie Yu; Kedi Chen; Xiaozhe Li; Ganqu Cui; Haozhan Li; Jiacheng Chen; Chengqi Lyu; Wenwei Zhang; Linyang Li; Qipeng Guo; Dahua Lin; Bowen Zhou; Kai Chen

arXiv:2508.08636 · cs.CL · May 21, 2026

TL;DR

InternBootcamp is an open-source framework with over 1000 diverse reasoning tasks for LLMs, enabling automated data generation, verification, and significant performance improvements through task scaling.

Contribution

We introduce InternBootcamp, a comprehensive, automated task environment framework that enhances LLM reasoning research and demonstrates the benefits of large-scale task scaling.

Findings

1. Training with InternBootcamp significantly improves model performance.
2. Including more diverse tasks leads to substantial reasoning ability gains.
3. The 32B model achieves state-of-the-art results on Bootcamp-EVAL.

Abstract

Large language models (LLMs) have revolutionized artificial intelligence by enabling complex reasoning capabilities. While recent advancements in reinforcement learning (RL) have primarily focused on domain-specific reasoning tasks (e.g., mathematics or code generation), real-world reasoning scenarios often require models to handle diverse and complex environments that narrow-domain benchmarks cannot fully capture. To address this gap, we present InternBootcamp, an open-source framework comprising 1000+ domain-diverse task environments specifically designed for LLM reasoning research. Our codebase offers two key functionalities: (1) automated generation of unlimited training/testing cases with configurable difficulty levels, and (2) integrated verification modules for objective response evaluation. These features make InternBootcamp fundamental infrastructure for RL-based model…
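
The two functionalities named in the abstract, case generation with configurable difficulty and rule-based verification, can be illustrated with a toy task. The sketch below is a minimal illustration only: the class `SumBootcamp`, the `Case` record, and the methods `generate_case` and `verify` are hypothetical names chosen for this example and are not taken from the InternBootcamp codebase, whose actual interface may differ.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Case:
    """One generated problem together with its ground-truth answer."""
    prompt: str
    numbers: List[int]
    target: int


class SumBootcamp:
    """Hypothetical toy task (not the InternBootcamp API): sum a list of integers."""

    def __init__(self, difficulty: int = 1, seed: Optional[int] = None):
        if difficulty < 1:
            raise ValueError("difficulty must be >= 1")
        self.difficulty = difficulty
        self.rng = random.Random(seed)  # private RNG so cases are reproducible

    def generate_case(self) -> Case:
        # Higher difficulty means more operands and a wider value range.
        count = 2 + self.difficulty
        limit = 10 ** self.difficulty
        numbers = [self.rng.randint(-limit, limit) for _ in range(count)]
        prompt = f"Compute the sum of {numbers}. Answer with a single integer."
        return Case(prompt=prompt, numbers=numbers, target=sum(numbers))

    @staticmethod
    def verify(case: Case, response: str) -> float:
        # Rule-based check: treat the last whitespace-separated token as the answer.
        tokens = response.strip().split()
        if not tokens:
            return 0.0
        try:
            answer = int(tokens[-1])
        except ValueError:
            return 0.0
        return 1.0 if answer == case.target else 0.0


env = SumBootcamp(difficulty=2, seed=0)
case = env.generate_case()
reward = SumBootcamp.verify(case, f"The sum is {case.target}")
```

Because the verifier returns a scalar score computed from the ground truth stored in the case, the same object could in principle serve as both a data source and a reward function in an RL loop, which is the role the abstract assigns to the framework's task environments.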

Taxonomy

Topics: Topic Modeling · Reinforcement Learning in Robotics · Multimodal Machine Learning Applications