TAROT: Test-driven and Capability-adaptive Curriculum Reinforcement Fine-tuning for Code Generation with Large Language Models

Chansung Park; Juyong Jiang; Fan Wang; Sayak Paul; Jiasi Shen; Jing Tang; Jianguo Li

arXiv:2602.15449 · cs.CL · February 18, 2026

TL;DR

TAROT introduces a capability-adaptive curriculum reinforcement fine-tuning method for large language models to improve code generation by systematically constructing multi-tier test suites and decoupling curriculum progression from raw reward scores.

Contribution

This paper presents TAROT, a novel curriculum reinforcement fine-tuning approach that adaptively tailors difficulty levels based on model capability, enhancing code correctness and robustness.

Findings

01. Less capable models benefit from easy-to-hard curriculum progression.

02. More capable models perform better with a hard-first curriculum.

03. Adaptive curriculum design improves code generation quality.
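
As a rough illustration of findings 01 and 02, the sketch below picks a tier ordering from a capability score. The function name `curriculum_order`, the scalar `capability` score, and the 0.5 threshold are hypothetical and are not taken from the paper. The tier names come from the abstract, and reading them as an easy-to-hard sequence is an assumption.

```python
from typing import List

# Tier names follow the abstract; treating this list as ordered from
# easiest to hardest is an assumption made for illustration.
TIERS: List[str] = ["basic", "intermediate", "complex", "edge"]


def curriculum_order(capability: float, threshold: float = 0.5) -> List[str]:
    """Return the order in which test tiers are emphasised during training.

    `capability` is a hypothetical score in [0, 1], for example a baseline
    pass rate. `threshold` is an arbitrary cut-off, not a value from the paper.
    """
    if not 0.0 <= capability <= 1.0:
        raise ValueError("capability must lie in [0, 1]")
    if capability < threshold:
        return list(TIERS)            # less capable model: easy-to-hard
    return list(reversed(TIERS))      # more capable model: hard-first


if __name__ == "__main__":
    print(curriculum_order(0.2))
    print(curriculum_order(0.8))
```

A single threshold is the simplest possible policy. The paper describes its curriculum as capability-adaptive, so the actual selection rule is likely richer than this two-way switch.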

Abstract

Large Language Models (LLMs) are changing the coding paradigm, a shift known as vibe coding, yet synthesizing algorithmically sophisticated and robust code remains a critical challenge. Incentivizing the deep reasoning capabilities of LLMs is essential to overcoming this hurdle, and Reinforcement Fine-Tuning (RFT) has emerged as a promising strategy for doing so. However, most existing approaches overlook the heterogeneous difficulty and granularity inherent in test cases, leading to an imbalanced distribution of reward signals and, consequently, biased gradient updates during training. To address this, we propose Test-driven and cApability-adaptive cuRriculum reinfOrcement fine-Tuning (TAROT). TAROT systematically constructs, for each problem, a four-tier test suite (basic, intermediate, complex, edge), providing a controlled difficulty landscape for curriculum design and evaluation.…
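
To make the reward-imbalance point concrete, here is a minimal sketch of how a flat pass rate can hide failures on harder tiers when easy tests dominate a suite. The functions `flat_reward` and `tiered_reward`, the per-tier `weights`, and the example suite are all hypothetical. They are not the paper's reward definition, which the truncated abstract does not give.

```python
from typing import Dict, List

# Tier names follow the abstract; everything else here is illustrative.
TIERS = ("basic", "intermediate", "complex", "edge")


def flat_reward(results: Dict[str, List[bool]]) -> float:
    """Fraction of all tests passed, ignoring which tier each test belongs to."""
    outcomes = [ok for tier in TIERS for ok in results.get(tier, [])]
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def tier_pass_rates(results: Dict[str, List[bool]]) -> Dict[str, float]:
    """Per-tier pass rate; a tier with no tests counts as 0.0."""
    rates = {}
    for tier in TIERS:
        outcomes = results.get(tier, [])
        rates[tier] = sum(outcomes) / len(outcomes) if outcomes else 0.0
    return rates


def tiered_reward(results: Dict[str, List[bool]], weights: Dict[str, float]) -> float:
    """Weighted average of per-tier pass rates (weights are hypothetical)."""
    rates = tier_pass_rates(results)
    total = sum(weights[tier] for tier in TIERS)
    return sum(weights[tier] * rates[tier] for tier in TIERS) / total


if __name__ == "__main__":
    # A made-up suite dominated by easy tests.
    results = {
        "basic": [True] * 8,
        "intermediate": [True, True],
        "complex": [False, False],
        "edge": [False],
    }
    equal = {tier: 1.0 for tier in TIERS}
    print(flat_reward(results))           # 10 of 13 tests pass
    print(tiered_reward(results, equal))  # two of four tiers fully pass
```

In this made-up suite the flat score is 10/13 even though every complex and edge test fails, while the equal-weight tiered score is 0.5. Shifting the weights across tiers is one simple way a curriculum could change emphasis over training.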

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification