DataChef: Cooking Up Optimal Data Recipes for LLM Adaptation via Reinforcement Learning

Yicheng Chen; Zerun Ma; Xinchen Xie; Yining Li; Kai Chen

arXiv:2602.11089·cs.CL·March 9, 2026

TL;DR

This paper introduces DataChef, a system trained with reinforcement learning to automatically generate data processing recipes for adapting large language models to specific tasks. It reduces the manual effort of recipe design and, per the reported findings, reaches performance comparable to human-curated recipes.

Contribution

It presents an end-to-end data recipe generation method for LLM adaptation, leveraging reinforcement learning to automate and optimize data curation for target tasks.

Findings

01. DataChef-32B achieves performance comparable to human-curated recipes.

02. Automated recipes improve LLM adaptation to specific domains like math.

03. The system surpasses official checkpoints on certain tasks.

Abstract

In the current landscape of Large Language Models (LLMs), the curation of large-scale, high-quality training data is a primary driver of model performance. A key lever is the data recipe, which comprises a data processing pipeline to transform raw sources into training corpora. Despite the growing use of LLMs to automate individual data processing steps, such as data synthesis and filtering, the overall design of data recipes remains largely manual and labor-intensive, requiring substantial human expertise and iteration. To bridge this gap, we formulate end-to-end data recipe generation for LLM adaptation. Given a target benchmark and a pool of available data sources, a model is required to output a complete data recipe that adapts a base LLM to the target task. We present DataChef-32B, which performs online reinforcement learning using a proxy reward that predicts…
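
To make the notion of a data recipe concrete, the following is a minimal Python sketch of one possible representation: a selection of sources from a pool plus an ordered list of processing steps. It is an illustration only, not the paper's implementation. The names `DataRecipe`, `drop_short`, and `dedupe`, the record layout, and the toy pool are all hypothetical, and the two steps shown are generic filtering examples rather than operations taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable

# Assumption: one raw example is a dict with at least a "text" field.
Record = dict
# Assumption: a processing step maps a list of records to a new list.
Step = Callable[[list[Record]], list[Record]]


@dataclass
class DataRecipe:
    """Hypothetical container: which sources to use and which steps to run."""
    sources: list[str]
    steps: list[Step]

    def run(self, pool: dict[str, list[Record]]) -> list[Record]:
        # Gather records from the selected sources only.
        records = [r for name in self.sources for r in pool[name]]
        # Apply each processing step in order.
        for step in self.steps:
            records = step(records)
        return records


def drop_short(min_chars: int) -> Step:
    """Illustrative filter: keep records whose text has at least min_chars characters."""
    return lambda records: [r for r in records if len(r["text"]) >= min_chars]


def dedupe(records: list[Record]) -> list[Record]:
    """Illustrative filter: drop case-insensitive exact duplicates, keeping the first."""
    seen = set()
    kept = []
    for r in records:
        key = r["text"].strip().lower()
        if key not in seen:
            seen.add(key)
            kept.append(r)
    return kept


# Toy pool of available data sources (made-up data for illustration).
pool = {
    "math_forum": [
        {"text": "What is 2 + 2? It is 4."},
        {"text": "what is 2 + 2? it is 4."},
        {"text": "ok"},
    ],
    "web_crawl": [{"text": "Buy cheap watches now, limited offer!"}],
}

recipe = DataRecipe(sources=["math_forum"], steps=[drop_short(10), dedupe])
corpus = recipe.run(pool)
print(corpus)
```

In this framing, the task described in the abstract would correspond to a model emitting the `sources` and `steps` for a given target benchmark; how DataChef actually encodes its recipes is not specified in the text above.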

Code & Models

Models

yichengchen24/DataChef-32B (Hugging Face model · 8 downloads · 3 likes)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Artificial Intelligence in Healthcare and Education