DataChef: Cooking Up Optimal Data Recipes for LLM Adaptation via Reinforcement Learning
Yicheng Chen, Zerun Ma, Xinchen Xie, Yining Li, Kai Chen

TL;DR
This paper introduces DataChef, an automated system that generates optimal data processing recipes for adapting large language models to specific tasks using reinforcement learning, reducing manual effort and improving performance.
Contribution
It presents an end-to-end data recipe generation method for LLM adaptation, leveraging reinforcement learning to automate and optimize data curation for target tasks.
Findings
DataChef-32B achieves performance comparable to human-curated recipes.
Automated recipes improve LLM adaptation to specific domains like math.
The system surpasses official checkpoints in certain tasks.
Abstract
In the current landscape of Large Language Models (LLMs), the curation of large-scale, high-quality training data is a primary driver of model performance. A key lever is the data recipe, which comprises a data processing pipeline to transform raw sources into training corpora. Despite the growing use of LLMs to automate individual data processing steps, such as data synthesis and filtering, the overall design of data recipes remains largely manual and labor-intensive, requiring substantial human expertise and iteration. To bridge this gap, we formulate end-to-end data recipe generation for LLM adaptation. Given a target benchmark and a pool of available data sources, a model is required to output a complete data recipe that adapts a base LLM to the target task. We present DataChef-32B, which performs online reinforcement learning using a proxy reward that predicts…
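To make the notion of a data recipe concrete, the following is a minimal sketch of a recipe as a selection of sources plus an ordered list of processing steps. It is an illustration only, not the authors' implementation: `DataRecipe`, `min_length`, `dedup`, and the toy `pool` are all hypothetical names and data introduced here.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical record format: one training example as a dict with a "text" field.
Record = dict
Step = Callable[[list], list]


@dataclass
class DataRecipe:
    """Hypothetical container: chosen source names plus ordered processing steps."""
    sources: list
    steps: list

    def run(self, pool: dict) -> list:
        # Gather raw records from the selected sources, in the order listed.
        records = [r for name in self.sources for r in pool[name]]
        # Apply each processing step in sequence.
        for step in self.steps:
            records = step(records)
        return records


def min_length(n: int) -> Step:
    """Filtering step: keep records whose text has at least n characters."""
    def step(records: list) -> list:
        return [r for r in records if len(r["text"]) >= n]
    return step


def dedup(records: list) -> list:
    """Deduplication step: drop exact-duplicate texts, keeping the first occurrence."""
    seen, out = set(), []
    for r in records:
        if r["text"] not in seen:
            seen.add(r["text"])
            out.append(r)
    return out


# Toy pool of data sources (made-up data for illustration).
pool = {
    "math_web": [{"text": "2 + 2 = 4"}, {"text": "2 + 2 = 4"}, {"text": "hi"}],
    "forum": [{"text": "Solve x + 1 = 3, so x = 2"}],
}

recipe = DataRecipe(sources=["math_web", "forum"], steps=[min_length(5), dedup])
corpus = recipe.run(pool)
print([r["text"] for r in corpus])
```

In the setting the abstract describes, a model would be asked to produce such a recipe from a target benchmark and a pool of sources; here the recipe is written by hand purely to show the structure.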
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Artificial Intelligence in Healthcare and Education
