From Symbolic Tasks to Code Generation: Diversification Yields Better Task Performers
Dylan Zhang, Justin Wang, Francois Charton

TL;DR
This paper demonstrates that diversifying instruction-tuning data across a broad semantic space enhances large language models' ability to generalize and follow instructions, especially in code generation tasks.
Contribution
It introduces a theoretical framework using Markov algorithms to analyze instruction tuning and shows that diversity in training tasks improves model robustness and generalization.
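For readers unfamiliar with Markov algorithms, the sketch below shows the standard string-rewriting procedure: scan an ordered rule list, apply the first rule whose pattern occurs in the string at its leftmost occurrence, and repeat until no rule applies or a terminating rule fires. It illustrates the general formalism only and is not the authors' experimental setup. The names `run_markov`, `Rule`, `BINARY_TO_UNARY`, and the `max_steps` guard are hypothetical and chosen for this example.

```python
from typing import List, Tuple

# A rule is (pattern, replacement, is_terminating). Names are illustrative.
Rule = Tuple[str, str, bool]


def run_markov(rules: List[Rule], text: str, max_steps: int = 1000) -> str:
    """Run a Markov algorithm: an ordered list of string-rewriting rules."""
    for _ in range(max_steps):
        for pattern, replacement, terminating in rules:
            idx = text.find(pattern)  # leftmost occurrence, or -1 if absent
            if idx != -1:
                text = text[:idx] + replacement + text[idx + len(pattern):]
                if terminating:
                    return text
                break  # restart the scan from the first rule
        else:
            # No rule matched anywhere in the string: the algorithm halts.
            return text
    # max_steps is an assumed safety guard; some rule sets never halt.
    raise RuntimeError("step limit exceeded")


# Textbook rule set converting a binary numeral into unary tally marks.
BINARY_TO_UNARY: List[Rule] = [
    ("|0", "0||", False),
    ("1", "0|", False),
    ("0", "", False),
]

print(run_markov(BINARY_TO_UNARY, "101"))  # prints |||||
```

In this framing, a rule list plays the role of an instruction and the input string plays the role of the data it is applied to, so varying the rule lists is one way to vary the set of tasks.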
Findings
Diverse instruction sets improve generalization in synthetic experiments.
Increased task diversity enhances code generation performance.
Robustness to distribution shifts increases with instruction set diversity.
Abstract
Instruction tuning -- tuning large language models on instruction-output pairs -- is a promising technique for making models better adapted to the real world. Yet, the key factors driving the model's capability to understand and follow instructions not seen during training remain under-explored. Our investigation begins with a series of synthetic experiments within the theoretical framework of Markov algorithms, a Turing-complete model of computation, which allows fine-grained control over the instruction-tuning data. Generalization and robustness with respect to the training distribution emerge once a diverse enough set of tasks is provided, even though very few examples are provided for each task. We extend these initial results to a real-world application scenario of code generation and find that a more diverse instruction set, extending beyond code-related tasks, improves the performance…
Taxonomy
Topics: Reinforcement Learning in Robotics
Methods: Sparse Evolutionary Training
