SelfCodeAlign: Self-Alignment for Code Generation
Yuxiang Wei, Federico Cassano, Jiawei Liu, Yifeng Ding, Naman Jain, Zachary Mueller, Harm de Vries, Leandro von Werra, Arjun Guha, Lingming Zhang

TL;DR
SelfCodeAlign introduces a transparent, self-supervised pipeline for aligning code LLMs, generating high-quality instruction-response data without human annotations, leading to state-of-the-art coding performance.
Contribution
SelfCodeAlign is the first fully transparent and permissive pipeline for self-aligning code LLMs without extensive human annotations or distillation.
Findings
Achieves 67.1 pass@1 on HumanEval+ with a 7B model.
Outperforms previous instruction tuning methods across benchmarks.
Creates StarCoder2-Instruct, a state-of-the-art, fully transparent code LLM.
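The findings above report a pass@1 score, but this page does not say how it was computed. As a point of reference only, here is a minimal sketch of the widely used unbiased pass@k estimator, assuming n samples are drawn per problem and c of them pass the tests. The function name `pass_at_k` and its parameters are illustrative and do not come from the paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single problem.

    n: number of samples generated for the problem (assumed setup)
    c: number of those samples that pass all tests
    k: budget of attempts being scored
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Probability that a random size-k subset contains no passing sample,
    # subtracted from 1. comb() returns 0 when k > n - c.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(per_problem: list[tuple[int, int]], k: int = 1) -> float:
    """Mean pass@k over (n, c) pairs, one pair per benchmark problem."""
    return sum(pass_at_k(n, c, k) for n, c in per_problem) / len(per_problem)
```

With a single sample per problem (n = 1, k = 1), this reduces to the fraction of problems whose one sample passes.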
Abstract
Instruction tuning is a supervised fine-tuning approach that significantly improves the ability of large language models (LLMs) to follow human instructions. We propose SelfCodeAlign, the first fully transparent and permissive pipeline for self-aligning code LLMs without extensive human annotations or distillation. SelfCodeAlign employs the same base model for inference throughout the data generation process. It first extracts diverse coding concepts from high-quality seed snippets to generate new tasks. It then samples multiple responses per task, pairs each with test cases, and validates them in a sandbox environment. Finally, passing examples are selected for instruction tuning. In our primary experiments, we use SelfCodeAlign with CodeQwen1.5-7B to generate a dataset of 74k instruction-response pairs. Finetuning on this dataset leads to a model that achieves a 67.1 pass@1 on…
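The validation and selection steps described in the abstract can be sketched roughly as follows. This is not the authors' implementation: the `Candidate` type, the function names, the timeout value, and the choice to keep the first passing response per task are all assumptions made for illustration, and a plain subprocess with a timeout only stands in for a real sandbox.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Candidate:
    """Hypothetical record for one sampled response to a generated task."""
    instruction: str
    response: str  # candidate solution code
    tests: str     # test code paired with the response


def passes_in_sandbox(candidate: Candidate, timeout_s: float = 5.0) -> bool:
    """Run solution plus tests in a separate Python process.

    Returns True only if the process exits with code 0 within the timeout.
    A subprocess is a stand-in here; it provides no real isolation.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        script.write_text(candidate.response + "\n\n" + candidate.tests + "\n")
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                timeout=timeout_s,
                cwd=tmp,
            )
        except subprocess.TimeoutExpired:
            return False
        return result.returncode == 0


def select_passing(candidates: list[Candidate]) -> list[Candidate]:
    """Keep at most one passing response per instruction (assumed policy)."""
    selected: dict[str, Candidate] = {}
    for cand in candidates:
        if cand.instruction not in selected and passes_in_sandbox(cand):
            selected[cand.instruction] = cand
    return list(selected.values())
```

In this sketch, a response whose tests raise an assertion error, crash, or hang is simply dropped, so only self-validated instruction-response pairs would reach the fine-tuning set.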
Taxonomy
Topics: Model-Driven Software Engineering Techniques
Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Layer Normalization · Residual Connection · Linear Warmup With Cosine Annealing · Adam · Attention Dropout
