Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency
Leonidas Gee, Milan Gritta, Gerasimos Lampouras, Ignacio Iacobacci

TL;DR
Code-Optimise is a framework that trains code language models on self-generated preference data, using both correctness (passed, failed) and runtime (quick, slow) as learning signals, so that generated solutions are more accurate and run faster.
Contribution
It presents a lightweight, robust method that uses correctness and runtime as training signals, dynamically selects solutions to reduce overfitting, and avoids relying on larger models for learning signals.
Findings
Significant improvements in pass@k.
Competitive baseline runtimes reduced by an additional 6% on in-domain data and by up to 3% on out-of-domain data.
Average solution length reduced by up to 48% on MBPP and 23% on HumanEval.
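For readers unfamiliar with the pass@k metric mentioned above, the following is a minimal sketch of the commonly used unbiased estimator, 1 - C(n-c, k) / C(n, k), where n solutions are sampled per problem and c of them pass the tests. That the paper computes pass@k in exactly this way is an assumption; the function name `pass_at_k` is illustrative only.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of sampled solutions
    c: number of those solutions that pass all tests
    k: sample budget being evaluated (1 <= k <= n)
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # If fewer than k solutions fail, every draw of k contains a pass.
    if n - c < k:
        return 1.0
    # Probability that a random draw of k solutions contains no pass,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

A benchmark-level score would then be the mean of `pass_at_k` over all problems.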
Abstract
Code Language Models have been trained to generate accurate solutions, typically with no regard for runtime. On the other hand, previous works that explored execution optimisation have observed corresponding drops in functional correctness. To that end, we introduce Code-Optimise, a framework that incorporates both correctness (passed, failed) and runtime (quick, slow) as learning signals via self-generated preference data. Our framework is both lightweight and robust as it dynamically selects solutions to reduce overfitting while avoiding a reliance on larger models for learning signals. Code-Optimise achieves significant improvements in pass@k while decreasing the competitive baseline runtimes by an additional 6% for in-domain data and up to 3% for out-of-domain data. As a by-product, the average length of the generated solutions is reduced by up to 48% on MBPP and 23% on HumanEval,…
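The idea of turning correctness and runtime into preference data can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the `Sample` type, the `build_pairs` function, and the use of random subsampling as a stand-in for dynamic solution selection are all hypothetical. Each pair is (chosen, rejected): a passing solution is preferred over a failing one, and a quicker passing solution is preferred over a slower one.

```python
import random
from dataclasses import dataclass
from itertools import product


@dataclass
class Sample:
    code: str        # a solution sampled from the model itself
    passed: bool     # did it pass the unit tests?
    runtime: float   # measured runtime in seconds (used only if passed)


def build_pairs(samples, max_pairs=None, rng=None):
    """Return a list of (chosen_code, rejected_code) preference pairs."""
    passed = [s for s in samples if s.passed]
    failed = [s for s in samples if not s.passed]

    # Correctness signal: passed is preferred over failed.
    pairs = [(p.code, f.code) for p, f in product(passed, failed)]

    # Runtime signal: among passing solutions, quick is preferred over slow.
    by_time = sorted(passed, key=lambda s: s.runtime)
    for i, quick in enumerate(by_time):
        for slow in by_time[i + 1:]:
            if quick.runtime < slow.runtime:  # skip exact ties
                pairs.append((quick.code, slow.code))

    # Assumption: resampling a subset of pairs on each call stands in for
    # dynamic selection, so training does not see the same pairs every epoch.
    if max_pairs is not None and len(pairs) > max_pairs:
        rng = rng or random.Random(0)
        pairs = rng.sample(pairs, max_pairs)
    return pairs
```

Pairs in this form could then be passed to a preference-optimisation trainer; which training objective the paper uses is not stated in this excerpt.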
Taxonomy
Topics: Semantic Web and Ontologies · Model-Driven Software Engineering Techniques · Speech and dialogue systems
