ACECode: A Reinforcement Learning Framework for Aligning Code Efficiency and Correctness in Code Language Models
Chengran Yang, Hong Jin Kang, Jieke Shi, David Lo

TL;DR
ACECode is a reinforcement learning framework that fine-tunes code language models to simultaneously improve code efficiency and correctness without relying on predefined test cases or execution environments.
Contribution
ACECode introduces a reward-based reinforcement learning method for fine-tuning CodeLLMs that optimizes the efficiency and correctness of generated code together.
Findings
Significantly improves pass@1 accuracy across four SOTA CodeLLMs.
Reduces runtime relative to the original models in 65% to 72% of cases.
Outperforms baselines such as instruction-tuned and PIE-tuned CodeLLMs.
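For readers unfamiliar with the pass@1 metric cited above, here is a minimal sketch of the widely used unbiased pass@k estimator, which reduces to the fraction of correct samples when k = 1. This summary does not say how the paper computes pass@1, so treating it as this estimator is an assumption; the names `pass_at_k` and `mean_pass_at_k` are hypothetical and chosen for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of generated samples, c: number that pass all tests,
    k: evaluation budget. Assumed standard estimator, not taken from
    the paper summary: 1 - C(n - c, k) / C(n, k).
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: any draw of k contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, given (n, c) pairs per problem."""
    if not results:
        raise ValueError("results must be non-empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With k = 1 the estimate for a single problem is simply c / n, so a benchmark-level pass@1 is the mean per-problem fraction of correct samples.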
Abstract
CodeLLMs have demonstrated remarkable advancements in software engineering tasks. However, while these models can generate functionally correct code, they often produce code that is inefficient in terms of runtime. This inefficiency is particularly problematic in resource-constrained environments, impacting software performance and sustainability. Existing approaches for optimizing code efficiency for CodeLLMs like SOAP and PIE exhibit certain limitations. SOAP requires a compatible execution environment and predefined test cases for iterative code modification, while PIE focuses on instruction tuning, improving efficiency but compromising correctness. These shortcomings highlight the need for a fine-tuning framework that optimizes both efficiency and correctness without relying on predefined test cases or specific execution environments. To bridge this gap, we introduce ACECode, a…
Taxonomy
Topics: Software Engineering Research · Software System Performance and Reliability · Software Reliability and Analysis Research
