Measuring Code Efficiency Optimization Capabilities with ACEOB
Yue Pan, Xiuting Shao, Chen Lyu

TL;DR
This paper introduces ACEOB, a novel benchmark dataset with over 95,000 Python code pairs, and new metrics to evaluate models' abilities to optimize code efficiency, revealing room for improvement even in advanced models.
Contribution
The paper presents ACEOB, the first dataset dedicated to Python code efficiency optimization, along with two novel metrics for assessing model performance in this task.
Findings
ACEOB contains 95,359 code pairs for benchmarking.
Fine-tuning models on ACEOB improves efficiency scores.
Even state-of-the-art models like ChatGPT show suboptimal performance.
Abstract
As Moore's Law gains diminish, software performance and efficiency become increasingly vital. Optimizing code efficiency is challenging, even for professional programmers. However, related research remains relatively scarce, and rigorously assessing models' abilities to optimize code efficiency is fraught with difficulties. In response to this challenge, we first conduct an in-depth analysis of "code patterns" in the model training dataset, meticulously exploring human-written code. Second, we define a task for optimizing code efficiency and introduce the Automatic Code Efficiency Optimization Benchmark (ACEOB), which consists of 95,359 pairs of efficient-inefficient code aimed at assessing code efficiency optimization capabilities. To our knowledge, ACEOB is the first dataset specifically targeting Python code efficiency optimization. To evaluate models' ability to optimize code…
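To make the idea of an efficient-inefficient code pair concrete, here is a minimal sketch in Python. It is not taken from ACEOB: the problem, the function names `count_pairs_slow` and `count_pairs_fast`, and the helper `compare_runtime` are all hypothetical, and wall-clock timing with `timeit` is only an assumed stand-in for the paper's own metrics, which are not described in the text above.

```python
import timeit
from collections import Counter


def count_pairs_slow(nums, target):
    """Hypothetical 'inefficient' half of a pair: checks every pair, O(n^2)."""
    count = 0
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                count += 1
    return count


def count_pairs_fast(nums, target):
    """Hypothetical 'efficient' half: one pass with a frequency table, O(n)."""
    seen = Counter()  # how often each value has appeared so far
    count = 0
    for x in nums:
        count += seen[target - x]  # earlier values that complete a pair with x
        seen[x] += 1
    return count


def compare_runtime(nums, target, repeats=3):
    """Return (slow_seconds, fast_seconds) as total time over `repeats` runs.

    Assumption: plain wall-clock time is used here for illustration only.
    """
    slow = timeit.timeit(lambda: count_pairs_slow(nums, target), number=repeats)
    fast = timeit.timeit(lambda: count_pairs_fast(nums, target), number=repeats)
    return slow, fast


if __name__ == "__main__":
    data = list(range(2000))
    # Both versions must agree on the answer before their speed is compared.
    assert count_pairs_slow(data, 1999) == count_pairs_fast(data, 1999)
    slow_s, fast_s = compare_runtime(data, 1999)
    print(f"slow: {slow_s:.4f}s  fast: {fast_s:.4f}s")
```

Both functions return the same result and differ only in cost, which is the property a benchmark of this kind relies on: an optimized output counts only if it stays functionally equivalent to the inefficient input.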
Taxonomy
Topics: Embedded Systems Design Techniques · Simulation Techniques and Applications · Parallel Computing and Optimization Techniques
