COFFE: A Code Efficiency Benchmark for Code Generation
Yun Peng, Jun Wan, Yichen Li, Xiaoxue Ren

TL;DR
COFFE is a new benchmark for evaluating the time efficiency of code generated by large language models. It addresses the limitations of existing correctness-focused benchmarks by providing stable stress test cases and a novel efficiency metric.
Contribution
The paper introduces COFFE, a benchmark with stress test cases and a new efficiency metric, enabling accurate evaluation of the time efficiency of generated code.
Findings
Four key insights about LLM performance on COFFE
Identification of challenges in current time efficiency evaluation
Implications for future LLM research and application
Abstract
Code generation has greatly improved development efficiency in the era of large language models (LLMs). With the ability to follow instructions, current LLMs can be prompted to generate code solutions given detailed descriptions in natural language. Many research efforts are devoted to improving the correctness of LLM-generated code, and many benchmarks have been proposed to evaluate correctness comprehensively. Despite this focus on correctness, the time efficiency of LLM-generated code solutions is under-explored. Current correctness benchmarks are not suitable for time efficiency evaluation, since their test cases cannot adequately distinguish the time efficiency of different code solutions. In addition, current execution time measurement is neither stable nor comprehensive, threatening the validity of the time efficiency evaluation. To address the challenges in the time efficiency…
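The abstract's point that correctness test cases cannot distinguish the time efficiency of different solutions can be illustrated with a small sketch. This is not taken from the paper: the two functions, the inputs, and the step counter below are all hypothetical, and the counter is only a deterministic stand-in for a cost measurement, not necessarily the metric COFFE uses.

```python
# Hypothetical illustration: two correct solutions to "does the list contain
# a duplicate?", each returning its answer plus a deterministic step count.
# The step count is an assumption of this sketch, used instead of wall-clock
# time so that repeated runs give identical numbers.

def has_duplicate_quadratic(values):
    """Compare every pair; returns (answer, number of comparisons)."""
    steps = 0
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            steps += 1
            if values[i] == values[j]:
                return True, steps
    return False, steps


def has_duplicate_linear(values):
    """Track seen values in a set; returns (answer, number of lookups)."""
    steps = 0
    seen = set()
    for v in values:
        steps += 1
        if v in seen:
            return True, steps
        seen.add(v)
    return False, steps


small_case = [3, 1, 2]            # typical correctness-style input
stress_case = list(range(2000))   # larger input with no duplicates

small_quadratic = has_duplicate_quadratic(small_case)
small_linear = has_duplicate_linear(small_case)
stress_quadratic = has_duplicate_quadratic(stress_case)
stress_linear = has_duplicate_linear(stress_case)

print("small :", small_quadratic, small_linear)
print("stress:", stress_quadratic, stress_linear)
```

On the three-element input both solutions report the same answer and the same step count, so a test of that size cannot separate them. On the 2000-element input the pairwise version performs 1,999,000 comparisons against 2,000 set lookups, which is the kind of gap a stress test case is meant to expose.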
Taxonomy
Topics: Embedded Systems Design Techniques · Real-time simulation and control systems · Parallel Computing and Optimization Techniques
