On the Impacts of Contexts on Repository-Level Code Generation
Nam Le Hai, Dung Manh Nguyen, Nghi D. Q. Bui

TL;DR
This paper introduces RepoExec, a benchmark for evaluating repository-level code generation, emphasizing context utilization, correctness, and debugging, and presents findings on model performance with new datasets and metrics.
Contribution
The paper presents RepoExec, a novel benchmark with accompanying datasets and metrics for assessing repository-level code generation, focusing on context handling and functional correctness.
Findings
Pretrained LLMs achieve higher functional correctness than instruction-tuned models.
Instruction-tuned models make better use of the provided dependencies.
RepoExec effectively evaluates code functionality and developer intent alignment.
Abstract
CodeLLMs have gained widespread adoption for code generation tasks, yet their capacity to handle repository-level code generation with complex contextual dependencies remains underexplored. Our work underscores the critical importance of leveraging repository-level contexts to generate executable and functionally correct code. We present RepoExec, a novel benchmark designed to evaluate repository-level code generation, with a focus on three key aspects: executability, functional correctness through comprehensive test case generation, and accurate utilization of cross-file contexts. Our study examines a controlled scenario where developers specify essential code dependencies (contexts), challenging models to integrate them effectively. Additionally, we introduce an instruction-tuned dataset that enhances CodeLLMs' ability to leverage dependencies, along with a new metric, Dependency…
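To make the idea of "utilization of cross-file contexts" more concrete, the sketch below checks which developer-specified dependencies a generated function actually references. It is an illustration only and is not the paper's metric, whose definition is not given in this summary. The helper names `used_dependencies` and `usage_ratio`, the sample code string, and the dependency names are all hypothetical.

```python
import ast
from typing import Set


def used_dependencies(generated_code: str, provided: Set[str]) -> Set[str]:
    """Return the provided dependency names referenced in the generated code.

    Hypothetical helper: matches by identifier only, so it cannot tell
    apart two different objects that share a name.
    """
    tree = ast.parse(generated_code)
    referenced = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            # Bare identifiers such as apply_tax(...)
            referenced.add(node.id)
        elif isinstance(node, ast.Attribute):
            # Attribute accesses such as utils.apply_tax(...)
            referenced.add(node.attr)
    return provided & referenced


def usage_ratio(generated_code: str, provided: Set[str]) -> float:
    """Fraction of provided dependencies that the generated code references."""
    if not provided:
        return 0.0
    return len(used_dependencies(generated_code, provided)) / len(provided)


# Assumed example: a model output and the dependencies given as context.
generated = (
    "def total_price(items):\n"
    "    return round_money(sum(apply_tax(i) for i in items))\n"
)
provided = {"apply_tax", "round_money", "Currency"}

print(sorted(used_dependencies(generated, provided)))
print(usage_ratio(generated, provided))
```

A name-based check like this only shows that a dependency is mentioned, not that it is used correctly, which is why a benchmark of this kind also executes the generated code against test cases.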
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Model-Driven Software Engineering Techniques
