Class-Level Code Generation from Natural Language Using Iterative, Tool-Enhanced Reasoning over Repository
Ajinkya Deshpande, Anmol Agarwal, Shashank Shet, Arun Iyer, Aditya Kanade, Ramakrishna Bairi, Suresh Parthasarathy

TL;DR
This paper introduces RepoClassBench, a benchmark for evaluating large language models on class-level code generation within real-world repositories, and proposes RRR, a tool-enhanced iterative reasoning approach that significantly improves performance on that benchmark.
Contribution
The paper presents a new benchmark for class-level code generation in repositories and a novel tool-augmented method that enhances LLMs' ability to handle complex, real-world software dependencies.
Findings
RRR outperforms existing models on RepoClassBench
Models struggle with repository-level context without specialized tools
Tool-enhanced reasoning improves code generation accuracy
Abstract
LLMs have demonstrated significant potential in code generation tasks, achieving promising results at the function or statement level across various benchmarks. However, the complexities associated with creating code artifacts like classes, particularly within the context of real-world software repositories, remain underexplored. Prior research treats class-level generation as an isolated task, neglecting the intricate dependencies and interactions that characterize real-world software environments. To address this gap, we introduce RepoClassBench, a comprehensive benchmark designed to rigorously evaluate LLMs in generating complex, class-level code within real-world repositories. RepoClassBench includes "Natural Language to Class generation" tasks across Java, Python, and C# from a selection of repositories. We ensure that each class in our dataset not only has cross-file dependencies…
Taxonomy
Topics: Educational Technology and Assessment
