Scaling Test-Driven Code Generation from Functions to Classes: An Empirical Study

Yunhao Liang; Ruixuan Ying; Shiwen Ni; Zhe Cui

arXiv:2602.03557 · cs.SE · February 4, 2026

TL;DR

This study extends test-driven code generation from functions to classes, demonstrating that an iterative TDD framework significantly improves class-level correctness and reliability across multiple large language models.

Contribution

The paper introduces a scalable TDD framework for class-level code generation, including a new evaluation dataset and empirical analysis across eight LLMs.

Findings

01. Class-level correctness improved by 12 to 26 points.

02. Up to 71% of classes are fully correct after TDD.

03. The framework requires only a small number of repair iterations on average.

Abstract

Test-driven development (TDD) has been adopted to improve Large Language Model (LLM)-based code generation by using tests as executable specifications. However, existing TDD-style code generation studies are largely limited to function-level tasks, leaving class-level synthesis, where multiple methods interact through shared state and call dependencies, underexplored. In this paper, we scale test-driven code generation from functions to classes via an iterative TDD framework. Our approach first analyzes intra-class method dependencies to derive a feasible generation schedule, and then incrementally implements each method against method-level public tests, using reflection-style execution feedback and a bounded number of repair iterations. To support test-driven generation and rigorous class-level evaluation, we construct ClassEval-TDD, a cleaned and standardized variant of ClassEval with consistent…
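The two-stage procedure described in the abstract (derive a generation schedule from method dependencies, then implement each method under tests with bounded repairs) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `generate` and `run_tests` callables, the dependency dictionary format, and the default repair budget of 3 are all assumptions made for illustration. The schedule is approximated here by a plain topological sort from the standard library.

```python
from graphlib import TopologicalSorter
from typing import Callable, Optional


def generation_schedule(deps: dict[str, set[str]]) -> list[str]:
    """Order methods so that each one comes after the methods it depends on.

    `deps` maps a method name to the sibling methods it calls (assumed format).
    Raises graphlib.CycleError if the dependencies are cyclic.
    """
    return list(TopologicalSorter(deps).static_order())


def implement_class(
    deps: dict[str, set[str]],
    generate: Callable[[str, dict[str, str], Optional[str]], str],
    run_tests: Callable[[str, str], Optional[str]],
    max_repairs: int = 3,  # hypothetical budget, not taken from the paper
) -> tuple[dict[str, str], dict[str, int]]:
    """Implement methods one by one, repairing each at most `max_repairs` times.

    `generate(method, implemented_so_far, feedback)` stands in for an LLM call.
    `run_tests(method, code)` returns None on success, or a failure message
    that is fed back into the next generation attempt.
    """
    implemented: dict[str, str] = {}
    repairs_used: dict[str, int] = {}
    for method in generation_schedule(deps):
        feedback: Optional[str] = None
        code = ""
        attempt = 0
        for attempt in range(max_repairs + 1):
            code = generate(method, dict(implemented), feedback)
            feedback = run_tests(method, code)
            if feedback is None:  # all public tests for this method pass
                break
        implemented[method] = code  # keep the last attempt even if it still fails
        repairs_used[method] = attempt
    return implemented, repairs_used
```

In this sketch, a method that still fails after the budget is exhausted is kept as-is so that later methods can proceed; whether the paper's framework handles that case the same way is not stated in the abstract.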

Taxonomy

Topics: Software Engineering Research · Software Testing and Debugging Techniques · Model-Driven Software Engineering Techniques