TDD Governance for Multi-Agent Code Generation via Prompt Engineering
Tarlan Hasanli, Shahbaz Siddeeq, Bishwash Khanal, Pyry Kotilainen, Tommi Mikkonen, Pekka Abrahamsson

TL;DR
This paper introduces a structured TDD framework for multi-agent code generation with LLMs that enforces discipline and stability through prompt-level governance and a layered architecture.
Contribution
It formalizes classical TDD principles into a machine-readable manifesto and integrates them into an architecture that improves LLM code generation reliability.
Findings
Enforces phase ordering and validation gates
Improves stability and reproducibility of LLM code generation
Separates proposal and authority layers for better control
Abstract
Large language models (LLMs) accelerate software development but often exhibit instability, non-determinism, and weak adherence to development discipline in unconstrained workflows. While test-driven development (TDD) provides a structured Red-Green-Refactor process, existing LLM-based approaches typically use tests as auxiliary inputs rather than enforceable process constraints. We present an AI-native TDD framework that operationalizes classical TDD principles as structured prompt-level and workflow-level governance mechanisms. Extracted principles are formalized in a machine-readable manifesto and distributed across planning, generation, repair, and validation stages within a layered architecture that separates model proposal from deterministic engine authority. The system enforces phase ordering, bounded repair loops, validation gates, and atomic mutation control to improve…
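To make the separation between model proposal and engine authority concrete, here is a minimal Python sketch of one Red-Green-Refactor cycle with a bounded repair loop and validation gates. It is an illustration under stated assumptions, not the authors' implementation: the names `Engine`, `Phase`, `run_cycle`, `max_repairs`, `stub_proposer`, and `stub_validator` are all hypothetical, the "model" is a hard-coded stub, and the "test suite" is a single check on a function called `add`.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional


class Phase(Enum):
    RED = "red"
    GREEN = "green"
    REFACTOR = "refactor"


# Hypothetical roles: the proposer stands in for the LLM and may only suggest
# code; the validator stands in for a deterministic test runner.
Proposer = Callable[[Phase, str, Optional[str]], str]
Validator = Callable[[str], bool]


@dataclass
class Engine:
    """Deterministic authority: only this class decides what gets committed."""
    propose: Proposer
    tests_pass: Validator
    max_repairs: int = 3      # assumed repair budget, not a value from the paper
    accepted: str = ""        # last code state the engine committed
    log: List[str] = field(default_factory=list)

    def run_cycle(self, task: str) -> bool:
        # RED gate: the tests must fail before any implementation is accepted.
        if self.tests_pass(self.accepted):
            self.log.append("red: rejected (tests already pass)")
            return False
        self.log.append("red: ok")

        # GREEN: bounded repair loop; each failed candidate is fed back.
        candidate: Optional[str] = None
        for attempt in range(1, self.max_repairs + 1):
            candidate = self.propose(Phase.GREEN, task, candidate)
            if self.tests_pass(candidate):
                self.accepted = candidate  # single all-or-nothing commit
                self.log.append(f"green: ok after {attempt} attempt(s)")
                break
            self.log.append(f"green: attempt {attempt} failed")
        else:
            self.log.append("green: repair budget exhausted")
            return False

        # REFACTOR gate: keep the refactoring only if the tests still pass.
        refactored = self.propose(Phase.REFACTOR, task, self.accepted)
        if self.tests_pass(refactored):
            self.accepted = refactored
            self.log.append("refactor: ok")
        else:
            self.log.append("refactor: rejected, kept green version")
        return True


def stub_proposer(phase: Phase, task: str, previous: Optional[str]) -> str:
    """Stand-in for a model: the first GREEN proposal is deliberately broken."""
    if phase is Phase.GREEN:
        return "broken" if previous is None else "def add(a, b): return a + b"
    return "def add(a, b):\n    return a + b"


def stub_validator(code: str) -> bool:
    """Stand-in for a test suite: checks that add(2, 3) == 5."""
    namespace: dict = {}
    try:
        exec(code, namespace)
        return namespace["add"](2, 3) == 5
    except Exception:
        return False


engine = Engine(propose=stub_proposer, tests_pass=stub_validator)
ok = engine.run_cycle("add two numbers")
```

In this sketch the proposer never writes to `accepted` directly; the engine commits a candidate only after the validator passes, and it gives up once `max_repairs` proposals have failed. That mirrors, in miniature, the phase ordering, bounded repair loops, and validation gates the abstract describes.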
