TDFlow: Agentic Workflows for Test Driven Development

Kevin Han; Siddharth Maddikayala; Tim Knappe; Om Patel; Austen Liao; Amir Barati Farimani

arXiv:2510.23761 · cs.SE · January 23, 2026

TL;DR

TDFlow is a test-driven agentic workflow that decomposes repository-scale software repair into specialized sub-agents, reaching pass rates of 88.8% on SWE-Bench Lite and 94.3% on SWE-Bench Verified.

Contribution

Introduces TDFlow, a test-driven agentic workflow with specialized sub-agents for repository-scale software repair, reducing complexity and improving success rates over existing systems.

Findings

01 Achieves 88.8% pass rate on SWE-Bench Lite, 94.3% on SWE-Bench Verified

02 Reduces long-context burden on sub-agents, enabling focused task performance

03 Manual inspection shows minimal test hacking, indicating robustness

Abstract

We introduce TDFlow, a novel test-driven agentic workflow that frames repository-scale software engineering as a test-resolution task, specifically designed to solve human-written tests. Given a set of tests, TDFlow repeatedly proposes, revises, and debugs repository-scale patches using precisely engineered sub-agents and tightly constrained tools. The workflow decomposes software engineering program repair into four components governed by respective sub-agents. This simple, forced decoupling of patch proposing, debugging, patch revision, and optional test generation (1) reduces long-context burden on any individual sub-agent, (2) focuses each sub-agent on specific, pre-defined sub-tasks, and (3) allows for specialized performance improvement on specific sub-tasks. When provided human-written tests, TDFlow attains 88.8% pass rate on SWE-Bench Lite (an absolute improvement of 27.8% over…
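The propose, debug, and revise cycle described in the abstract can be pictured as a simple control loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: the names `resolve_tests` and `LoopResult`, the callable signatures, and the `max_iterations` budget are all hypothetical, each sub-agent is modeled as a plain function, and the optional test-generation component is left out.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical representation: a patch is just a string (e.g. a unified diff).
Patch = str


@dataclass
class LoopResult:
    patch: Patch       # last patch that was tested
    iterations: int    # number of test runs performed
    resolved: bool     # True if every provided test passed


def resolve_tests(
    propose: Callable[[List[str]], Patch],
    run_tests: Callable[[Patch], List[str]],
    debug: Callable[[Patch, List[str]], str],
    revise: Callable[[Patch, str], Patch],
    tests: List[str],
    max_iterations: int = 5,
) -> LoopResult:
    """Illustrative loop: propose a patch, then debug and revise until tests pass.

    Each callable stands in for one sub-agent (assumed interface):
      propose   -> initial patch from the test names
      run_tests -> names of tests still failing under a patch
      debug     -> textual report on why those tests fail
      revise    -> new patch produced from the old patch and the report
    """
    patch = propose(tests)
    for iteration in range(1, max_iterations + 1):
        failing = run_tests(patch)
        if not failing:
            return LoopResult(patch, iteration, True)
        if iteration == max_iterations:
            break  # budget exhausted; do not produce an untested patch
        # Each sub-agent sees only what it needs: the debugger gets the patch
        # and the failures, the reviser gets the patch and the report.
        report = debug(patch, failing)
        patch = revise(patch, report)
    return LoopResult(patch, max_iterations, False)
```

Keeping each sub-agent behind a narrow function signature mirrors the decoupling the abstract emphasizes: no single step has to carry the full history of earlier attempts in its context.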
