A Systematic Study of LLM-Based Architectures for Automated Patching

Qingxiao Xu; Ze Sheng; Zhicheng Chen; Jeff Huang

arXiv:2603.01257 · cs.CR · March 3, 2026

Open Access

TL;DR

This study systematically compares four LLM-based automated patching architectures, revealing that design choices significantly impact effectiveness, efficiency, and robustness, with general-purpose code agents showing the best overall performance.

Contribution

It provides a controlled evaluation of different LLM patching architectures using a unified benchmark, highlighting the importance of architectural design over model capability.

Findings

01. Multi-agent systems improve generalization but incur substantially higher overhead.

02. General-purpose code agents outperform the other architectures overall.

03. Fixed workflows are efficient but brittle.

Abstract

Large language models (LLMs) have shown promise for automated patching, but their effectiveness depends strongly on how they are integrated into patching systems. While prior work explores prompting strategies and individual agent designs, the field lacks a systematic comparison of patching architectures. In this paper, we present a controlled evaluation of four LLM-based patching paradigms -- fixed workflow, single-agent system, multi-agent system, and general-purpose code agents -- using a unified benchmark and evaluation framework. We analyze patch correctness, failure modes, token usage, and execution time across real-world vulnerability tasks. Our results reveal clear architectural trade-offs: fixed workflows are efficient but brittle, single-agent systems balance flexibility and cost, and multi-agent designs improve generalization at the expense of substantially higher overhead…
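The abstract compares architectures on patch correctness, token usage, and execution time. As a rough illustration of what such a per-architecture comparison might look like, here is a minimal sketch. The `PatchRun` record layout, its field names, and the `summarize` helper are all assumptions made for this example; they are not the paper's framework, and failure-mode analysis is left out entirely.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchRun:
    """One patching attempt (hypothetical record layout, not from the paper)."""
    architecture: str  # e.g. "fixed_workflow", "single_agent", "multi_agent", "code_agent"
    task_id: str       # identifier of the vulnerability task
    correct: bool      # whether the patch passed the correctness check
    tokens: int        # total LLM tokens consumed by the attempt
    seconds: float     # wall-clock execution time of the attempt


def summarize(runs):
    """Group runs by architecture and report success rate and mean cost."""
    grouped = defaultdict(list)
    for run in runs:
        grouped[run.architecture].append(run)

    summary = {}
    for arch, arch_runs in grouped.items():
        n = len(arch_runs)
        summary[arch] = {
            "runs": n,
            "success_rate": sum(r.correct for r in arch_runs) / n,
            "mean_tokens": sum(r.tokens for r in arch_runs) / n,
            "mean_seconds": sum(r.seconds for r in arch_runs) / n,
        }
    return summary
```

Reporting correctness and cost side by side for each architecture is what makes the trade-offs described in the abstract visible: a design can lead on success rate while costing far more in tokens and time.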

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Software System Performance and Reliability · Software Reliability and Analysis Research