SelfHeal: Empirical Fix Pattern Analysis and Bug Repair in LLM Agents

Niful Islam; Muhammad Anas Raza; Mohammad Wardat

arXiv:2604.17699 · cs.SE · April 21, 2026

TL;DR

This paper presents an empirical study of bug fix patterns in LLM agents, introduces a new bug dataset, and proposes SelfHeal, a multi-agent system that effectively repairs bugs in LLM agents.

Contribution

The paper presents the first empirical analysis of bug fix patterns in LLM agents, introduces AgentDefect, the first benchmark dataset for bugs in LLM agents, and develops SelfHeal, a multi-agent repair system that outperforms existing methods.

Findings

01 SelfHeal significantly outperforms baseline approaches.

02 The AgentDefect dataset contains 37 runtime buggy instances, each with fixed code and test files.

03 Bug fix patterns vary across platforms and languages.

Abstract

Large Language Models (LLMs) have transformed software development and AI applications. While LLMs are designed for text processing, LLM agents extend this capability by enabling autonomous actions, tool use, and multi-step task completion. As this field grows, developers face new challenges in debugging these complex systems. To address this challenge, we present the first empirical study on bug fix patterns in LLM agents. We study buggy posts and code snippets from three platforms: Stack Overflow, GitHub, and HuggingFace Forums. We examine their fix patterns, the components where fixes are applied, and the programming languages and frameworks involved. Furthermore, we introduce AgentDefect, the first benchmark dataset for bugs in LLM agents. The dataset contains 37 runtime buggy instances along with fixed code and test files. Finally, we present SelfHeal, a multi-agent system designed…
