SAFEdit: Does Multi-Agent Decomposition Resolve the Reliability Challenges of Instructed Code Editing?
Noam Tarshish, Nofar Selouk, Daniel Hodisan, Bar Ezra Gafniel, Yuval Elovici, Asaf Shabtai, Eliya Nachmani

TL;DR
SAFEdit introduces a multi-agent framework for instructed code editing that decomposes tasks into specialized roles, significantly improving success rates and reliability over single-agent methods on the EditBench benchmark.
Contribution
This work presents SAFEdit, a multi-agent system with explicit planning, editing, and verification components that improves both the performance and the interpretability of instruction-driven code editing.
Findings
SAFEdit achieved a 68.6% task success rate (TSR) on EditBench, outperforming the single-agent baselines.
Iterative refinement contributed 17.4 percentage points of that success rate.
SAFEdit reduced instruction-level hallucinations compared to single-agent approaches.
Abstract
Instructed code editing is a significant challenge for large language models (LLMs). On the EditBench benchmark, 39 of 40 evaluated models obtain a task success rate (TSR) below 60 percent, highlighting a gap between general code generation and the ability to perform instruction-driven editing under executable test constraints. To address this, we propose SAFEdit, a multi-agent framework for instructed code editing that decomposes the editing process into specialized roles to improve reliability and reduce unintended code changes. A Planner Agent produces an explicit, visibility-aware edit plan, an Editor Agent applies minimal, literal code modifications, and a Verifier Agent executes real test runs. When tests fail, SAFEdit uses a Failure Abstraction Layer (FAL) to transform raw test logs into structured diagnostic feedback, which is fed back to the Editor to support iterative…
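The plan, edit, verify, and feedback cycle described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Diagnostic`, `abstract_failures`, `run_safedit_loop`, the `max_rounds` budget, and the stub agents are all hypothetical, the agents are plain callables standing in for LLM calls, and the log format parsed here is assumed to resemble pytest's short summary lines (`FAILED <test id> - <ErrorType>: <message>`).

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Diagnostic:
    """One structured failure record (hypothetical FAL output schema)."""
    test_name: str
    error_type: str
    message: str


@dataclass
class EditResult:
    code: str
    passed: bool
    rounds: int


# Assumed log format: "FAILED <test id> - <ErrorType>: <message>"
_FAILED_LINE = re.compile(r"^FAILED\s+(\S+)\s+-\s+(\w+):\s*(.*)$", re.MULTILINE)


def abstract_failures(raw_log: str) -> List[Diagnostic]:
    """Stand-in for a Failure Abstraction Layer: raw log -> structured records."""
    return [Diagnostic(*m.groups()) for m in _FAILED_LINE.finditer(raw_log)]


def run_safedit_loop(
    code: str,
    instruction: str,
    planner: Callable[[str, str], str],
    editor: Callable[[str, str, List[Diagnostic]], str],
    verifier: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 3,  # assumed refinement budget, not taken from the paper
) -> EditResult:
    plan = planner(code, instruction)      # Planner: explicit edit plan
    feedback: List[Diagnostic] = []
    current = code
    for round_idx in range(1, max_rounds + 1):
        current = editor(current, plan, feedback)  # Editor: apply the edit
        passed, raw_log = verifier(current)        # Verifier: run the tests
        if passed:
            return EditResult(current, True, round_idx)
        feedback = abstract_failures(raw_log)      # FAL: structured feedback
    return EditResult(current, False, max_rounds)


# --- Stub agents for demonstration only (real agents would call an LLM) ---

def stub_planner(code: str, instruction: str) -> str:
    return "Replace the subtraction in add() with an addition."


def stub_editor(code: str, plan: str, feedback: List[Diagnostic]) -> str:
    # The first attempt leaves the bug in place; the retry uses the feedback.
    return code.replace("a - b", "a + b") if feedback else code


def stub_verifier(candidate: str) -> Tuple[bool, str]:
    namespace: dict = {}
    exec(candidate, namespace)  # acceptable here: trusted demo input only
    got = namespace["add"](2, 3)
    if got == 5:
        return True, ""
    return False, f"FAILED test_math.py::test_add - AssertionError: add(2, 3) returned {got}"


if __name__ == "__main__":
    buggy = "def add(a, b):\n    return a - b\n"
    result = run_safedit_loop(
        buggy, "Make add return the sum", stub_planner, stub_editor, stub_verifier
    )
    print(result.passed, result.rounds)
```

In this sketch each retry edits the most recent candidate rather than restarting from the original code; that is a design choice made for the illustration, and the abstract does not say which variant SAFEdit uses.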
