To Diff or Not to Diff? Structure-Aware and Adaptive Output Formats for Efficient LLM-based Code Editing
Wei Cheng, Yongchang Cao, Chen Shen, Binhua Li, Jue Chen, Yongbin Li, Wei Hu

TL;DR
This paper introduces structure-aware diff formats and an adaptive editing strategy for LLM-based code editing, improving efficiency on long-code editing tasks while maintaining accuracy.
Contribution
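It proposes BlockDiff and FuncDiff, two structure-aware diff formats that express changes as block-level rewrites of syntactically coherent units such as control structures and functions, and AdaEdit, an adaptive strategy that trains LLMs to choose the most token-efficient edit format.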
Findings
AdaEdit reduces latency and cost by over 30% on long-code editing tasks.
Structure-aware diff formats improve generation naturalness and accuracy.
Adaptive format selection maintains full-code accuracy while enhancing efficiency.
Abstract
Large Language Models (LLMs) are increasingly used for code editing, yet the prevalent full-code generation paradigm suffers from severe efficiency bottlenecks, posing challenges for interactive coding assistants that demand low latency and cost. Despite the predominant focus on scaling model capabilities, the edit format itself has been largely overlooked in model training. In this paper, we begin with a systematic study of conventional diff formats and reveal that fragile offsets and fragmented hunks make generation highly unnatural for LLMs. To address this, we introduce BlockDiff and FuncDiff, two structure-aware diff formats that represent changes as block-level rewrites of syntactically coherent units such as control structures and functions. Furthermore, we propose AdaEdit, a general adaptive edit strategy that trains LLMs to dynamically choose the most token-efficient format…
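The two ideas in the abstract, block-level rewrites and choosing the cheaper output format, can be illustrated with a minimal Python sketch. This is not the paper's BlockDiff or FuncDiff syntax, which the abstract does not show, and AdaEdit trains the model to make the choice itself; the hand-written rule below is only a stand-in. The names `apply_block_edit`, `approx_tokens`, and `choose_format` are hypothetical, and the whitespace-based token count is an assumption in place of a real tokenizer.

```python
def apply_block_edit(source: str, old_block: str, new_block: str) -> str:
    """Replace exactly one occurrence of old_block with new_block.

    The block is located by its content rather than by a line offset, so
    the edit still applies if unrelated lines above it have shifted.
    """
    count = source.count(old_block)
    if count != 1:
        raise ValueError(f"expected exactly one match for the block, found {count}")
    return source.replace(old_block, new_block, 1)


def approx_tokens(text: str) -> int:
    """Crude whitespace-based stand-in for a real tokenizer (assumption)."""
    return len(text.split())


def choose_format(source: str, old_block: str, new_block: str) -> str:
    """Return 'block' if the block-level edit is cheaper to emit, else 'full'.

    Hand-written heuristic for illustration only; the paper trains the
    model to pick the format instead of applying a fixed rule.
    """
    edited = apply_block_edit(source, old_block, new_block)
    block_cost = approx_tokens(old_block) + approx_tokens(new_block)
    full_cost = approx_tokens(edited)
    return "block" if block_cost < full_cost else "full"


# Hypothetical example: one function in a two-function file gets a guard.
SOURCE = '''def total(items):
    result = 0
    for item in items:
        result += item
    return result


def mean(items):
    return total(items) / len(items)
'''

OLD = '''def mean(items):
    return total(items) / len(items)
'''

NEW = '''def mean(items):
    if not items:
        return 0.0
    return total(items) / len(items)
'''

if __name__ == "__main__":
    print(choose_format(SOURCE, OLD, NEW))  # file larger than the edited block
    print(choose_format(OLD, OLD, NEW))     # file consists of the block alone
```

When the edited function is a small part of the file, restating only that function costs fewer tokens than regenerating everything; when the file is little more than the function itself, full-code output is cheaper because the block form has to emit both the old and the new text.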
