RePaCA: Leveraging Reasoning Large Language Models for Static Automated Patch Correctness Assessment
Marcos Fuster-Pena, David de-Fitero-Dominguez, Antonio Garcia-Cabot, Eva Garcia-Lopez

TL;DR
RePaCA introduces a novel static patch correctness assessment method leveraging reasoning large language models, significantly improving accuracy, generalization, and explainability in identifying overfitting patches in automated program repair.
Contribution
The paper presents RePaCA, a static APCA technique that uses fine-tuned reasoning LLMs with Chain of Thought prompting and reinforcement learning, achieving state-of-the-art performance.
Findings
Achieves 83.1% accuracy and 84.8% F1-score on a Defects4J-derived test set.
Outperforms existing static APCA techniques in accuracy and generalization.
Provides enhanced explainability through reasoning-based assessment.
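For readers unfamiliar with how the two reported metrics relate, the following is a minimal sketch of computing accuracy and F1-score for a binary patch classifier. The label names, the choice of "correct" as the positive class, and the toy labels are all assumptions made for illustration; they are not taken from the paper or its dataset.

```python
def classification_metrics(y_true, y_pred, positive="correct"):
    """Return (accuracy, precision, recall, f1) for binary labels.

    `positive` names the class treated as positive; which class the
    paper treats as positive is not stated here, so this is an assumption.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, precision, recall, f1


# Hypothetical toy labels, not data from the paper.
y_true = ["correct", "correct", "overfitting", "overfitting", "correct"]
y_pred = ["correct", "overfitting", "overfitting", "correct", "correct"]

accuracy, precision, recall, f1 = classification_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.3f} f1={f1:.3f}")
```

Accuracy counts every agreement between prediction and ground truth, while F1 depends only on the positive class, so the two can diverge when the classes are imbalanced.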
Abstract
Automated Program Repair (APR) seeks to automatically correct software bugs without requiring human intervention. However, existing tools tend to generate patches that satisfy test cases without fixing the underlying bug; these are known as overfitting patches. To address this issue, Automated Patch Correctness Assessment (APCA) attempts to identify overfitting patches generated by APR tools. APCA can be approached statically, meaning that no additional information is needed beyond the original and fixed code snippets. Current static techniques often struggle with reliability, flexibility and transparency. To address these issues, we introduce RePaCA, a novel static APCA technique that leverages Large Language Models (LLMs) specialized in thinking tasks. Our model is prompted with both buggy and fixed code snippets and guided to generate a Chain of Thought that analyses code…
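As a rough illustration of the static setup the abstract describes, where the only inputs are the buggy and fixed snippets, a prompt could be assembled along these lines. This is a hypothetical sketch: the function name `build_apca_prompt`, the template wording, and the verdict labels are assumptions for illustration and are not the prompt used by the authors.

```python
def build_apca_prompt(buggy_code: str, patched_code: str) -> str:
    """Assemble a static APCA prompt from the two code snippets.

    Hypothetical template: the wording and the CORRECT/OVERFITTING
    labels are illustrative assumptions, not the paper's prompt.
    """
    return (
        "You are given a buggy code snippet and a candidate patch.\n"
        "Reason step by step about what the change does, then decide "
        "whether the patch fixes the underlying bug or only overfits "
        "to the test cases.\n\n"
        "Buggy code:\n"
        f"{buggy_code.strip()}\n\n"
        "Patched code:\n"
        f"{patched_code.strip()}\n\n"
        "End your answer with one line: CORRECT or OVERFITTING."
    )


# Hypothetical example snippets, not taken from any benchmark.
buggy = "if (index <= items.size()) { return items.get(index); }"
patched = "if (index < items.size()) { return items.get(index); }"

prompt = build_apca_prompt(buggy, patched)
print(prompt)
```

No test results or execution traces enter the prompt, which is what makes the assessment static in the sense used above.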
