RePaCA: Leveraging Reasoning Large Language Models for Static Automated Patch Correctness Assessment

Marcos Fuster-Pena; David de-Fitero-Dominguez; Antonio Garcia-Cabot; Eva Garcia-Lopez

arXiv:2507.22580·cs.SE·July 31, 2025

TL;DR

RePaCA introduces a novel static patch correctness assessment method leveraging reasoning large language models, significantly improving accuracy, generalization, and explainability in identifying overfitting patches in automated program repair.

Contribution

The paper presents RePaCA, a static APCA technique that uses fine-tuned reasoning LLMs with Chain of Thought prompting and reinforcement learning, achieving state-of-the-art performance.

Findings

01. Achieves 83.1% accuracy and an 84.8% F1-score on a Defects4J-derived test set.

02. Outperforms existing static APCA techniques in accuracy and generalization.

03. Provides enhanced explainability through reasoning-based assessment.
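
For readers unfamiliar with the two metrics reported in the findings, the following is a minimal sketch of how accuracy and F1-score are typically computed for a binary patch classifier. The function name, the label strings, and the choice of "overfitting" as the positive class are assumptions made for illustration; this summary does not state which class the paper treats as positive.

```python
def accuracy_and_f1(y_true, y_pred, positive="overfitting"):
    """Return (accuracy, F1) for binary patch labels.

    Assumption: `positive` names the class F1 is computed for; the
    summary above does not say which class the paper uses.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs)
    # F1 is the harmonic mean of precision and recall: 2TP / (2TP + FP + FN).
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1
```

Because F1 ignores true negatives, it can differ noticeably from accuracy when the two patch classes are imbalanced, which is one reason both numbers are usually reported together.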

Abstract

Automated Program Repair (APR) seeks to automatically correct software bugs without requiring human intervention. However, existing tools tend to generate patches that satisfy the test cases without fixing the underlying bug; these are known as overfitting patches. To address this issue, Automated Patch Correctness Assessment (APCA) attempts to identify overfitting patches generated by APR tools. APCA can be approached statically, meaning that no additional information is needed beyond the original and fixed code snippets. Current static techniques often struggle with reliability, flexibility, and transparency. To address these issues, we introduce RePaCA, a novel static APCA technique that leverages Large Language Models (LLMs) specialized in thinking tasks. Our model is prompted with both buggy and fixed code snippets and guided to generate a Chain of Thought that analyses code…
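
The static setup described in the abstract, where the model sees only the buggy and fixed snippets and reasons before answering, can be sketched roughly as follows. This is an illustrative sketch only: the prompt wording, the `Verdict:` output convention, and the helper names `build_prompt` and `parse_verdict` are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical prompt wording; the paper's actual prompt is not shown here.
PROMPT_TEMPLATE = (
    "You are assessing a patch produced by an automated program repair tool.\n"
    "Buggy code:\n{buggy}\n\n"
    "Fixed code:\n{fixed}\n\n"
    "Reason step by step about what changed and whether the change fixes\n"
    "the underlying bug. End with a line 'Verdict: correct' or\n"
    "'Verdict: overfitting'."
)


def build_prompt(buggy_code: str, fixed_code: str) -> str:
    """Build a static APCA prompt from the two snippets only (no tests, no traces)."""
    return PROMPT_TEMPLATE.format(buggy=buggy_code.strip(), fixed=fixed_code.strip())


def parse_verdict(model_output: str):
    """Extract the final verdict from a reasoning trace, or None if absent.

    The last match is used so that a label mentioned mid-reasoning does
    not override the model's final answer.
    """
    matches = re.findall(
        r"verdict:\s*(correct|overfitting)", model_output, flags=re.IGNORECASE
    )
    return matches[-1].lower() if matches else None
```

Keeping the reasoning text alongside the parsed verdict is what would make such an assessment inspectable by a developer, in line with the explainability claim above.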
