Repairing vulnerabilities without invisible hands. A differentiated replication study on LLMs
Maria Camporese, Fabio Massacci

TL;DR
This study investigates whether large language models' success in automated vulnerability repair stems from genuine understanding or from hidden factors such as training-data leakage and perfect fault localization. It does so by deliberately misplacing the fault location given in the prompt and analyzing how the models respond.
Contribution
It provides a controlled replication of AVR studies, revealing the extent to which LLMs rely on memorization versus actual fault localization for fixing vulnerabilities.
Findings
LLMs often produce correct patches even with shifted fault locations.
The error rate in patches increases with larger localization shifts.
Results suggest that LLMs may rely on memorized fixes rather than true fault understanding.
Abstract
Background: Automated Vulnerability Repair (AVR) is a fast-growing branch of program repair. Recent studies show that large language models (LLMs) outperform traditional techniques, extending their success beyond code generation and fault detection.
Hypothesis: These gains may be driven by hidden factors -- "invisible hands" such as training-data leakage or perfect fault localization -- that let an LLM reproduce human-authored fixes for the same code.
Objective: We replicate prior AVR studies under controlled conditions by deliberately adding errors to the reported vulnerability location in the prompt. If LLMs merely regurgitate memorized fixes, both small and large localization errors should yield the same number of correct patches, because any offset should divert the model from the original fix.
Method: Our pipeline repairs vulnerabilities from the Vul4J and VJTrans benchmarks…
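The localization perturbation described in the Objective can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the `<vuln>` markers, and the policy of clamping a shifted range to the file boundaries are all assumptions made here for illustration.

```python
def shift_location(start, end, offset, n_lines):
    """Shift a 1-indexed, inclusive line range by `offset` lines.

    Assumption: the shifted range keeps its length and is clamped so it
    stays inside a file of `n_lines` lines.
    """
    length = end - start + 1
    if length < 1 or length > n_lines:
        raise ValueError("range must be non-empty and fit in the file")
    new_start = min(max(1, start + offset), n_lines - length + 1)
    return new_start, new_start + length - 1


def mark_lines(source, start, end, open_tag="<vuln>", close_tag="</vuln>"):
    """Wrap lines start..end (1-indexed, inclusive) in marker lines.

    The marker strings are hypothetical placeholders, not a format
    taken from the study.
    """
    lines = source.splitlines()
    marked = (
        lines[: start - 1]
        + [open_tag]
        + lines[start - 1 : end]
        + [close_tag]
        + lines[end:]
    )
    return "\n".join(marked)


def perturbed_prompt(source, start, end, offset):
    """Build prompt text whose reported fault location is off by `offset`."""
    n_lines = len(source.splitlines())
    new_start, new_end = shift_location(start, end, offset, n_lines)
    return mark_lines(source, new_start, new_end)
```

Calling `perturbed_prompt` with an offset of 0 reproduces the perfect-localization setting, while increasing absolute offsets give the small and large localization errors that the study compares.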
