HELO-APR: Enhancing Low-Resource Program Repair through Cross-Lingual Knowledge Transfer
Zhipeng Wang, Boyang Yang, Yidong Wan, Liuye Guo, You Lv, Tao Zheng, Zhuowei Wang, Tieke He

TL;DR
HELO-APR is a two-stage framework that enhances low-resource program repair by transferring knowledge from high-resource languages using synthesized data and curriculum learning, significantly improving repair effectiveness.
Contribution
The paper introduces HELO-APR, a cross-lingual transfer method that synthesizes training data and applies curriculum learning to improve program repair in low-resource languages.
Findings
HELO-APR increases Pass@1 from 31.32% to 48.65% on DeepSeek-Coder-6.7B.
It raises the average target compilation rate on CodeLlama from 49.77% to 91.98%.
It improves BLEU-4 and ROUGE-1 scores on Defects4Ruby, indicating that generated patches are closer to the reference fixes.
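For readers unfamiliar with the Pass@1 metric cited above, the sketch below shows one common way such a number is computed: the unbiased pass@k estimator widely used in code-generation evaluation. This summary does not say how HELO-APR samples patches or which estimator it uses, so the function names and the `(n, c)` input format are assumptions made for illustration only.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one bug.

    n: number of candidate patches sampled for the bug
    c: number of those patches that pass all tests
    k: evaluation budget (k=1 gives Pass@1)
    """
    if n - c < k:
        # Every size-k subset of the samples contains at least one passing patch.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k=1):
    """Average pass@k over a benchmark; results is a list of (n, c) pairs (assumed format)."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With a single sample per bug (`n = 1`), Pass@1 reduces to the fraction of bugs whose one generated patch passes its tests.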
Abstract
Large Language Models (LLMs) perform well on automatic program repair (APR) for high-resource programming languages (HRPLs), but their effectiveness drops sharply in low-resource programming languages (LRPLs), due to a lack of sufficient verified buggy-fixed pairs for APR training. To address this challenge, we propose HELO-APR (High-resource Enabled LOw-resource APR), a two-stage APR framework that enables cross-lingual transfer of repair knowledge from HRPLs to LRPLs. HELO-APR (1) constructs high-quality LRPL training data by synthesizing LRPL buggy-fixed pairs from HRPL counterparts, preserving defect type consistency while ensuring the synthesized code is idiomatic, and then (2) adopts a curriculum learning strategy that progressively performs HRPL repair learning, cross-lingual repair alignment, and LRPL repair adaptation, improving repair effectiveness in LRPLs. Using C++ as the…
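The curriculum ordering described in the abstract can be sketched as a simple staged data stream. This is a minimal illustration, not the authors' implementation: the stage keys, the `curriculum_stream` helper, and the `(buggy, fixed)` pair format are hypothetical, and the abstract does not specify what each stage's data looks like or how long each stage runs.

```python
from typing import Dict, Iterator, List, Tuple

# (buggy_code, fixed_code); the pair format is an assumption for illustration.
Pair = Tuple[str, str]

# Hypothetical stage keys mirroring the three phases named in the abstract.
STAGE_ORDER = [
    "hrpl_repair_learning",
    "cross_lingual_repair_alignment",
    "lrpl_repair_adaptation",
]


def curriculum_stream(
    stage_data: Dict[str, List[Pair]],
    epochs_per_stage: int = 1,
) -> Iterator[Tuple[str, Pair]]:
    """Yield training pairs stage by stage, finishing one stage before the next begins."""
    for stage in STAGE_ORDER:
        for _ in range(epochs_per_stage):
            for pair in stage_data[stage]:
                yield stage, pair
```

A fine-tuning loop would consume this stream in order, so the model sees high-resource repair examples first and low-resource examples last.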
