SLMFix: Leveraging Small Language Models for Error Fixing with Reinforcement Learning
David Jiahao Fu, Aryan Gupta, Aaron Councilman, David Grove, Yu-Xiong Wang, Vikram Adve

TL;DR
SLMFix introduces a reinforcement learning-based pipeline in which a small language model fixes syntactic errors in code generated by large models, especially for low-resource languages, improving code quality without finetuning the large model itself.
Contribution
The paper presents a novel approach using RL to finetune small language models for program repair, outperforming supervised finetuning for low-resource programming languages.
Findings
Achieves over 95% pass rate on static validators.
Outperforms supervised finetuning on 7B models.
Demonstrates effectiveness across multiple domain-specific languages.
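To make the pass-rate metric concrete, here is a minimal sketch of how a pass rate against a static validator could be computed. The validator below (`balanced_parens`) is a toy stand-in that only checks parenthesis balance, and the sample programs are invented for illustration; neither the validators nor the data used in the paper are shown here.

```python
from typing import Callable, Iterable


def pass_rate(programs: Iterable[str], validator: Callable[[str], bool]) -> float:
    """Return the fraction of programs accepted by a static validator."""
    programs = list(programs)
    if not programs:
        return 0.0
    return sum(1 for p in programs if validator(p)) / len(programs)


def balanced_parens(program: str) -> bool:
    """Toy validator (assumption): accept only balanced parentheses."""
    depth = 0
    for ch in program:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # closing paren with no matching opener
                return False
    return depth == 0


# Invented sample programs in a made-up S-expression DSL.
samples = ["(add 1 2)", "(mul (add 1 2) 3", "(neg 4))", "(id 5)"]
print(pass_rate(samples, balanced_parens))  # 0.5
```

A real static validator for a DSL would typically be a parser or type checker rather than a bracket counter, but the metric is computed the same way.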
Abstract
Recent advancements in large language models (LLMs) have shown impressive capabilities in code generation across many programming languages. However, even state-of-the-art LLMs generate programs that contain syntactic errors and fail to complete the given tasks, especially for low-resource programming languages (LRPLs). In addition, high training costs make finetuning LLMs unaffordable under constrained computational resources, further undermining the effectiveness of LLMs for code generation. In this work, we propose SLMFix, a novel code generation pipeline that leverages a small language model (SLM) finetuned using reinforcement learning (RL) techniques to fix syntactic errors in LLM-generated programs, improving their quality for domain-specific languages (DSLs). Specifically, we apply RL to the SLM for the program repair task using a reward calculated…
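The generate-then-repair flow described in the abstract can be sketched roughly as follows. This is an illustrative outline under several assumptions, not the authors' implementation: `is_valid` is a toy parenthesis-balance check standing in for a static validator, `toy_repair` is a rule-based placeholder for the RL-finetuned SLM, and the choice to repair only drafts that fail validation (with a `max_attempts` cap) is hypothetical. The reward used for RL training is not modeled.

```python
from typing import Callable


def is_valid(program: str) -> bool:
    """Toy static validator (assumption): parentheses must balance."""
    depth = 0
    for ch in program:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def toy_repair(program: str) -> str:
    """Placeholder for the SLM repair step (assumption, not a model call).

    Drops unmatched ')' and appends any missing ')' at the end.
    """
    kept = []
    depth = 0
    for ch in program:
        if ch == ")":
            if depth == 0:
                continue  # skip a closer that has no opener
            depth -= 1
        elif ch == "(":
            depth += 1
        kept.append(ch)
    return "".join(kept) + ")" * depth


def generate_then_fix(
    draft: str,
    validate: Callable[[str], bool],
    repair: Callable[[str], str],
    max_attempts: int = 1,
) -> str:
    """Pass an LLM draft through a repair step until it validates."""
    program = draft
    for _ in range(max_attempts):
        if validate(program):
            break
        program = repair(program)
    return program


# Drafts here are hand-written stand-ins for LLM output.
print(generate_then_fix("(mul (add 1 2) 3", is_valid, toy_repair))
print(generate_then_fix("(neg 4))", is_valid, toy_repair))
```

Keeping the validator and the repair step as injected callables means the large generator model is never touched, which mirrors the abstract's motivation of avoiding LLM finetuning.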
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Software Testing and Debugging Techniques
