Generating refactored code accurately using reinforcement learning

Indranil Palit; Tushar Sharma

arXiv:2412.18035·cs.SE·December 25, 2024

TL;DR

This paper introduces a reinforcement learning approach to improve automated extract method refactoring in Java, enhancing code quality and correctness over traditional supervised methods.

Contribution

It presents a novel reinforcement learning framework that fine-tunes code language models for more accurate and semantically meaningful code refactoring tasks.

Findings

1. Reinforcement learning significantly improves code refactoring performance.

2. The approach increases successful functional tests from 41 to 66.

3. Models outperform supervised fine-tuning in BLEU and CodeBLEU metrics.

Abstract

Automated source code refactoring, particularly extract method refactoring, is a crucial and frequently employed technique during software development. Despite its importance and frequent use by practitioners, current automated techniques face significant limitations. These approaches often rely on developers to identify the precise bounds of refactoring opportunities in terms of source code statements. They also often fail to capture the semantic context and therefore offer no automated means of, for instance, suggesting a meaningful method name. To address these challenges, we propose a novel reinforcement learning-based approach for fine-tuning and aligning code language models to perform automated, intelligent extract method refactoring on Java source code. Our approach fine-tunes sequence-to-sequence generative models and aligns them using the Proximal Policy Optimization (PPO)…
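For readers unfamiliar with the task, the following is a minimal sketch of what an extract method refactoring looks like in Java. It is an illustration only and is not taken from the paper or its dataset: the class `ExtractMethodDemo`, the methods `invoiceTotalBefore`, `invoiceTotalAfter`, and `memberDiscount`, and the discount rule are all hypothetical. The two points it tries to show are the ones the abstract raises: someone has to decide which statements to extract, and someone has to pick a meaningful name for the new method.

```java
import java.util.List;

public class ExtractMethodDemo {

    // Before refactoring: summing and the (hypothetical) discount rule
    // live in one method. Amounts are in cents to avoid rounding issues.
    static long invoiceTotalBefore(List<Long> pricesInCents, boolean isMember) {
        long sum = 0;
        for (long price : pricesInCents) {
            sum += price;
        }
        // --- statements chosen as the extraction candidate start here ---
        long discount = 0;
        if (isMember && sum > 10_000) {
            discount = sum / 10; // assumed rule: 10% off above 100.00
        }
        // --- extraction candidate ends here ---
        return sum - discount;
    }

    // After refactoring: the candidate statements are moved into their
    // own method, and the original method calls it.
    static long invoiceTotalAfter(List<Long> pricesInCents, boolean isMember) {
        long sum = 0;
        for (long price : pricesInCents) {
            sum += price;
        }
        return sum - memberDiscount(sum, isMember);
    }

    // The extracted method. Its name is the "meaningful method name"
    // that a purely mechanical tool would not suggest on its own.
    static long memberDiscount(long sum, boolean isMember) {
        if (isMember && sum > 10_000) {
            return sum / 10;
        }
        return 0;
    }

    public static void main(String[] args) {
        List<Long> prices = List.of(6_000L, 5_000L);
        // Behavior must be preserved: both versions should agree.
        System.out.println(
            invoiceTotalBefore(prices, true) == invoiceTotalAfter(prices, true));
    }
}
```

A refactoring of this kind is only correct if the before and after versions behave identically, which is why a functional test comparison such as the one in `main` is a natural check on generated output.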

Taxonomy

Topics: Software Engineering Research · Software Testing and Debugging Techniques · Software Reliability and Analysis Research