CodeImprove: Program Adaptation for Deep Code Models

Ravishka Rathnasuriya; Zijie Zhao; Wei Yang

arXiv:2501.15804 · cs.SE · June 18, 2025 · 2 cites

TL;DR

CodeImprove adapts out-of-scope code inputs to deep code models using a validation metric and a genetic algorithm, improving model accuracy by up to 8.78% and detecting out-of-scope inputs with an AUC of 0.924.

Contribution

This paper presents a new method for program input adaptation that enhances deep code model performance without frequent retraining, using program transformation and validation techniques.

Findings

1. Up to 8.78% accuracy improvement in code models.
2. 51.28% relative improvement in handling out-of-scope inputs.
3. High effectiveness in detecting out-of-scope inputs, with an AUC of 0.924.

Abstract

Leveraging deep learning (DL)-based code analysis tools to solve software engineering tasks is becoming increasingly popular. Code models often suffer performance degradation for various reasons (e.g., code data shifts). Retraining is often required to address these issues, but frequent model updates are costly in labeling and deployment. In this paper, we explore an alternative solution: adapting the program inputs to the code models. This can be achieved in two steps: 1) input validation, which identifies whether an input program is out of scope, i.e., beyond a model's handling capability, and 2) input adaptation, which adapts out-of-scope inputs to become in-scope inputs. Validating program input is challenging, as current techniques focus on continuous inputs such as image data and fail with discrete inputs like code data, which have unique characteristics and…
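The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the validity score, the threshold, and the two transformations below are hypothetical stand-ins, and the search is a small deterministic beam search over transformations rather than the genetic algorithm the paper uses. It only shows the overall shape: score an input, and if it looks out of scope, search over semantics-preserving rewrites for a variant that scores as in scope.

```python
import re
from typing import Callable, List

Transform = Callable[[str], str]  # assumed to preserve program semantics
Scorer = Callable[[str], float]   # assumed: higher means more likely in scope


def rename_tmp(program: str) -> str:
    """Toy transformation: rename the identifier `tmp` to `value`."""
    return re.sub(r"\btmp\b", "value", program)


def expand_increment(program: str) -> str:
    """Toy transformation: rewrite `x += 1` as `x = x + 1`."""
    return re.sub(r"\b(\w+) \+= 1\b", r"\1 = \1 + 1", program)


def toy_validity_score(program: str) -> float:
    """Hypothetical stand-in for a model-derived validity score.

    It penalizes two surface patterns that an imaginary model handles poorly.
    """
    penalty = 0.0
    if re.search(r"\btmp\b", program):
        penalty += 0.4
    if "+= 1" in program:
        penalty += 0.4
    return 1.0 - penalty


def is_out_of_scope(program: str, score: Scorer, threshold: float) -> bool:
    """Step 1, input validation: flag inputs that score below the threshold."""
    return score(program) < threshold


def adapt(program: str, transforms: List[Transform], score: Scorer,
          threshold: float, rounds: int = 5, beam_width: int = 4) -> str:
    """Step 2, input adaptation: search for an in-scope variant.

    Each round applies every transformation to every kept candidate and
    retains the `beam_width` highest-scoring programs.
    """
    if not is_out_of_scope(program, score, threshold):
        return program  # already in scope, leave untouched
    population = [program]
    for _ in range(rounds):
        candidates = list(population)
        for parent in population:
            for transform in transforms:
                candidates.append(transform(parent))
        unique = list(dict.fromkeys(candidates))  # drop duplicates, keep order
        unique.sort(key=score, reverse=True)
        population = unique[:beam_width]
        if not is_out_of_scope(population[0], score, threshold):
            break
    return population[0]


SAMPLE = "tmp = 0\ntmp += 1\nprint(tmp)"
TRANSFORMS = [rename_tmp, expand_increment]
adapted = adapt(SAMPLE, TRANSFORMS, toy_validity_score, threshold=0.9)
```

In this toy setting, `SAMPLE` scores below the assumed threshold of 0.9, and the search needs both rewrites before the score clears it. A real system would replace `toy_validity_score` with a signal derived from the code model and would have to guarantee that every transformation preserves program behavior.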

Taxonomy

Topics: Scientific Computing and Data Management