On-the-fly Improving Performance of Deep Code Models via Input Denoising

Zhao Tian; Junjie Chen; Xiangyu Zhang

arXiv:2308.09969·cs.SE·August 22, 2023

TL;DR

This paper introduces CodeDenoise, a novel on-the-fly input denoising technique for deep code models that localizes and cleans noisy identifiers, significantly improving accuracy without retraining.

Contribution

It presents the first input denoising method for deep code models that enhances performance on deployed models without retraining or fine-tuning.

Findings

01. CodeDenoise denoises 21.91% of mispredicted inputs on average.

02. It improves model accuracy by 2.04% across multiple datasets.

03. The technique operates efficiently, averaging 0.48 seconds per input.

Abstract

Deep learning has been widely adopted to tackle various code-based tasks by building deep code models from large numbers of code snippets. Although these deep code models have achieved great success, even state-of-the-art models suffer from noise in their inputs, which leads to erroneous predictions. Models can be enhanced through retraining or fine-tuning, but this is not a once-and-for-all approach and incurs significant overhead. In particular, these techniques cannot improve the performance of (deployed) models on the fly. Input denoising techniques exist in other domains (such as image processing), but because code input is discrete and must strictly abide by complex syntactic and semantic constraints, those techniques are largely inapplicable. In this work, we propose the first input denoising technique (i.e., CodeDenoise) for…
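The constraint the abstract points to, that code input is discrete and must remain syntactically and semantically valid, can be illustrated with a small sketch. This is not the CodeDenoise algorithm: the text here does not say how noisy identifiers are localized or how replacements are chosen. The function name `rename_identifier` and the use of Python as the target language are assumptions made purely for illustration. The sketch only shows why an identifier-level edit should operate on tokens rather than raw text, so that every occurrence is renamed consistently while strings and comments are left alone.

```python
import io
import keyword
import tokenize


def rename_identifier(source: str, old: str, new: str) -> str:
    """Hypothetical helper: rename every NAME token equal to `old` to `new`.

    Works on tokens, so occurrences inside string literals and comments
    are not touched. Toy limitation: it ignores scoping, so attributes
    or unrelated variables that share the name are renamed as well.
    """
    for name in (old, new):
        if not name.isidentifier() or keyword.iskeyword(name):
            raise ValueError(f"{name!r} is not a valid identifier")

    lines = source.splitlines(keepends=True)
    # Collect the identifier tokens that match the name to replace.
    hits = [
        tok
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type == tokenize.NAME and tok.string == old
    ]
    # Patch from the last occurrence to the first so that earlier
    # column offsets on the same line remain valid.
    for tok in reversed(hits):
        row, col = tok.start  # rows are 1-based, columns are 0-based
        line = lines[row - 1]
        lines[row - 1] = line[:col] + new + line[col + len(old):]
    return "".join(lines)


if __name__ == "__main__":
    snippet = (
        "def f(tmp):\n"
        "    s = 'tmp'  # tmp stays here\n"
        "    return tmp + 1\n"
    )
    print(rename_identifier(snippet, "tmp", "count"))
```

A plain `str.replace` would also rewrite the string literal and the comment, and could corrupt longer names that merely contain `tmp`; the token-based version avoids both, which is the kind of care any identifier-level denoising step needs.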

Code & Models

Repositories

tianzhaotju/codedenoise (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Software Engineering Research · Anomaly Detection Techniques and Applications