Improving the Robustness of Large Language Models for Code Tasks via Fine-tuning with Perturbed Data
Yang Liu, Armstrong Foundjem, Xingfang Wu, Heng Li, Foutse Khomh

TL;DR
This paper demonstrates that fine-tuning large language models with perturbed datasets enhances their robustness against input variations in code tasks, though it may slightly reduce overall performance.
Contribution
It introduces a systematic approach to improve LLM robustness for coding tasks by using character, word, and sentence-level perturbations during fine-tuning.
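As a rough illustration of what perturbing a prompt at the three levels could look like, here is a minimal sketch. The summary does not specify the paper's actual perturbation operators, so the function names (`perturb_chars`, `perturb_words`, `perturb_sentence`), the `rate` and `seed` parameters, and the filler sentence are all hypothetical choices made for this example.

```python
import random

# Hypothetical perturbation helpers; NOT the paper's operators, which the
# summary does not describe. Each takes a natural-language prompt string.

def perturb_chars(text, rate=0.1, seed=0):
    """Character level: swap adjacent letters with probability `rate`."""
    rng = random.Random(seed)
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        both_letters = chars[i].isalpha() and chars[i + 1].isalpha()
        if both_letters and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the swapped pair so a letter moves at most once
        else:
            i += 1
    return "".join(chars)

def perturb_words(text, rate=0.1, seed=0):
    """Word level: drop each whitespace-separated word with probability `rate`."""
    rng = random.Random(seed)
    words = text.split()
    kept = [w for w in words if rng.random() >= rate]
    # Fall back to the original text if every word was dropped.
    return " ".join(kept) if kept else text

def perturb_sentence(text, filler="Please keep the solution simple."):
    """Sentence level: append a semantically neutral sentence (assumed filler)."""
    return text.rstrip() + " " + filler

def make_perturbed_variants(prompt, seed=0):
    """Return one variant per perturbation level for a single prompt."""
    return {
        "char": perturb_chars(prompt, seed=seed),
        "word": perturb_words(prompt, seed=seed),
        "sentence": perturb_sentence(prompt),
    }
```

In a fine-tuning setup along these lines, each training prompt would be paired with its unchanged reference solution, so the model sees noisy inputs mapped to correct outputs.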
Findings
Robustness improves by 4-6% with perturbed fine-tuning.
Performance (pass@1) decreases by 1-3% after robustness-focused fine-tuning.
Perturbed fine-tuning is especially effective for less robust models.
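The findings are stated in terms of pass@1. The summary does not say how the paper computes it; one widely used convention is the unbiased pass@k estimator, sketched below under that assumption. The function name `pass_at_k` and its arguments (`n` samples generated per problem, `c` of them passing the tests) are illustrative.

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k estimate for one problem.

    n: number of generated samples, c: samples that pass all tests,
    k: evaluation budget. Assumed convention, not confirmed by the source.
    """
    if n - c < k:
        # Every size-k subset must contain at least one passing sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For k = 1 this reduces to c / n, the fraction of samples that pass, which is then averaged over all problems in the benchmark.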
Abstract
Context: In the fast-paced evolution of software development, Large Language Models (LLMs) have become indispensable tools for tasks such as code generation, completion, analysis, and bug fixing. Ensuring the robustness of these models against potential vulnerabilities from handling diverse inputs is critical, as variations in input can lead to incorrect or insecure code outputs.
Objective: This work aims to improve the robustness of LLMs for coding-related tasks against potential adversarial inputs. Specifically, we investigate how fine-tuning LLMs with perturbed datasets impacts their robustness against input perturbations.
Method: We systematically evaluated LLM robustness by fine-tuning models using datasets perturbed at character-level, word-level, and sentence-level, comparing results against base models and models fine-tuned on unperturbed datasets.
Results: Fine-tuning…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Software Engineering Research · Topic Modeling
