Evaluating How Fine-tuning on Bimodal Data Effects Code Generation

Gabriel Orlanski; Seonhye Yang; Michael Healy

arXiv:2211.07842·cs.LG·November 16, 2022

TL;DR

Fine-tuning language models on bimodal data from coding forums improves pass@k on code generation benchmarks and reduces syntax and runtime errors, but at higher sampling temperatures the fine-tuned models generate fewer runnable programs, highlighting the need for better methods of incorporating such data.

Contribution

This paper introduces a bimodal dataset of over 2.2M StackOverflow questions with answers for fine-tuning, and shows that models fine-tuned on it achieve higher pass@k and produce fewer programs with syntax and runtime errors.

Findings

01

54.64% average pass@k improvement on HumanEval

02

85.35% average pass@k improvement on MBPP (Mostly Basic Program Problems)

03

Higher temperatures decrease program runnability
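
As background for the third finding, sampling temperature rescales a model's next-token logits before the softmax: values above 1 flatten the distribution, so unlikely tokens are sampled more often. The sketch below shows this standard mechanism only. It is not taken from the paper's code, and the function name and example logits are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing by a temperature.

    Illustrative helper (hypothetical name), not the authors' decoding code.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for three candidate tokens.
example_logits = [2.0, 1.0, 0.1]
low_t = softmax_with_temperature(example_logits, 0.2)
high_t = softmax_with_temperature(example_logits, 1.5)
```

With these inputs, `low_t` concentrates almost all probability on the first token, while `high_t` spreads it more evenly. More diverse samples can raise the chance that at least one of k samples passes while also producing more programs that fail to run.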

Abstract

Despite the increase in popularity of language models for code generation, it is still unknown how training on bimodal coding forums affects a model's code generation performance and reliability. We, therefore, collect a dataset of over 2.2M StackOverflow questions with answers for fine-tuning. These fine-tuned models have average pass@k improvements of 54.64% and 85.35% on the HumanEval (Chen et al., 2021) and Mostly Basic Program Problems (Austin et al., 2021) tasks, respectively. This regime further decreases the number of generated programs with both syntax and runtime errors. However, we find that at higher temperatures, there are significant decreases to the model's ability to generate runnable programs despite higher pass@k scores, underscoring the need for better methods of incorporating such data that mitigate these side effects. The code can be found…
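
For readers unfamiliar with the metric, pass@k is commonly computed with the unbiased estimator from Chen et al. (2021): given n generated samples for a problem, of which c pass the unit tests, pass@k = 1 - C(n-c, k) / C(n, k). The sketch below implements that formula. Whether the authors' evaluation code matches it line for line is an assumption, and the function name is illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of samples generated for the problem
    c: number of those samples that pass all tests
    k: sample budget being evaluated
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: any draw of k includes a pass.
        return 1.0
    # Probability that k samples drawn without replacement all fail,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

The benchmark score is then the mean of this value over all problems. Because the estimator only asks whether at least one of k samples passes, it can rise even when a larger share of the individual samples fail to run.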

Code & Models

Repositories

gabeorlanski/bimodal-code-generation
Official

Taxonomy

Topics: Software Engineering Research · Topic Modeling · Software Engineering Techniques and Practices