Coffee: Boost Your Code LLMs by Fixing Bugs with Feedback

Seungjun Moon; Hyungjoo Chae; Yongho Song; Taeyoon Kwon; Dongjin Kang; Kai Tzu-iunn Ong; Seung-won Hwang; Jinyoung Yeo

arXiv:2311.07215 · cs.CL · February 26, 2024 · 2 cites

TL;DR

This paper introduces Coffee, a dataset for code fixing with feedback, and CoffeePots, a framework built on it that improves open-source code LLMs' ability to generate accurate feedback for bug fixing, achieving state-of-the-art results on code correction benchmarks.

Contribution

The work presents a new dataset (Coffee) and a tuning framework (CoffeePots) that improve the quality of the feedback open-source code LLMs generate for code editing.

Findings

01

CoffeePots outperforms previous models on the HumanEvalFix benchmark.

02

The framework reduces superficial and misleading feedback.

03

Open-source models can be improved for code fixing with proper tuning.

Abstract

Code editing is an essential step towards reliable program synthesis to automatically correct critical errors generated from code LLMs. Recent studies have demonstrated that closed-source LLMs (i.e., ChatGPT and GPT-4) are capable of generating corrective feedback to edit erroneous inputs. However, it remains challenging for open-source code LLMs to generate feedback for code editing, since these models tend to adhere to the superficial formats of feedback and provide feedback with misleading information. Hence, the focus of our work is to leverage open-source code LLMs to generate helpful feedback with correct guidance for code editing. To this end, we present Coffee, a collected dataset specifically designed for code fixing with feedback. Using this dataset, we construct CoffeePots, a framework for COde Fixing with FEEdback via Preference-Optimized Tuning and Selection. The proposed…

Code & Models

Datasets

LangAGI-Lab/COFFEE-Dataset
dataset · 80 dl

Taxonomy

Topics: Software Engineering Research · Machine Learning and Data Classification
