CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning

Hung Le; Yue Wang; Akhilesh Deepak Gotmare; Silvio Savarese; Steven C.H. Hoi

arXiv:2207.01780·cs.LG·November 4, 2022·87 cites

TL;DR

CodeRL integrates pretrained language models with deep reinforcement learning, using a critic network and feedback from unit tests to significantly improve program synthesis performance, especially on complex unseen tasks.

Contribution

The paper introduces CodeRL, a novel framework combining pretrained models and reinforcement learning with a critic network for improved code generation.

Findings

01. Achieves new SOTA on APPS benchmark.
02. Demonstrates strong zero-shot transfer on MBPP.
03. Enhances code generation with critic feedback and critical sampling.

Abstract

Program synthesis or code generation aims to generate a program that satisfies a problem specification. Recent approaches using large-scale pretrained language models (LMs) have shown promising results, yet they have some critical limitations. In particular, they often follow a standard supervised fine-tuning procedure to train a code generation model only from the pairs of natural-language problem descriptions and ground-truth programs. Such a paradigm largely ignores some important but potentially useful signals in the problem specification such as unit tests, which thus often results in poor performance when solving complex unseen coding tasks. To address these limitations, we propose "CodeRL", a new framework for program synthesis tasks through pretrained LMs and deep reinforcement learning (RL). Specifically, during training, we treat the code-generating LM as an actor network, and…
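
To make the idea of unit tests as a training signal concrete, the following is a minimal sketch of how a generated program could be scored against its tests. It is not the authors' implementation: the function name `unit_test_reward`, the `solve` entry point, the test format, and the four numeric reward levels are all illustrative assumptions that do not come from the text above.

```python
from typing import Any, Dict, List, Tuple

# Illustrative reward levels (assumed for this sketch, not stated in the text).
R_COMPILE_ERROR = -1.0   # program does not even compile
R_RUNTIME_ERROR = -0.6   # program compiles but raises while running
R_FAILED_TEST = -0.3     # program runs but gets at least one test wrong
R_ALL_PASSED = 1.0       # program passes every unit test

# A test case is (positional arguments, expected return value).
TestCase = Tuple[Tuple[Any, ...], Any]


def unit_test_reward(source: str, tests: List[TestCase],
                     entry_point: str = "solve") -> float:
    """Map the outcome of running `source` on `tests` to a scalar reward."""
    try:
        code = compile(source, "<candidate>", "exec")
    except SyntaxError:
        return R_COMPILE_ERROR

    namespace: Dict[str, Any] = {}
    try:
        # NOTE: exec on untrusted code is unsafe; a real setup would sandbox it.
        exec(code, namespace)
        func = namespace[entry_point]  # hypothetical entry-point convention
        outputs = [func(*args) for args, _ in tests]
    except Exception:
        return R_RUNTIME_ERROR

    expected = [want for _, want in tests]
    return R_ALL_PASSED if outputs == expected else R_FAILED_TEST


if __name__ == "__main__":
    tests = [((1, 2), 3), ((0, 0), 0)]
    candidate = "def solve(a, b):\n    return a + b\n"
    print(unit_test_reward(candidate, tests))
```

In an actor-critic setup of the kind the abstract describes, a scalar like this would be the feedback that the actor (the code-generating LM) is trained to increase; how the critic uses it is beyond what this sketch covers.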

Code & Models

Videos

CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning · SlidesLive

Taxonomy

Topics: Software Engineering Research · Software Testing and Debugging Techniques · Software Reliability and Analysis Research

Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Gated Linear Unit · Softmax · Multi-Head Attention · Residual Connection · SentencePiece · Attention Dropout · Dense Connections