$\mathcal{B}$-Coder: Value-Based Deep Reinforcement Learning for Program Synthesis
Zishun Yu, Yunzhe Tao, Liyu Chen, Tao Sun, Hongxia Yang

TL;DR
This paper introduces $\mathcal{B}$-Coder, a value-based deep reinforcement learning approach for program synthesis that leverages pre-trained language models and a conservative Bellman operator to achieve state-of-the-art results with minimal reward engineering.
Contribution
The work pioneers the application of value-based reinforcement learning to program synthesis, in contrast to the dominant policy-based methods, and introduces techniques to mitigate training challenges.
Findings
Achieves state-of-the-art performance in program synthesis.
Requires minimal reward engineering.
Demonstrates the effectiveness of value-based RL over policy-based methods.
Abstract
Program synthesis aims to create accurate, executable programs from problem specifications, specifically from natural language descriptions in our context. Recent studies have leveraged the power of reinforcement learning (RL) in conjunction with large language models (LLMs), significantly enhancing code generation capabilities. The application of RL focuses on directly optimizing for functional correctness, offering an advantage over conventional supervised methods. Despite policy-based RL methods dominating the literature on RL for program synthesis, the nature of program synthesis tasks hints at a natural alignment with value-based methods. This stems from the rich collection of off-policy programs, including those developed by human programmers and also historical samples, coupled with the straightforward verification of generated programs through automated unit testing, meaning…
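To make the unit-test verification mentioned in the abstract concrete, here is a minimal sketch of a binary functional-correctness reward. It is an illustration only, not the authors' implementation: the function name `unit_test_reward`, the `entry_point` parameter, and the pass/fail reward values are all assumptions.

```python
from typing import Sequence, Tuple


def unit_test_reward(
    program_src: str,
    entry_point: str,
    tests: Sequence[Tuple[tuple, object]],
) -> float:
    """Hypothetical reward: 1.0 if the program passes every test, else 0.0.

    `tests` is a sequence of (args, expected_output) pairs.
    """
    namespace: dict = {}
    try:
        # WARNING: exec runs arbitrary code; a real system would sandbox this.
        exec(program_src, namespace)
        func = namespace[entry_point]
        for args, expected in tests:
            if func(*args) != expected:
                return 0.0
    except Exception:
        # Syntax errors, missing entry point, and runtime errors all count as failure.
        return 0.0
    return 1.0


# Example usage with two candidate programs for the same toy task.
good = "def add(a, b):\n    return a + b\n"
bad = "def add(a, b):\n    return a - b\n"
tests = [((1, 2), 3), ((0, 0), 0)]
reward_good = unit_test_reward(good, "add", tests)
reward_bad = unit_test_reward(bad, "add", tests)
```

Because such a reward depends only on the program and its tests, it can in principle be computed for any stored program, whether written by a human or sampled from an earlier model, which is what makes off-policy data usable in this setting.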
Taxonomy
Topics: Evolutionary Algorithms and Applications · Embedded Systems Design Techniques · VLSI and FPGA Design Techniques
