Effective Large Language Model Debugging with Best-first Tree Search

Jialin Song; Jonathan Raiman; Bryan Catanzaro

arXiv:2407.19055·cs.SE·July 30, 2024

TL;DR

This paper introduces BESTER, a best-first tree search algorithm with self-reflection that significantly improves large language models' ability to debug code, achieving state-of-the-art results and providing insights into the debugging process.

Contribution

The paper presents a debugging algorithm for LLMs that combines best-first tree search with self-reflection, advancing the state of the art in Pass@1 on code generation benchmarks.

Findings

01 BESTER achieves state-of-the-art Pass@1 on three code generation benchmarks.

02 Self-reflections improve bug detection and fixing.

03 The effectiveness of self-reflections varies with bug complexity.

Abstract

Large Language Models (LLMs) show promise in code generation tasks. However, their code-writing abilities are often limited in scope: while they can successfully implement simple functions, they struggle with more complex tasks. A fundamental difference between how an LLM writes code and how a human programmer does is that the LLM cannot consistently spot and fix bugs. Debugging is a crucial skill for programmers, and it enables iterative code refinement towards a correct implementation. In this work, we propose a novel algorithm to enable LLMs to debug their code via self-reflection and search, where a model attempts to identify its previous mistakes. Our key contributions are 1) a best-first tree search algorithm with self-reflections (BESTER) that achieves state-of-the-art Pass@1 in three code generation benchmarks. BESTER maintains its superiority when we measure pass rates taking into…
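The abstract as shown here is truncated and does not spell out how BESTER works, so the following is only a generic sketch of best-first search over candidate repairs, not the authors' implementation. It assumes that candidates are ranked by the fraction of unit tests they pass, and that the model is reached through two hypothetical callables, `reflect` (describe what is wrong) and `repair` (propose a fix given that reflection). The names `score`, `best_first_debug`, `toy_reflect`, `toy_repair`, `branching`, and `budget` are all illustrative; the toy stubs stand in for LLM calls.

```python
import heapq
from itertools import count
from typing import Callable, List, Optional, Tuple

Test = Tuple[int, int]  # (input, expected output) for a one-argument function


def score(program: str, tests: List[Test]) -> float:
    """Fraction of tests passed; any exception counts as a failure."""
    namespace: dict = {}
    try:
        exec(program, namespace)  # assumption: the candidate defines `solve`
    except Exception:
        return 0.0
    solve = namespace.get("solve")
    if not callable(solve) or not tests:
        return 0.0
    passed = 0
    for arg, expected in tests:
        try:
            if solve(arg) == expected:
                passed += 1
        except Exception:
            pass
    return passed / len(tests)


def best_first_debug(
    initial: str,
    tests: List[Test],
    reflect: Callable[[str, List[Test]], str],
    repair: Callable[[str, str, int], str],
    branching: int = 2,
    budget: int = 10,
) -> Optional[str]:
    """Always expand the highest-scoring candidate seen so far."""
    tie = count()  # tie-breaker so the heap never compares program strings
    frontier = [(-score(initial, tests), next(tie), initial)]
    evaluated = 1
    while frontier:
        neg_score, _, program = heapq.heappop(frontier)
        if -neg_score == 1.0:
            return program  # all tests pass
        if evaluated >= budget:
            return None  # best remaining candidate still fails; give up
        reflection = reflect(program, tests)
        for attempt in range(branching):
            child = repair(program, reflection, attempt)
            heapq.heappush(frontier, (-score(child, tests), next(tie), child))
            evaluated += 1
    return None


# Toy stand-ins for the LLM calls (hypothetical, deterministic).
def toy_reflect(program: str, tests: List[Test]) -> str:
    return "the result is off by a constant"


def toy_repair(program: str, reflection: str, attempt: int) -> str:
    replacement = [" - 1", ""][attempt % 2]
    return program.replace(" + 1", replacement)


TESTS = [(1, 2), (2, 4), (3, 6)]
BUGGY = "def solve(x):\n    return x + x + 1\n"
fixed = best_first_debug(BUGGY, TESTS, toy_reflect, toy_repair)
```

In this toy run the buggy program passes no tests, its two proposed repairs are scored, and the one that passes every test is popped from the heap first. A priority queue is what makes the search best-first: a promising partial fix is expanded before weaker siblings, regardless of its depth in the tree.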

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling