Towards a Neural Debugger for Python
Maximilian Beck, Jonas Gehring, Jannik Kossen, Gabriel Synnaeve

TL;DR
This paper introduces neural debuggers: large language models that emulate traditional debugging operations, giving users interactive control over modeled execution and supporting program understanding and debugging.
Contribution
The paper presents the design and evaluation of neural debuggers that support operations such as stepping into, over, or out of functions and setting breakpoints at specific source lines, a step towards interactive neural program interpreters.
Findings
Neural debuggers can reliably model forward and inverse execution.
Models achieve strong performance on output and input prediction tasks.
Neural debuggers enable conditional execution modeling.
Abstract
Training large language models (LLMs) on Python execution traces grounds them in code execution and enables the line-by-line execution prediction of whole Python programs, effectively turning them into neural interpreters (FAIR CodeGen Team et al., 2025). However, developers rarely execute programs step by step; instead, they use debuggers to stop execution at certain breakpoints and step through relevant portions only while inspecting or modifying program variables. Existing neural interpreter approaches lack such interactive control. To address this limitation, we introduce neural debuggers: language models that emulate traditional debuggers, supporting operations such as stepping into, over, or out of functions, as well as setting breakpoints at specific source lines. We show that neural debuggers -- obtained via fine-tuning large LLMs or pre-training smaller models from scratch --…
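As a rough illustration of the kind of data the abstract refers to, the sketch below uses Python's standard `sys.settrace` hook to record a line-by-line execution trace and to contrast stepping into a call with stepping over it. The helper name `record_trace`, the `step_into` flag, and the tuple layout of each trace entry are assumptions made for this example; they are not the paper's trace format or training pipeline.

```python
import sys


def record_trace(func, *args, step_into=True):
    """Run func(*args) and record one entry per executed source line.

    Each entry is (function name, line offset within that function,
    snapshot of local variables before the line runs).
    With step_into=False, lines inside nested calls are skipped,
    which loosely mimics a debugger's "step over".
    """
    events = []
    target_code = func.__code__

    def tracer(frame, event, arg):
        if event == "call":
            # Trace the new frame only if it is the target function
            # itself or if we are stepping into nested calls.
            if frame.f_code is target_code or step_into:
                return tracer
            return None
        if event == "line":
            offset = frame.f_lineno - frame.f_code.co_firstlineno
            events.append((frame.f_code.co_name, offset, dict(frame.f_locals)))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier trace hook
    return result, events


def square(x):
    y = x * x
    return y


def total(n):
    acc = 0
    for i in range(n):
        acc += square(i)
    return acc


result_into, trace_into = record_trace(total, 2, step_into=True)
result_over, trace_over = record_trace(total, 2, step_into=False)

for name, offset, local_vars in trace_over:
    print(name, offset, local_vars)
```

With `step_into=False` the trace contains only lines of `total`, while `step_into=True` also includes the lines executed inside `square`. A neural debugger in the paper's sense would predict such states rather than obtain them by actually running the program.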
Taxonomy
Topics: Software Engineering Research · Computational Physics and Python Applications · Software Testing and Debugging Techniques
