Self-Execution Simulation Improves Coding Models
Gallil Maimon, Ori Yoran, Felix Kreuk, Michael Hassid, Gal Cohen, Pierre Chambon, Yossi Adi

TL;DR
This paper trains language models to simulate program execution step by step, then uses that ability to improve their performance on competitive programming tasks.
Contribution
The method combines supervised fine-tuning on natural language execution traces with reinforcement learning using verifiable rewards, and uses the resulting simulation ability for self-verification and self-fixing of generated code.
Findings
Models show consistent improvements on competitive programming benchmarks.
Execution simulation enables self-verification of solutions.
Ablation studies highlight the importance and limitations of execution simulation.
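To make the self-verification finding concrete, here is a minimal sketch of selecting among candidate solutions by predicted execution. It is an illustration under stated assumptions, not the paper's implementation: `self_verify`, `toy_simulate`, and the `solve` entry point are hypothetical names, and `toy_simulate` actually runs the code as a stand-in for a model that would predict the output from the program text.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: a simulator maps (code, input) to a predicted output string.
Simulator = Callable[[str, str], str]


def self_verify(candidates: List[str],
                tests: List[Tuple[str, str]],
                simulate: Simulator) -> str:
    """Return the candidate whose simulated outputs match the most tests."""
    def score(code: str) -> int:
        # Count the tests on which the predicted output equals the expected one.
        return sum(simulate(code, inp).strip() == expected.strip()
                   for inp, expected in tests)
    # max() keeps the earliest candidate when scores tie.
    return max(candidates, key=score)


def toy_simulate(code: str, inp: str) -> str:
    # Stand-in only: executes the code for real. In the setting described
    # above, a trained model would predict this output without running it.
    env: dict = {}
    exec(code, env)
    return str(env["solve"](inp))


candidates = [
    "def solve(s): return int(s) + 1",
    "def solve(s): return int(s) * 2",
]
tests = [("3", "6"), ("5", "10")]
best = self_verify(candidates, tests, toy_simulate)
```

In this toy run the second candidate matches both tests and is selected. The simulator is passed in as a plain function so a model-backed predictor could replace it without changing the selection logic.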
Abstract
A promising research direction in enabling LLMs to generate consistently correct code involves addressing their inability to properly estimate program execution, particularly for code they generate. In this work, we demonstrate that Code LLMs can be trained to simulate program execution in a step-by-step manner and that this capability can be leveraged to improve competitive programming performance. Our approach combines supervised fine-tuning on natural language execution traces, textual explanations grounded in true execution, with reinforcement learning using verifiable rewards. We introduce two complementary objectives: output prediction given code and inputs, and solving competitive programming tasks with either ground-truth or self-predicted execution feedback. These objectives enable models to perform self-verification over multiple candidate solutions, and iterative self-fixing…
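The two objectives in the abstract can be sketched as follows. This is a hedged illustration, not the authors' code: the whitespace-normalized exact-match reward is an assumption about what a verifiable reward might look like, and `output_prediction_reward`, `self_fix`, `simulate`, and `revise` are hypothetical names. `simulate` may be a real executor (ground-truth feedback) or a model's prediction (self-predicted feedback); `revise` stands in for the model rewriting its solution.

```python
from typing import Callable, List, Tuple

Failure = Tuple[str, str, str]  # (input, expected output, obtained output)


def output_prediction_reward(predicted: str, actual: str) -> float:
    """Binary reward: 1.0 if outputs match after whitespace normalization."""
    return 1.0 if predicted.split() == actual.split() else 0.0


def self_fix(code: str,
             tests: List[Tuple[str, str]],
             simulate: Callable[[str, str], str],
             revise: Callable[[str, List[Failure]], str],
             max_rounds: int = 3) -> Tuple[str, bool]:
    """Revise `code` until all tests pass under `simulate` or rounds run out."""
    for _ in range(max_rounds):
        failures: List[Failure] = []
        for inp, expected in tests:
            got = simulate(code, inp)
            if output_prediction_reward(got, expected) == 0.0:
                failures.append((inp, expected, got))
        if not failures:
            return code, True
        # Hand the failing cases back as feedback for the next attempt.
        code = revise(code, failures)
    # The last revision is returned unchecked once the budget is spent.
    return code, False


# Toy stand-ins: "code" is a Python expression in x; revise always proposes x * 2.
simulate = lambda code, inp: str(eval(code, {"x": int(inp)}))
revise = lambda code, failures: "x * 2"
fixed, passed = self_fix("x + 1", [("3", "6")], simulate, revise)
```

Here the first attempt fails the single test, the stub revision passes on the second round, and the loop stops early.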
