ReflexiCoder: Teaching Large Language Models to Self-Reflect on Generated Code and Self-Correct It via Reinforcement Learning

Juyong Jiang; Jiasi Shen; Sunghun Kim; Kang Min Yoo; Jeonghoon Kim; Sungju Kim

arXiv:2603.05863 · cs.CL · April 21, 2026

TL;DR

ReflexiCoder introduces a reinforcement learning framework enabling large language models to self-reflect and self-correct code internally, achieving state-of-the-art performance on multiple benchmarks without external feedback.

Contribution

The paper presents a novel RL-based training paradigm that internalizes the reasoning and correction process in the model itself, reducing reliance on external tools and improving inference efficiency.

Findings

01. Achieves new SOTA on seven code generation benchmarks.

02. Reduces inference compute overhead by approximately 40%.

03. Outperforms or rivals proprietary models such as GPT-5.1.

Abstract

While Large Language Models (LLMs) have revolutionized code generation, standard "System 1" approaches that generate solutions in a single forward pass often hit a performance ceiling on complex algorithmic tasks. Existing iterative refinement strategies attempt to bridge this gap at inference time, yet they predominantly rely on external oracles, execution feedback, or computationally expensive prompt-response cycles. In this work, we propose ReflexiCoder, a novel reinforcement learning (RL) framework that internalizes the structured reasoning trajectory, encompassing initial generation, bug- and optimization-aware reflection, and self-correction, directly into the model's weights. Unlike prior methods, ReflexiCoder shifts the paradigm from externally dependent refinement to intrinsic, fully autonomous self-reflection and self-correction at inference time. We utilize an…
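To make the three-stage trajectory named in the abstract (initial generation, reflection, self-correction) concrete, here is a minimal sketch of how a single model output containing all three stages might be split apart for inspection. The tag names `<generate>`, `<reflect>`, and `<correct>`, the `Trajectory` class, and `parse_trajectory` are all hypothetical and chosen for illustration; the abstract does not specify the model's actual output format.

```python
import re
from dataclasses import dataclass

# Hypothetical section markers (an assumption, not the paper's format):
# one model response is assumed to contain all three stages in order.
_SECTION = re.compile(r"<(generate|reflect|correct)>(.*?)</\1>", re.DOTALL)
_STAGES = ("generate", "reflect", "correct")


@dataclass
class Trajectory:
    initial_code: str    # first-pass solution
    reflection: str      # the model's own critique (bugs, optimizations)
    corrected_code: str  # revised solution after reflection


def parse_trajectory(text: str) -> Trajectory:
    """Split one response into its three stages; raise if any is absent."""
    parts = {name: body.strip() for name, body in _SECTION.findall(text)}
    missing = [stage for stage in _STAGES if stage not in parts]
    if missing:
        raise ValueError(f"missing sections: {missing}")
    return Trajectory(parts["generate"], parts["reflect"], parts["correct"])


# Illustrative response text (invented for this example, not model output).
sample = (
    "<generate>def add(a, b): return a - b</generate>"
    "<reflect>Bug: the function subtracts instead of adding.</reflect>"
    "<correct>def add(a, b): return a + b</correct>"
)
trajectory = parse_trajectory(sample)
```

The point of the sketch is that the whole trajectory arrives in one response, with no external oracle or execution feedback between stages, which is the contrast the abstract draws with prompt-response refinement loops.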

Code & Models

Repositories

juyongjiang/ReflexiCoder (GitHub)
