PYTHALAB-MERA: Validation-Grounded Memory, Retrieval, and Acceptance Control for Frozen-LLM Coding Agents
Mehmet Iscan

TL;DR
PYTHALAB-MERA is an external, validation-grounded controller for local LLM coding agents. It raises validation success in the tested reinforcement learning tasks by combining episodic memory, adaptive retrieval, and delayed credit assignment.
Contribution
The paper presents a lightweight external controller that improves validation success for local LLM coding agents by integrating memory, validation, and reward propagation mechanisms.
Findings
PYTHALAB-MERA passed 8/9 strict validations in the tested RL setting.
Baseline methods like self-refinement and GRACE passed 0/9 validations.
External memory-and-retrieval control improved validation success in recorded experiments.
Abstract
Local LLM-based coding agents increasingly work in settings where correctness is earned through execution feedback, persistent state, and bounded repair, not through a single fluent answer. Static retrieval, long-context prompting, self-refinement, execution-feedback repair, and reinforcement learning over model weights each address part of this setting, but they do not jointly provide validation-grounded episodic memory, adaptive retrieval-action selection, delayed credit assignment, and structural skill reuse around a frozen local model. We introduce PYTHALAB-MERA, a lightweight external controller for local validation-conditioned code generation. The frozen language model proposes complete source files; the controller decides which memory records and AST-derived skills should enter the next prompt, validates each candidate through a fail-fast pipeline, converts validation outcomes…
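To make the fail-fast validation idea in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the stage names (`syntax`, `structure`), the helpers `check_syntax` and `check_defines`, and the `ValidationResult` type are all hypothetical, and the actual pipeline stages are not specified in the text above. The sketch shows only the control pattern: stages run in order and the first failure stops the pipeline, so the controller receives a single, attributable failure signal per candidate file.

```python
import ast
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# A stage takes candidate source text and returns an error message,
# or None if the candidate passes. (Hypothetical interface.)
Stage = Callable[[str], Optional[str]]


@dataclass
class ValidationResult:
    passed: bool
    failed_stage: Optional[str]  # name of the first failing stage, if any
    message: str


def check_syntax(source: str) -> Optional[str]:
    """Stage 1 (assumed): the candidate must parse as Python."""
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return f"syntax error: {exc.msg} (line {exc.lineno})"
    return None


def check_defines(name: str) -> Stage:
    """Stage 2 (assumed): the candidate must define a given function or class."""
    def stage(source: str) -> Optional[str]:
        tree = ast.parse(source)
        defined = {
            node.name
            for node in ast.walk(tree)
            if isinstance(node, (ast.FunctionDef, ast.ClassDef))
        }
        return None if name in defined else f"missing definition: {name}"
    return stage


def validate(source: str, stages: List[Tuple[str, Stage]]) -> ValidationResult:
    """Run stages in order and stop at the first failure (fail-fast)."""
    for stage_name, stage in stages:
        error = stage(source)
        if error is not None:
            return ValidationResult(False, stage_name, error)
    return ValidationResult(True, None, "ok")


if __name__ == "__main__":
    stages = [("syntax", check_syntax), ("structure", check_defines("step"))]
    for candidate in ("def step(x):\n    return x + 1\n", "x = 1\n"):
        print(validate(candidate, stages))
```

Ordering cheap checks before expensive ones is the usual reason to fail fast: a candidate that does not parse never reaches later stages, and the name of the failing stage can be stored with the candidate as feedback for the next attempt.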
