A-ProS: Towards Reliable Autonomous Programming Through Multi-Model Feedback

Anika Tabassum; Md Sifat Hossain; Md. Fahim Arefin; Tariqul Islam; Tarannum Shaila Zaman

arXiv:2605.18073·cs.SE·May 19, 2026

TL;DR

A-ProS is an autonomous AI system that enhances competitive programming solutions by integrating multi-model feedback and specialized debugging, significantly improving solution correctness and reliability.

Contribution

This paper introduces A-ProS, a novel multi-model feedback framework that combines large language models with debugging critics for improved autonomous code generation.

Findings

1. GPT-5 workflows increase accepted solutions from 39 to 85-90 after refinement.
2. Stateful refinement outperforms stateless approaches by 8.5-10.6 percentage points.
3. A-ProS achieves over 2x gains compared to baseline agent loops.
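
The second finding contrasts stateful and stateless refinement, which this summary does not define. The sketch below shows one plausible reading, assuming that "stateful" means the generator sees every earlier attempt and its feedback, while "stateless" means it sees only the most recent one. All names here (`refine`, `toy_generate`, `toy_judge`, `toy_critique`, the `"AC"` verdict string) are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical type aliases for illustration only.
History = List[Tuple[str, str]]           # list of (code, feedback) pairs
Generator = Callable[[str, History], str]  # (problem, context) -> code
Critic = Callable[[str, str], str]         # (code, verdict) -> feedback
Judge = Callable[[str], str]               # code -> verdict ("AC" means accepted)


def refine(problem: str, generate: Generator, critique: Critic, judge: Judge,
           max_rounds: int = 3, stateful: bool = True) -> Optional[str]:
    """Generate, judge, and critique in a loop until accepted or out of rounds."""
    history: History = []
    for _ in range(max_rounds):
        # Assumed distinction: stateful passes the full history,
        # stateless passes only the latest (code, feedback) pair.
        context = history if stateful else history[-1:]
        code = generate(problem, context)
        verdict = judge(code)
        if verdict == "AC":
            return code
        history.append((code, critique(code, verdict)))
    return None


# Toy stand-ins so the loop can run without any model or online judge.
def toy_generate(problem: str, context: History) -> str:
    return f"attempt-{len(context)}"


def toy_judge(code: str) -> str:
    return "AC" if code == "attempt-2" else "WA"


def toy_critique(code: str, verdict: str) -> str:
    return f"{code} got {verdict}"


stateful_result = refine("toy problem", toy_generate, toy_critique, toy_judge, stateful=True)
stateless_result = refine("toy problem", toy_generate, toy_critique, toy_judge, stateful=False)
print(stateful_result, stateless_result)
```

The toy generator is rigged so that only accumulated context reaches the accepted answer. It illustrates the control flow of the two modes and says nothing about the size of the reported gap.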

Abstract

Large Language Models (LLMs) demonstrate strong potential for automated code generation, yet their ability to iteratively refine solutions using execution feedback remains underexplored. Competitive programming offers an ideal testbed for this investigation, as it demands end-to-end algorithmic reasoning, precise implementation under strict computational constraints, and complete functional correctness with rigorous evaluation. In this paper, we present A-ProS, an autonomous AI agent that solves competitive programming problems through a hybrid multi-model feedback framework separating solution generation from specialized debugging. A-ProS combines ChatGPT-based generators (GPT-4 and GPT-5) with three debugging critics: Codestral-2508, Llama-3.3-70B, and DeepSeek-R1, under a 2 x 3 factorial design. We evaluate six workflows on 367 problems from ICPC World Finals (2011-2024) and…
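
The 2 x 3 factorial design mentioned above pairs each of the two generators with each of the three debugging critics, which gives the six workflows. A minimal sketch of that enumeration follows. The model names come from the abstract, while the function name `enumerate_workflows` and the tuple representation are assumptions made for illustration.

```python
from itertools import product
from typing import List, Sequence, Tuple

# Model names as listed in the abstract.
GENERATORS = ["GPT-4", "GPT-5"]
CRITICS = ["Codestral-2508", "Llama-3.3-70B", "DeepSeek-R1"]


def enumerate_workflows(generators: Sequence[str],
                        critics: Sequence[str]) -> List[Tuple[str, str]]:
    """Return every (generator, critic) pairing of a full factorial design."""
    return list(product(generators, critics))


workflows = enumerate_workflows(GENERATORS, CRITICS)
for generator, critic in workflows:
    print(f"{generator} + {critic}")
```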
