Do Not Imitate, Reinforce: Iterative Classification via Belief Refinement

Mahdi Kallel; Johannes Tölle; Ahmed Hendawy; Carlo D'Eramo

arXiv:2604.22110 · cs.LG · April 27, 2026

TL;DR

The paper introduces Reinforced Iterative Classification (RIC), a reinforcement-learning method that iteratively refines its predictions. It offers adaptive computation and improved calibration while matching the accuracy of traditional supervised models.

Contribution

RIC replaces single-pass imitation with RL, enabling iterative prediction refinement, adaptive computation, and better calibration in classification tasks.

Findings

01 RIC matches supervised accuracy on image benchmarks.

02 RIC provides an anytime classifier with adaptive computation.

03 RIC improves calibration over traditional models.

Abstract

Standard supervised classification trains models to imitate the exact labels provided by a perfect oracle. This imitation happens in a single pass, restricting the model to a fixed compute budget even when inputs vary in complexity. Moreover, the rigid training objective forces the model to express absolute certainty on its training data, resulting in overconfident predictions during evaluation. We propose Reinforced Iterative Classification (RIC), which replaces the imitative objective with Reinforcement Learning (RL). RIC deploys a recurrent agent that iteratively updates a predictive distribution over classes, receiving reward for stepwise improvement in prediction quality. The value function provides a natural halting criterion by estimating the remaining scope for improvement. We prove that the iterative formulation recovers the same optimal predictions as cross-entropy while…
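
The reward and halting ideas in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: it assumes that "prediction quality" is measured by the log-probability of the true class, and that the value function estimates the improvement still to come. The names `stepwise_rewards` and `halting_step`, the toy logits, and the threshold are all hypothetical. Under that assumed reward, the undiscounted rewards telescope to the total gain in log-likelihood, which is one way to see why such an objective could agree with cross-entropy.

```python
import math

def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def stepwise_rewards(beliefs, label):
    """Assumed reward: gain in log-probability of the true class per step."""
    log_p = [math.log(b[label]) for b in beliefs]
    return [log_p[t] - log_p[t - 1] for t in range(1, len(log_p))]

def halting_step(value_estimates, threshold):
    """First step whose estimated remaining improvement is below threshold."""
    for t, v in enumerate(value_estimates):
        if v < threshold:
            return t
    return len(value_estimates) - 1

# Toy belief trajectory over 3 classes; the true class is index 2.
# In RIC a recurrent agent would produce these; here they are hand-written.
label = 2
beliefs = [softmax([0.0, 0.0, 0.0]),
           softmax([0.0, 0.0, 1.0]),
           softmax([0.0, 0.0, 3.0])]

rewards = stepwise_rewards(beliefs, label)

# Telescoping: summed rewards equal the total log-likelihood gain.
total_gain = math.log(beliefs[-1][label]) - math.log(beliefs[0][label])

# Stand-in for a learned value function: the true remaining improvement
# at each step. A trained critic would only approximate this.
values = [sum(rewards[t:]) for t in range(len(beliefs))]

stop_at = halting_step(values, threshold=0.5)
print(rewards, total_gain, stop_at)
```

With this trajectory the agent would stop after one refinement step, because the remaining improvement it expects at that point falls below the chosen threshold. Lowering the threshold trades more computation for a sharper final belief.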
