Can a Learner Regret Using a No-Regret Algorithm? A Control-Theoretic Study of Performance Dominance

Hassan Abdelraouf; Jeff S. Shamma

arXiv:2603.03173·eess.SY·March 4, 2026

Open Access

TL;DR

This paper investigates whether one no-regret learning algorithm can outperform another across all payoff environments. It finds that anticipatory replicator dynamics (RD) can globally dominate standard RD, indicating that a 'free-lunch' phenomenon can arise among no-regret algorithms.

Contribution

The paper introduces a control-theoretic framework for comparing no-regret algorithms and uses it to show that anticipatory replicator dynamics (RD) can globally outperform standard RD.

Findings

01 Anticipatory RD globally dominates standard RD in all payoff environments.

02 A passivity-based approach is used for performance comparison.

03 Optimal control formulation shows zero minimal reward gap.

Abstract

No-regret learning dynamics ensure that a learner asymptotically achieves an average reward no worse than that of any fixed strategy. This no-regret guarantee does not determine the value of the asymptotic average reward. Indeed, it is possible for different no-regret learning dynamics to exhibit different asymptotic average rewards when facing the same environment while both assure the no-regret guarantee. This paper asks whether a "free-lunch" phenomenon can arise among no-regret algorithms. Namely, is it possible for one no-regret learning rule to uniformly outperform another no-regret learning rule across all payoff environments? Stated differently, can a learner regret not using a particular no-regret algorithm? We consider generalized replicator dynamics (RD) as a cascade interconnection between a linear time-invariant (LTI) system and the softmax nonlinearity. Varying this LTI…
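
The cascade structure mentioned in the abstract can be illustrated with a minimal sketch. It assumes the LTI block is a pure integrator of the payoff vector, which is one common way to write standard RD, and it uses a forward-Euler discretization. The function names, the step size, and the oscillating payoff environment are hypothetical choices made for illustration and are not taken from the paper. The sketch reports the average reward together with the average regret against the best fixed action in hindsight, the quantity that the no-regret guarantee bounds.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax mapping scores to a mixed strategy."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def simulate_rd(payoff_fn, n_actions, T=200.0, dt=0.01):
    """Euler simulation of RD written as (integrator) -> (softmax).

    Assumption: the LTI block is a pure integrator. A different LTI
    system would replace the `z += dt * p` update below.
    """
    steps = int(round(T / dt))
    z = np.zeros(n_actions)           # LTI state: integrated payoffs
    cum_payoff = np.zeros(n_actions)  # integral of p(t), per fixed action
    cum_reward = 0.0                  # integral of x(t) . p(t)
    for k in range(steps):
        x = softmax(z)                # current mixed strategy
        p = payoff_fn(k * dt, x)      # payoff vector from the environment
        cum_reward += dt * float(x @ p)
        cum_payoff += dt * p
        z += dt * p                   # integrator update (Euler step)
    avg_reward = cum_reward / T
    # Average regret against the best fixed action in hindsight.
    avg_regret = cum_payoff.max() / T - avg_reward
    return avg_reward, avg_regret


def oscillating_payoff(t, x):
    """Hypothetical environment: two oscillating payoffs, one constant."""
    return np.array([np.sin(t), np.cos(t), 0.2])


avg_reward, avg_regret = simulate_rd(oscillating_payoff, n_actions=3)
print(f"average reward: {avg_reward:.4f}, average regret: {avg_regret:.4f}")
```

A small average regret here says only that the learner is not much worse than the best fixed action. It does not pin down the average reward itself, which is why two no-regret rules can still be compared on the same environment.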

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Game Theory and Applications