On epsilon-optimality of the pursuit learning algorithm
Ryan Martin, Omkar Tilak

TL;DR
This paper provides a theoretical analysis of the pursuit learning algorithm, emphasizing the importance of a decreasing tuning parameter for convergence, and fills gaps in existing convergence proofs.
Contribution
It offers a rigorous proof of probabilistic convergence for pursuit learning with a vanishing tuning parameter, addressing a gap in prior theoretical work.
Findings
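Probabilistic convergence of pursuit learning is established when the tuning parameter decreases to zero.
A gap in existing convergence proofs is identified and filled.
The proof highlights why a vanishing sequence of tuning parameters matters in theory, even though a fixed parameter is traditional in practice.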
Abstract
Estimator algorithms in learning automata are useful tools for adaptive, real-time optimization in computer science and engineering applications. This paper investigates theoretical convergence properties for a special case of estimator algorithms: the pursuit learning algorithm. In this note, we identify and fill a gap in existing proofs of probabilistic convergence for pursuit learning. It is traditional to take the pursuit learning tuning parameter to be fixed in practical applications, but our proof sheds light on the importance of a vanishing sequence of tuning parameters in a theoretical convergence analysis.
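For readers unfamiliar with pursuit learning, the following is a minimal sketch of the standard textbook form of the algorithm with a time-varying tuning parameter. It is not taken from the paper. The function name `pursuit_learning`, the Bernoulli reward model, and the default schedule `1 / (t + 10)` are all assumptions chosen for illustration; in particular, the schedule is hypothetical and is not claimed to satisfy the conditions of the paper's proof.

```python
import random


def pursuit_learning(reward_probs, steps=5000, schedule=None, seed=0):
    """Sketch of pursuit learning with a time-varying tuning parameter.

    reward_probs: assumed Bernoulli success probability of each action.
    schedule: maps the step index t (1, 2, ...) to a tuning parameter in (0, 1).
    """
    rng = random.Random(seed)
    r = len(reward_probs)
    if schedule is None:
        # Hypothetical vanishing schedule, for illustration only.
        schedule = lambda t: 1.0 / (t + 10)

    p = [1.0 / r] * r      # action probability vector, starts uniform
    counts = [0] * r       # number of times each action was tried
    successes = [0] * r    # number of rewards each action earned
    estimates = [0.0] * r  # running estimates of the reward probabilities

    for t in range(1, steps + 1):
        # Sample an action from the current probability vector.
        action = rng.choices(range(r), weights=p)[0]
        reward = 1 if rng.random() < reward_probs[action] else 0

        # Update the reward-probability estimate of the chosen action.
        counts[action] += 1
        successes[action] += reward
        estimates[action] = successes[action] / counts[action]

        # Pursue the action that currently looks best: move p toward
        # the unit vector for that action by a step of size lam.
        best = max(range(r), key=estimates.__getitem__)
        lam = schedule(t)
        p = [(1.0 - lam) * pj for pj in p]
        p[best] += lam

    return p, estimates
```

Passing `schedule=lambda t: 0.01` recovers the fixed-parameter variant that the abstract describes as traditional in practice, so the two settings can be compared on the same simulated environment, for example `pursuit_learning([0.2, 0.5, 0.8])`.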
