Optimistic Agents are Asymptotically Optimal
Peter Sunehag, Marcus Hutter

TL;DR
This paper introduces a class of reinforcement learning agents that use optimism to achieve asymptotically optimal behavior for any finite or compact class of environments, and it provides finite error bounds in the finite deterministic case.
Contribution
It presents a generic optimistic approach to asymptotically optimal reinforcement learning that applies to arbitrary finite or compact classes of environments, with finite error bounds in the finite deterministic case.
Findings
Achieves asymptotically optimal behavior for an arbitrary finite or compact class of environments.
Provides finite error bounds in the finite deterministic case.
Shows that optimism suffices as the design principle behind these guarantees.
Abstract
We use optimism to introduce generic asymptotically optimal reinforcement learning agents. They achieve, with an arbitrary finite or compact class of environments, asymptotically optimal behavior. Furthermore, in the finite deterministic case we provide finite error bounds.
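To make the idea of optimism over a finite deterministic class concrete, here is a minimal Python sketch. It is not the paper's algorithm. It assumes a much simpler, hypothetical setting in which every environment is a stateless, deterministic map from actions to rewards, all environments share the same action set, and the true environment belongs to the class. The names `Env` and `optimistic_run` are illustrative only. The agent keeps the set of environments consistent with what it has observed, acts as the most promising one suggests, and discards every environment that the observed reward contradicts.

```python
from typing import Dict, List, Tuple

# Assumption (not from the paper): an environment is a stateless,
# deterministic mapping from action name to reward.
Env = Dict[str, float]


def optimistic_run(env_class: List[Env], true_env: Env, steps: int) -> Tuple[List[float], int]:
    """Run an optimistic agent for `steps` steps.

    Returns the rewards received and the number of suboptimal steps.
    Assumes `true_env` is a member of `env_class`.
    """
    consistent = list(env_class)       # environments not yet contradicted
    best = max(true_env.values())      # optimal per-step reward, used only for counting mistakes
    rewards: List[float] = []
    mistakes = 0

    for _ in range(steps):
        # Optimism: pick the action with the highest reward promised
        # by any environment that is still consistent with the history.
        _, action = max(
            ((env[a], a) for env in consistent for a in env),
            key=lambda pair: pair[0],
        )

        reward = true_env[action]      # what actually happens
        rewards.append(reward)
        if reward < best:
            mistakes += 1

        # Eliminate every environment whose prediction was contradicted.
        consistent = [env for env in consistent if env[action] == reward]

    return rewards, mistakes


if __name__ == "__main__":
    env_class = [
        {"a": 1.0, "b": 0.0},
        {"a": 0.0, "b": 0.9},
        {"a": 0.2, "b": 0.5},
    ]
    rewards, mistakes = optimistic_run(env_class, true_env=env_class[2], steps=4)
    print(rewards, mistakes)  # [0.2, 0.5, 0.5, 0.5] 1
```

In this toy setting each suboptimal step eliminates at least one wrong environment, because the optimistic promise is at least as large as the true optimum and can only fail by being contradicted. The number of mistakes is therefore at most the class size minus one. This illustrates the flavor of a finite error bound under the stated assumptions, not the bound proved in the paper.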
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Computability, Logic, AI Algorithms
