State-free Reinforcement Learning
Mingyu Chen, Aldo Pacchiano, Xuezhou Zhang

TL;DR
This paper introduces a state-free reinforcement learning algorithm that operates without prior knowledge of the environment's state space, aiming for hyper-parameter free RL.
Contribution
It presents the first algorithm for state-free RL with regret bounds independent of the entire state space, advancing towards parameter-free RL.
Findings
Regret depends only on the reachable state set, not the entire state space.
The algorithm requires no prior information about the state space.
The result is a first step towards hyper-parameter-free reinforcement learning.
Abstract
In this work, we study the state-free RL problem, where the algorithm has no information about the states before interacting with the environment. Specifically, denoting the reachable state set by $S^\Pi$, we design an algorithm that requires no information on the state space $S$ while having a regret that is completely independent of $S$ and depends only on $S^\Pi$. We view this as a concrete first step towards parameter-free RL, with the goal of designing RL algorithms that require no hyper-parameter tuning.
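The reachable state set mentioned in the abstract can be pictured with a small tabular example. The sketch below is illustrative only and is not the paper's algorithm: it assumes the transition model is known (a state-free learner does not have this), that episodes start from a single initial state, and that a state counts as reachable when some sequence of actions visits it with positive probability within the horizon. The function name `reachable_states` and the toy MDP are hypothetical.

```python
def reachable_states(transitions, initial_state, horizon):
    """Return the states some policy can visit with positive probability.

    transitions: dict mapping (state, action) -> {next_state: probability}.
    horizon: number of steps per episode, so at most horizon - 1 transitions.
    All names here are illustrative assumptions, not taken from the paper.
    """
    # Collect, for each state, the successors reachable with positive probability.
    successors = {}
    for (state, _action), next_probs in transitions.items():
        for nxt, prob in next_probs.items():
            if prob > 0:
                successors.setdefault(state, set()).add(nxt)

    # Breadth-first search limited to horizon - 1 transitions.
    reached = {initial_state}
    frontier = {initial_state}
    for _ in range(horizon - 1):
        frontier = {n for s in frontier for n in successors.get(s, ())} - reached
        if not frontier:
            break
        reached |= frontier
    return reached


# Toy MDP with five nominal states; state 4 is never entered from state 0.
toy_transitions = {
    (0, "a"): {1: 1.0},
    (0, "b"): {0: 0.5, 2: 0.5},
    (1, "a"): {3: 1.0},
    (2, "a"): {2: 1.0},
    (2, "b"): {2: 1.0, 4: 0.0},  # zero-probability edge is ignored
    (3, "a"): {3: 1.0},
    (4, "a"): {0: 1.0},
}

print(sorted(reachable_states(toy_transitions, initial_state=0, horizon=3)))
```

In this toy example the nominal state space has five states, but only four of them can ever be visited, which is the gap between the full state space and the reachable set that the regret bound is stated to exploit.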
Peer Reviews
Decision: NeurIPS 2024 poster
* The algorithmic solution is quite elegant since it can be applied to any "basic" RL algorithm with regret guarantees.
* The final result achieves the desired removal of the dependency on S, which is replaced by the size of the reachable states.
* The result holds for both stochastic and adversarial settings and it can be extended to removing the dependency on the horizon H as well.
* I would encourage the authors to provide a clean comparison of the final bounds in the stochastic setting with the best available bounds. In particular, I'm wondering whether the restart leads to extra log terms.
* Related to the previous point, I suggest that the authors make explicit the bounds for simple doubling trick strategies, so as to have a point of comparison.
* What exactly is the role of epsilon? It looks like it can be directly set to 0 and everything works the same. Additional refere
Taxonomy
Topics: EEG and Brain-Computer Interfaces · Reinforcement Learning in Robotics
Methods: Sparse Evolutionary Training
