Understanding Deep Neural Function Approximation in Reinforcement Learning via $\epsilon$-Greedy Exploration
Fanghui Liu, Luca Viano, Volkan Cevher

TL;DR
This paper offers a theoretical analysis of deep neural network function approximation in reinforcement learning with epsilon-greedy exploration, focusing on neural architecture scaling and regret bounds.
Contribution
It provides the first theoretical insights into deep RL with epsilon-greedy exploration, analyzing neural network architecture requirements beyond linear models.
Findings
Scaling the width as $T^{d/(2\alpha+d)}$ is sufficient for deep RL.
Scaling the depth as $\log T$ suffices for deep neural networks.
A width of $\sqrt{T}$ is enough for two-layer Barron space networks.
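To get a feel for how these rates grow, the following is a minimal sketch that evaluates the three expressions for a given number of episodes $T$, feature dimension $d$, and smoothness $\alpha$. The function name `suggested_scaling` is hypothetical, and all constants and logarithmic factors are set to 1, which is an assumption made purely for illustration; the findings above state orders of growth only.

```python
import math


def suggested_scaling(T, d, alpha):
    """Evaluate the growth rates listed in the findings.

    Assumption: hidden constants and log factors are taken to be 1,
    so the numbers are orders of magnitude, not prescriptions.
    """
    return {
        # Width rate for deep networks: T^{d / (2*alpha + d)}
        "deep_width": T ** (d / (2 * alpha + d)),
        # Depth rate for deep networks: log T
        "deep_depth": math.log(T),
        # Width rate for two-layer Barron space networks: sqrt(T)
        "barron_width": math.sqrt(T),
    }


rates = suggested_scaling(T=10_000, d=4, alpha=2)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

As a matter of arithmetic, the exponent $d/(2\alpha+d)$ equals $1/2$ when $\alpha = d/2$, and it shrinks as $\alpha$ grows, so smoother Q-functions call for narrower networks under the first rate.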
Abstract
This paper provides a theoretical study of deep neural function approximation in reinforcement learning (RL) with $\epsilon$-greedy exploration in the online setting. This problem setting is motivated by the successful deep Q-networks (DQN) framework, which falls in this regime. In this work, we make an initial attempt at theoretically understanding deep RL from the perspective of function classes and neural network architectures (e.g., width and depth) beyond the "linear" regime. Specifically, we focus on the value-based algorithm with $\epsilon$-greedy exploration via deep (and two-layer) neural networks endowed by Besov (and Barron) function spaces, respectively, which aims at approximating an $\alpha$-smooth Q-function in a $d$-dimensional feature space. We prove that, with $T$ episodes, scaling the width and the…
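For readers unfamiliar with the exploration rule the abstract refers to, here is a minimal sketch of generic $\epsilon$-greedy action selection over a list of estimated Q-values. It is not the paper's algorithm: the function name `epsilon_greedy_action` and the plain-list representation of Q-values are assumptions for illustration, and in a DQN-style setup the Q-values would come from a neural network.

```python
import random


def epsilon_greedy_action(q_values, epsilon, rng=random):
    """Pick an action index from estimated Q-values.

    With probability epsilon, explore by sampling a uniformly random action;
    otherwise exploit by taking the action with the largest Q-value.
    (Hypothetical helper; q_values is assumed to be a non-empty sequence.)
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Greedy choice; ties resolve to the lowest index.
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Example: three actions with estimated values for the current state.
q = [0.1, 0.7, 0.3]
greedy = epsilon_greedy_action(q, epsilon=0.0)   # always exploits
explore = epsilon_greedy_action(q, epsilon=1.0)  # always explores
```

Setting `epsilon=0.0` recovers the purely greedy policy, while `epsilon=1.0` gives uniform random exploration; values in between trade off the two.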
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Reinforcement Learning in Robotics
