
TL;DR
This paper investigates stability and convergence issues in reinforcement learning by analyzing a linear quadratic example, interpreting Q-learning convergence through monotone schemes, and discussing the impact of function approximation.
Contribution
It introduces a monotone scheme perspective to understand Q-learning convergence and explores how function approximation affects stability.
Findings
The convergence criterion of exact Q-learning can be interpreted in the sense of a monotone scheme.
Function approximation affects the monotonicity properties, and hence the stability, of Q-learning.
The linear quadratic example offers insight into stability issues in deep reinforcement learning.
Abstract
Stability issues with reinforcement learning methods persist. To better understand some of these stability and convergence issues involving deep reinforcement learning methods, we examine a simple linear quadratic example. We interpret the convergence criterion of exact Q-learning in the sense of a monotone scheme and discuss consequences of function approximation on monotonicity properties.
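To make the setting concrete, here is a minimal sketch of exact Q-iteration on a scalar, discounted linear quadratic problem. It is not taken from the paper. The dynamics x' = a*x + b*u, the stage cost q*x^2 + r*u^2, the discount gamma, and all parameter values are assumptions chosen for illustration. The sketch shows only the elementary monotonicity of the Bellman operator: started from zero, the quadratic value coefficient never decreases. The paper's notion of a monotone scheme may be more specific than this.

```python
def exact_q_iteration(a=1.0, b=1.0, q=1.0, r=1.0, gamma=0.9, n_iters=50):
    """Exact Q-iteration for a scalar LQ problem (illustrative assumptions).

    The Q-function is kept in the quadratic form
        Q(x, u) = h_xx * x^2 + 2 * h_xu * x * u + h_uu * u^2,
    and the value function is V(x) = p * x^2 with p = min over u of Q(1, u).
    Returns the list of value coefficients p_0, p_1, ..., p_n.
    """
    p = 0.0  # start from the zero value function
    history = [p]
    for _ in range(n_iters):
        # Bellman backup: Q(x, u) = q x^2 + r u^2 + gamma * p * (a x + b u)^2
        h_xx = q + gamma * p * a * a
        h_xu = gamma * p * a * b
        h_uu = r + gamma * p * b * b
        # Minimizing over u gives u* = -(h_xu / h_uu) * x, so
        # V(x) = (h_xx - h_xu^2 / h_uu) * x^2
        p = h_xx - h_xu * h_xu / h_uu
        history.append(p)
    return history


if __name__ == "__main__":
    coefficients = exact_q_iteration()
    print(coefficients[:5], coefficients[-1])
```

With these assumed parameters the iterates increase toward the fixed point of the discounted Riccati recursion. Replacing the exact quadratic representation with a fitted approximator is the step at which this ordering is no longer guaranteed, which is the kind of effect the abstract refers to.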
Taxonomy
Topics: Face and Expression Recognition
Methods: Q-Learning
