The Impact of Data Distribution on Q-learning with Function Approximation
Pedro P. Santos, Diogo S. Carvalho, Alberto Sardinha, Francisco S. Melo

TL;DR
This paper investigates how the properties of data distribution affect the performance of Q-learning algorithms with function approximation, combining theoretical insights and empirical validation across different environments.
Contribution
It introduces a novel four-state MDP to analyze data distribution effects and provides a comprehensive theoretical and empirical study on offline Q-learning performance.
Findings
High-entropy data distributions improve offline learning.
Data diversity and quality jointly enhance offline Q-learning.
Theoretical bounds connect data properties to algorithm performance.
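As a rough illustration of what "high entropy" means for an offline dataset, the sketch below computes the normalized entropy of the empirical state-action distribution. It assumes a finite MDP and a dataset given as a list of (state, action) pairs; the function name `normalized_entropy` and the two toy datasets are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def normalized_entropy(pairs, num_states, num_actions):
    """Entropy of the empirical (state, action) distribution, scaled to [0, 1].

    A value of 1 means the dataset covers every state-action pair uniformly;
    values near 0 mean the data is concentrated on very few pairs.
    """
    if not pairs:
        raise ValueError("dataset must contain at least one (state, action) pair")
    counts = Counter(pairs)
    total = len(pairs)
    # Shannon entropy of the empirical distribution (natural log).
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Maximum entropy is attained by the uniform distribution over all pairs.
    max_entropy = math.log(num_states * num_actions)
    return entropy / max_entropy


# Hypothetical datasets over 4 states and 2 actions (assumed sizes).
uniform_data = [(s, a) for s in range(4) for a in range(2)] * 10
narrow_data = [(0, 0)] * 79 + [(1, 1)]

uniform_score = normalized_entropy(uniform_data, num_states=4, num_actions=2)
narrow_score = normalized_entropy(narrow_data, num_states=4, num_actions=2)
```

Here `uniform_score` is close to 1 while `narrow_score` is close to 0, which is the kind of contrast the first finding refers to. Entropy alone says nothing about data quality, which the second finding treats as a separate factor.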
Abstract
We study the interplay between the data distribution and Q-learning-based algorithms with function approximation. We provide a unified theoretical and empirical analysis as to how different properties of the data distribution influence the performance of Q-learning-based algorithms. We connect different lines of research, as well as validate and extend previous results. We start by reviewing theoretical bounds on the performance of approximate dynamic programming algorithms. We then introduce a novel four-state MDP specifically tailored to highlight the impact of the data distribution in the performance of Q-learning-based algorithms with function approximation, both online and offline. Finally, we experimentally assess the impact of the data distribution properties on the performance of two offline Q-learning-based algorithms under different environments. According to our results: (i)…
Taxonomy
Topics: Reinforcement Learning in Robotics · Optimization and Search Problems · Data Stream Mining Techniques
Methods: Q-Learning
