$\ell_1$ Regularized Gradient Temporal-Difference Learning
Dominik Meyer, Hao Shen, Klaus Diepold

TL;DR
This paper introduces $\ell_1$ regularized Gradient TD algorithms for stable, efficient off-policy learning with large feature sets, combining regularization with convergence guarantees and empirical validation.
Contribution
It develops a new family of $\ell_1$ regularized GTD algorithms using soft thresholding, addressing overfitting and computational issues in high-dimensional settings.
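As a minimal sketch of the soft-thresholding operator named above, assuming NumPy: each weight is shrunk toward zero by a threshold and set exactly to zero when its magnitude falls below that threshold, which is what yields sparse weight vectors. The function name `soft_threshold` and the example values are hypothetical and not taken from the paper.

```python
import numpy as np

def soft_threshold(w, tau):
    """Elementwise soft thresholding: sign(w) * max(|w| - tau, 0).

    Hypothetical helper for illustration; `tau` is the shrinkage threshold.
    """
    w = np.asarray(w, dtype=float)
    return np.sign(w) * np.maximum(np.abs(w) - tau, 0.0)

# Entries with magnitude below tau become zero; the rest shrink by tau.
weights = np.array([0.8, -0.05, 0.2, -1.5])
sparse_weights = soft_threshold(weights, tau=0.1)
print(sparse_weights)
```

Weights that survive the threshold correspond to features the regularizer keeps, which is the sense in which the operator performs feature selection.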
Findings
Algorithms converge under certain conditions.
Regularization improves feature selection and stability.
Numerical experiments demonstrate effectiveness.
Abstract
In this paper, we study Temporal Difference (TD) learning with linear value function approximation. It is well known that most TD learning algorithms are unstable with linear function approximation and off-policy learning. The recent development of Gradient TD (GTD) algorithms has addressed this problem successfully. However, the success of GTD algorithms requires a set of well-chosen features, which are not always available. When the number of features is huge, the GTD algorithms might face the problem of overfitting and being computationally expensive. To cope with this difficulty, regularization techniques, in particular $\ell_1$ regularization, have attracted significant attention in developing TD learning algorithms. The present work combines the GTD algorithms with $\ell_1$ regularization. We propose a family of $\ell_1$ regularized GTD algorithms, which employ the well known…
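The abstract is truncated here, so the paper's exact algorithms are not reproduced. Purely as an illustration of how a GTD-style update and $\ell_1$ shrinkage might be combined, the sketch below takes one standard GTD2 step for a single transition and then soft-thresholds the main weights. The choice of GTD2, the function names, the step sizes, and the threshold `alpha * lam` are all assumptions, and importance weighting for off-policy data is omitted for brevity.

```python
import numpy as np

def soft_threshold(w, tau):
    """Elementwise soft thresholding: sign(w) * max(|w| - tau, 0)."""
    return np.sign(w) * np.maximum(np.abs(w) - tau, 0.0)

def l1_gtd2_step(theta, w, phi, phi_next, reward, gamma, alpha, beta, lam):
    """One GTD2 update followed by an l1 shrinkage step (illustrative only).

    theta: main weights; w: auxiliary weights; phi, phi_next: feature vectors
    of the current and next state. All names here are hypothetical.
    """
    # TD error under the current linear value estimate.
    delta = reward + gamma * (theta @ phi_next) - (theta @ phi)
    phi_w = phi @ w
    # Standard GTD2 updates for the main and auxiliary weights.
    theta_new = theta + alpha * (phi - gamma * phi_next) * phi_w
    w_new = w + beta * (delta - phi_w) * phi
    # Assumed proximal step: shrink the main weights toward zero.
    theta_new = soft_threshold(theta_new, alpha * lam)
    return theta_new, w_new

theta0 = np.zeros(3)
w0 = np.array([0.5, 0.0, 0.0])
phi = np.array([1.0, 0.0, 0.0])
phi_next = np.array([0.0, 1.0, 0.0])
theta1, w1 = l1_gtd2_step(theta0, w0, phi, phi_next, reward=1.0,
                          gamma=0.9, alpha=0.1, beta=0.1, lam=0.1)
print(theta1, w1)
```

Applying the shrinkage only to the main weights is a design choice of this sketch; the weight on the unused third feature stays exactly zero after the step.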
Taxonomy
Topics: Machine Learning and ELM · Advanced Adaptive Filtering Techniques · Sparse and Compressive Sensing Techniques
