Manifold Regularization for Kernelized LSTD

Xinyan Yan; Krzysztof Choromanski; Byron Boots; Vikas Sindhwani

arXiv:1710.05387 · cs.LG · October 17, 2017 · 2 cites

TL;DR

This paper introduces a manifold-regularized, kernelized method for policy evaluation in reinforcement learning. It uses the intrinsic geometry of the state space, learned from data, to improve sample efficiency and accuracy in Q-function approximation.

Contribution

It integrates manifold regularization into kernelized policy evaluation. Applied within the Least-Squares Policy Iteration (LSPI) framework, the method yields better policies than widely used parametric basis functions on two standard benchmarks.

Findings

01 Superior policy quality on benchmark tasks

02 Improved sample efficiency and accuracy

03 Effective use of intrinsic state space geometry

Abstract

Policy evaluation, or value function or Q-function approximation, is a key procedure in reinforcement learning (RL). It is a necessary component of policy iteration and can be used for variance reduction in policy gradient methods. Therefore, its quality has a significant impact on most RL algorithms. Motivated by manifold regularized learning, we propose a novel kernelized policy evaluation method that takes advantage of the intrinsic geometry of the state space learned from data, in order to achieve better sample efficiency and higher accuracy in Q-function approximation. Applying the proposed method in the Least-Squares Policy Iteration (LSPI) framework, we observe superior performance compared to widely used parametric basis functions on two standard benchmarks in terms of policy quality.
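To make the idea in the abstract concrete, here is a minimal sketch of what a manifold-regularized kernel LSTD solve could look like. It is not taken from the paper. It assumes an RBF kernel, a dense graph Laplacian built from the same kernel weights, and a linear system formed by analogy with Laplacian regularized least squares; the paper's exact estimator may differ. All function names and parameters (`rbf_kernel`, `graph_laplacian`, `fit_q`, `predict_q`, `lam`, `mu`, `bandwidth`) are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


def graph_laplacian(Z, bandwidth=1.0):
    """Unnormalized graph Laplacian L = D - W with RBF edge weights (assumed choice)."""
    W = rbf_kernel(Z, Z, bandwidth)
    np.fill_diagonal(W, 0.0)  # no self-loops
    return np.diag(W.sum(axis=1)) - W


def fit_q(Z, r, Z_next, gamma=0.9, lam=1e-3, mu=1e-2, bandwidth=1.0):
    """Solve for kernel weights alpha so that Q(z) = sum_i alpha_i k(z_i, z).

    Z      : (n, d) features of sampled state-action pairs
    r      : (n,)   observed rewards
    Z_next : (n, d) features of next state-action pairs under the evaluated policy

    Assumed system (a sketch, not the paper's derivation):
        (K - gamma * K_next + lam * I + mu * L @ K) alpha = r
    where lam is the ridge weight and mu the manifold-regularization weight.
    """
    n = Z.shape[0]
    K = rbf_kernel(Z, Z, bandwidth)            # k(z_i, z_j)
    K_next = rbf_kernel(Z_next, Z, bandwidth)  # k(z'_i, z_j)
    L = graph_laplacian(Z, bandwidth)
    A = K - gamma * K_next + lam * np.eye(n) + mu * L @ K
    return np.linalg.solve(A, r)


def predict_q(alpha, Z_train, Z_query, bandwidth=1.0):
    """Evaluate the fitted Q-function at the query points."""
    return rbf_kernel(Z_query, Z_train, bandwidth) @ alpha
```

In an LSPI-style loop, `Z_next` would be rebuilt from the current greedy policy at each iteration and `fit_q` called again. Setting `mu=0` drops the Laplacian term and leaves a ridge-regularized kernel solve, which gives a simple baseline for judging what the geometric term contributes.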

Taxonomy

Topics: Human Pose and Action Recognition · Advanced Numerical Analysis Techniques · Model Reduction and Neural Networks