Matrix Low-Rank Approximation For Policy Gradient Methods

Sergio Rozada; Antonio G. Marques

arXiv:2405.17626 · cs.LG · May 29, 2024

TL;DR

This paper introduces low-rank matrix models for policy gradient methods in reinforcement learning, offering an efficient alternative to neural networks by reducing computational and sample complexities while maintaining performance.

Contribution

It proposes a novel low-rank matrix approach to estimate policy parameters, addressing challenges of neural network architectures and convergence in policy gradient methods.

Findings

01. Low-rank matrix models reduce computational complexity.

02. Sample efficiency improves with the matrix approach.

03. Performance is comparable to that of neural network-based policies.

Abstract

Estimating a policy that maps states to actions is a central problem in reinforcement learning. Traditionally, policies are inferred from the so-called value functions (VFs), but exact VF computation suffers from the curse of dimensionality. Policy gradient (PG) methods bypass this by directly learning a parametric stochastic policy. Typically, the parameters of the policy are estimated using neural networks (NNs) tuned via stochastic gradient descent. However, finding adequate NN architectures can be challenging, and convergence issues are common as well. In this paper, we put forth low-rank matrix-based models to efficiently estimate the parameters of PG algorithms. We collect the parameters of the stochastic policy into a matrix, and then we leverage matrix-completion techniques to promote (enforce) low rank. We demonstrate via numerical studies how low-rank matrix-based policy…
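To make the idea of collecting policy parameters into a low-rank matrix more concrete, here is a minimal sketch. It is not the authors' implementation: the softmax policy over a small discrete state-action space, the toy reward, the REINFORCE-style update, and all names and sizes (`n_states`, `n_actions`, `rank`, `lr`) are assumptions chosen for illustration. Low rank is enforced by factorizing the parameter matrix as `L @ R.T`, which is one common device from the matrix-completion literature.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, rank = 20, 5, 2  # hypothetical toy sizes

# Factorized logits: Theta = L @ R.T has rank at most `rank` by construction.
L = 0.1 * rng.standard_normal((n_states, rank))
R = 0.1 * rng.standard_normal((n_actions, rank))


def policy(s):
    """Softmax policy over actions, read from row s of the low-rank matrix."""
    logits = L[s] @ R.T
    z = np.exp(logits - logits.max())  # subtract max for numerical stability
    return z / z.sum()


def reward(s, a):
    """Hypothetical reward: the rewarded action depends on the state index."""
    return 1.0 if a == s % n_actions else 0.0


lr = 0.5  # assumed step size
for _ in range(5000):
    s = rng.integers(n_states)
    p = policy(s)
    a = rng.choice(n_actions, p=p)
    r = reward(s, a)

    # Gradient of log pi(a|s) with respect to the logits of state s.
    g = -p
    g[a] += 1.0

    # Chain rule through Theta[s] = L[s] @ R.T.
    grad_L = g @ R               # shape (rank,)
    grad_R = np.outer(g, L[s])   # shape (n_actions, rank)

    # REINFORCE-style stochastic gradient ascent on the two factors.
    L[s] += lr * r * grad_L
    R += lr * r * grad_R

theta = L @ R.T  # full parameter matrix, never stored during training
n_params_low_rank = L.size + R.size
n_params_full = n_states * n_actions
```

In this sketch the factorized model stores `(n_states + n_actions) * rank` numbers instead of `n_states * n_actions`, which is where the reduction in parameters comes from when the rank is small.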

Code & Models

Repositories

sergiorozada12/matrix-low-rank-pg (PyTorch, official)

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Age of Information Optimization