Multilinear Tensor Low-Rank Approximation for Policy-Gradient Methods in Reinforcement Learning

Sergio Rozada; Hoi-To Wai; Antonio G. Marques

arXiv:2501.04879 · cs.LG · January 10, 2025

TL;DR

This paper introduces tensor low-rank policy models using PARAFAC decomposition to improve reinforcement learning efficiency, reducing computational and sample complexities while maintaining performance.

Contribution

It proposes a novel tensor low-rank approach for policy parameter estimation in RL, with theoretical guarantees and empirical validation showing advantages over neural networks.

Findings

01. Tensor low-rank policies reduce computational complexity.

02. They achieve similar rewards to neural networks.

03. Theoretical guarantees support the method's effectiveness.

Abstract

Reinforcement learning (RL) aims to estimate the action to take given a (time-varying) state, with the goal of maximizing a cumulative reward function. Predominantly, there are two families of algorithms to solve RL problems: value-based and policy-based methods, with the latter designed to learn a probabilistic parametric policy from states to actions. Most contemporary approaches implement this policy using a neural network (NN). However, NNs usually face issues related to convergence, architectural suitability, hyper-parameter selection, and underutilization of the redundancies of the state-action representations (e.g. locally similar states). This paper postulates multi-linear mappings to efficiently estimate the parameters of the RL policy. More precisely, we leverage the PARAFAC decomposition to design tensor low-rank policies. The key idea involves collecting the policy…
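The abstract is cut off before it describes the construction, so the following is only a minimal sketch of what a PARAFAC-parameterized policy could look like, not the authors' implementation. It assumes a discretized state with two dimensions, a softmax policy over discrete actions, and logits stored in a tensor indexed by (state dimensions..., action). All names (`init_factors`, `parafac_entry`, `policy_probs`) and the sizes are hypothetical. In a rank-K PARAFAC model, each tensor entry is a sum over K components of the product of one factor entry per dimension.

```python
import math
import random


def init_factors(dims, rank, seed=0):
    """Return one factor matrix per tensor dimension, each of shape dims[d] x rank."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n)]
        for n in dims
    ]


def parafac_entry(factors, index):
    """Entry of the rank-K PARAFAC tensor: sum_k prod_d factors[d][index[d]][k]."""
    rank = len(factors[0][0])
    total = 0.0
    for k in range(rank):
        prod = 1.0
        for factor, i in zip(factors, index):
            prod *= factor[i][k]
        total += prod
    return total


def policy_probs(factors, state):
    """Softmax over the last (action) dimension for a discretized state index."""
    n_actions = len(factors[-1])
    logits = [parafac_entry(factors, tuple(state) + (a,)) for a in range(n_actions)]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(logit - top) for logit in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


# Hypothetical sizes: two discretized state dimensions and four actions.
dims = (10, 10, 4)
rank = 3
factors = init_factors(dims, rank)
probs = policy_probs(factors, (2, 7))

# Parameter counts: full logit tensor versus the low-rank factors.
n_full = math.prod(dims)
n_low_rank = rank * sum(dims)
```

Under these assumed sizes the full tensor holds 10 × 10 × 4 = 400 entries, while the factors hold 3 × (10 + 10 + 4) = 72 parameters. The gap widens as more state dimensions are added, since the full tensor grows multiplicatively and the factor count grows additively.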

Code & Models

Repositories

sergiorozada12/tensor-low-rank-pg
PyTorch · Official

Taxonomy

Topics: Adaptive Dynamic Programming Control · Energy Harvesting in Wireless Networks · Reinforcement Learning in Robotics