Spectral Entry-wise Matrix Estimation for Low-Rank Reinforcement Learning

Stefan Stojanovic; Yassir Jedra; Alexandre Proutiere

arXiv:2310.06793 · cs.LG · October 31, 2023

TL;DR

This paper introduces spectral entry-wise matrix estimation methods tailored for low-rank reinforcement learning problems, enabling improved algorithms for bandits and MDPs with theoretical guarantees.

Contribution

The paper demonstrates that spectral methods can recover low-rank matrices with low entry-wise error in RL settings, leading to new algorithms with state-of-the-art performance guarantees.

Findings

01 Spectral methods recover singular subspaces efficiently.

02 Entry-wise error is nearly minimal with these methods.

03 Algorithms achieve state-of-the-art guarantees in RL tasks.

Abstract

We study matrix estimation problems arising in reinforcement learning (RL) with low-rank structure. In low-rank bandits, the matrix to be recovered specifies the expected arm rewards, and for low-rank Markov Decision Processes (MDPs), it may, for example, characterize the transition kernel of the MDP. In both cases, each entry of the matrix carries important information, and we seek estimation methods with low entry-wise error. Importantly, these methods further need to accommodate inherent correlations in the available data (e.g., for MDPs, the data consists of system trajectories). We investigate the performance of simple spectral-based matrix estimation approaches: we show that they efficiently recover the singular subspaces of the matrix and exhibit nearly-minimal entry-wise error. These new results on low-rank matrix estimation make it possible to devise reinforcement learning…
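
As a rough illustration of what "spectral-based matrix estimation" and "entry-wise error" mean here, the sketch below applies a plain truncated SVD to a noisy observation of a synthetic low-rank matrix and reports the largest absolute error over all entries. It is not the paper's algorithm: the i.i.d. Gaussian noise, the matrix sizes, the known rank, and the helper names (`make_low_rank`, `spectral_estimate`, `max_entrywise_error`) are all assumptions chosen for illustration, and the correlated trajectory data discussed in the abstract is not modeled.

```python
import numpy as np


def make_low_rank(n_rows, n_cols, rank, rng):
    """Build a synthetic rank-`rank` matrix (illustrative assumption, not from the paper)."""
    left = rng.standard_normal((n_rows, rank))
    right = rng.standard_normal((n_cols, rank))
    return left @ right.T / np.sqrt(rank)


def spectral_estimate(observed, rank):
    """Keep only the top-`rank` singular triplets of the observed matrix."""
    u, s, vt = np.linalg.svd(observed, full_matrices=False)
    # Scale the leading left singular vectors by their singular values,
    # then project back with the leading right singular vectors.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


def max_entrywise_error(estimate, truth):
    """Largest absolute deviation over all entries (max-norm error)."""
    return float(np.max(np.abs(estimate - truth)))


rng = np.random.default_rng(0)
truth = make_low_rank(200, 150, 3, rng)

# Assumed observation model: every entry seen once with i.i.d. Gaussian noise.
noisy = truth + 0.5 * rng.standard_normal(truth.shape)

estimate = spectral_estimate(noisy, rank=3)
raw_error = max_entrywise_error(noisy, truth)
spectral_error = max_entrywise_error(estimate, truth)

print(f"max entry-wise error, raw observations:  {raw_error:.3f}")
print(f"max entry-wise error, spectral estimate: {spectral_error:.3f}")
```

Under these assumptions, truncating to the leading singular subspaces discards most of the noise, so the worst-case entry error of the estimate is smaller than that of the raw observations. The max-norm is used rather than the Frobenius norm because, as the abstract stresses, every individual entry matters in the bandit and MDP settings.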
