Sparse Offline Reinforcement Learning with Corruption Robustness

Nam Phuong Tran; Andi Nika; Goran Radanovic; Long Tran-Thanh; Debmalya Mandal

arXiv:2512.24768·stat.ML·May 13, 2026

TL;DR

This paper develops offline reinforcement learning methods for high-dimensional sparse MDPs that remain robust when an adversary corrupts a fraction of the collected data, and it provides the first non-vacuous guarantees for this setting.

Contribution

It introduces actor-critic algorithms built on sparse robust estimators. These avoid the difficulty of integrating sparsity into Least-Squares Value Iteration (LSVI), the standard approach for robust offline RL, and remain robust under contamination in high-dimensional sparse MDPs.

Findings

01 Proposes actor-critic methods with sparse robust estimators.

02 Provides the first non-vacuous guarantees for sparse offline RL under contamination.

03 Extends results to settings with strong data corruption.

Abstract

We investigate robustness to strong data corruption in offline sparse reinforcement learning (RL). In our setting, an adversary may arbitrarily perturb a fraction of the collected trajectories from a high-dimensional but sparse Markov decision process, and our goal is to estimate a near-optimal policy. The main challenge is that, in the high-dimensional regime where the number of samples $N$ is smaller than the feature dimension $d$, exploiting sparsity is essential for obtaining non-vacuous guarantees but has not been systematically studied in offline RL. We analyse the problem under uniform coverage and sparse single-concentrability assumptions. While Least-Squares Value Iteration (LSVI), a standard approach for robust offline RL, performs well under uniform coverage, we show that integrating sparsity into LSVI is unnatural, and its analysis may break down due to overly pessimistic…
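To make the setting concrete, the sketch below builds a toy sparse linear regression with $N < d$ in which a fraction of the responses is corrupted, and fits it with trimmed iterative hard thresholding. This is not the paper's estimator or its actor-critic algorithm: trimmed hard thresholding is a generic robust sparse regression technique swapped in for illustration. The function names, problem sizes, step size, and the response-only corruption are all assumptions.

```python
import numpy as np


def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v; zero out the rest."""
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-s:]
    out[keep] = v[keep]
    return out


def trimmed_iht(X, y, s, eps, lr=0.5, steps=300):
    """Illustrative trimmed iterative hard thresholding (hypothetical helper).

    Each step discards the eps-fraction of samples with the largest
    residuals, takes a gradient step on the remaining samples, and
    projects the iterate back onto s-sparse vectors.
    """
    n, d = X.shape
    n_keep = n - int(np.ceil(eps * n))  # samples retained after trimming
    theta = np.zeros(d)
    for _ in range(steps):
        resid = X @ theta - y
        inliers = np.argsort(np.abs(resid))[:n_keep]
        grad = X[inliers].T @ resid[inliers] / n_keep
        theta = hard_threshold(theta - lr * grad, s)
    return theta


# Toy high-dimensional regime: fewer samples than features (assumed sizes).
rng = np.random.default_rng(0)
N, d, s, eps = 100, 400, 3, 0.1
X = rng.standard_normal((N, d))
theta_star = np.zeros(d)
theta_star[:s] = [2.0, -1.5, 1.0]  # s-sparse ground truth
y = X @ theta_star + 0.1 * rng.standard_normal(N)

# Stand-in for the adversary: shift an eps-fraction of responses.
# The paper allows arbitrary perturbations of trajectories; this is
# only one simple, assumed corruption pattern.
bad = rng.choice(N, size=int(eps * N), replace=False)
y[bad] += 50.0

theta_hat = trimmed_iht(X, y, s, eps)
print("nonzeros:", np.count_nonzero(theta_hat))
print("l2 error:", np.linalg.norm(theta_hat - theta_star))
```

The sparsity projection is what keeps the problem well posed when $N < d$, and the trimming step is what limits the influence of corrupted samples. Either ingredient alone would not address both difficulties described in the abstract.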
