Data Deletion Can Help in Adaptive RL

Param Budhraja; Aditya Gangrade; Alex Olshevsky; Venkatesh Saligrama

arXiv:2605.00298 · cs.LG · May 4, 2026

TL;DR

This paper demonstrates that randomly deleting training data between collection rounds improves the robustness of reinforcement learning policies in time-varying environments, by mitigating the mismatch between the data in the training buffer and the data seen at deployment.

Contribution

The paper introduces a simple data deletion trick that makes the context estimator more robust, and provides a theoretical analysis of when deletion is beneficial under distribution mismatch.

Findings

01 Data deletion reduces the robustness gap by 30% for MLPs.

02 Deletion allows smaller models to outperform larger ones trained without deletion.

03 Theoretical analysis shows deletion helps when the distribution mismatch and signal-to-noise ratio (SNR) are sufficiently low.

Abstract

Deploying reinforcement learning policies in the real world requires adapting to time-varying environments. We study this problem in the contextual Markov Decision Process (cMDP) framework, where a family of environments is indexed by a low-dimensional context unknown at test time. The standard approach decomposes the problem: train a so-called "universal policy" which assumes knowledge of the true context, then pair it with a context estimator which approximates context using the observed trajectory. We identify a simple, counterintuitive trick that substantially improves the estimator: randomly delete a fraction of the training buffer after each round. This works because data is collected across multiple rounds using progressively better policies, and older trajectories come from a different distribution than what the estimator will face at deployment time; random deletion creates an…
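The deletion step described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes that deletion is uniform over the whole buffer, that each round adds the same number of trajectories, and that deletion happens after the estimator is trained in that round. The names `delete_random_fraction` and `simulate_rounds` are hypothetical, and trajectories are replaced by `(round_idx, item_idx)` placeholders so that only the buffer composition is tracked.

```python
import random
from collections import Counter


def delete_random_fraction(buffer, fraction, rng):
    """Return a copy of `buffer` with a uniformly random `fraction` of items removed."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    n_delete = int(round(fraction * len(buffer)))
    # Sampling without replacement keeps the survivors and drops the rest.
    return rng.sample(buffer, len(buffer) - n_delete)


def simulate_rounds(num_rounds, per_round, fraction, seed=0):
    """Count how many items from each round the estimator sees in the final round.

    Assumption: each round appends `per_round` placeholder items, the estimator
    is (conceptually) trained on the buffer, and then a random fraction is deleted.
    """
    rng = random.Random(seed)
    buffer = []
    composition = Counter()
    for round_idx in range(num_rounds):
        # Placeholder for trajectories collected with this round's policy.
        buffer.extend((round_idx, i) for i in range(per_round))
        # Composition of the data the estimator would train on in this round.
        composition = Counter(r for r, _ in buffer)
        # The deletion trick: drop a random fraction of the buffer after the round.
        buffer = delete_random_fraction(buffer, fraction, rng)
    return [composition[r] for r in range(num_rounds)]


if __name__ == "__main__":
    counts = simulate_rounds(num_rounds=5, per_round=1000, fraction=0.5)
    for round_idx, count in enumerate(counts):
        print(f"round {round_idx}: {count} items in final training buffer")
```

Under these assumptions, an item collected in round k has passed through T − k deletions by the time the estimator is trained in round T, so it survives with probability of roughly (1 − p)^(T − k) for deletion fraction p. The newest round is always fully present, while older rounds make up a shrinking share of the buffer.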
