Instabilities of Offline RL with Pre-Trained Neural Representation
Ruosong Wang, Yifan Wu, Ruslan Salakhutdinov, Sham M. Kakade

TL;DR
This paper empirically investigates how stable offline reinforcement learning is when it uses pre-trained neural representations. It finds significant error amplification unless the data distribution closely matches that of the target policy, which suggests that offline RL needs representational conditions stronger than those sufficient for supervised learning.
Contribution
The paper provides an empirical analysis of the stability of offline RL with pre-trained features, showing that reliable performance requires the distribution shift between the offline data and the target policy to be mild.
Findings
Offline RL exhibits substantial error amplification with pre-trained representations.
Stability in offline RL is only achieved under extremely mild distribution shifts.
Successful offline RL requires representational conditions stronger than those needed for supervised learning.
Abstract
In offline reinforcement learning (RL), we seek to utilize offline data to evaluate (or learn) policies in scenarios where the data are collected from a distribution that substantially differs from that of the target policy to be evaluated. Recent theoretical advances have shown that such sample-efficient offline RL is indeed possible provided certain strong representational conditions hold, else there are lower bounds exhibiting exponential error amplification (in the problem horizon) unless the data collection distribution has only a mild distribution shift relative to the target policy. This work studies these issues from an empirical perspective to gauge how stable offline RL methods are. In particular, our methodology explores these ideas when using features from pre-trained neural networks, in the hope that these representations are powerful enough to permit sample efficient…
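The exponential error amplification mentioned in the abstract can be illustrated with a toy example. The sketch below is not the paper's experimental setup. It is a hypothetical two-state chain in the spirit of classic fitted-value-iteration counterexamples, with one-dimensional linear features. The chain, the feature values, and the function names `fqe_step` and `run_fqe` are all assumptions made for illustration. Every reward is zero, so the true value function is zero and the parameter `theta` is pure estimation error.

```python
# Toy illustration (an assumption, not taken from the paper):
# a two-state chain s1 -> s2 -> s2 with zero rewards and
# one-dimensional linear features phi(s1) = 1, phi(s2) = 2.
# The value estimate is V_theta(s) = theta * phi(s); the true theta is 0.

PHI_S1 = 1.0
PHI_S2 = 2.0


def fqe_step(theta, gamma, d1, d2):
    """One weighted least-squares fitted evaluation step.

    d1 and d2 are the weights the offline data places on s1 and s2.
    Both states transition to s2, so both share the same regression target.
    """
    # Bootstrapped target: reward (0) + gamma * V_theta(s2).
    target = gamma * PHI_S2 * theta
    # Closed-form minimizer of
    #   d1 * (PHI_S1 * w - target)^2 + d2 * (PHI_S2 * w - target)^2.
    numerator = d1 * PHI_S1 * target + d2 * PHI_S2 * target
    denominator = d1 * PHI_S1 ** 2 + d2 * PHI_S2 ** 2
    return numerator / denominator


def run_fqe(theta0, gamma, d1, d2, horizon):
    """Apply `horizon` fitted evaluation steps and return every iterate."""
    history = [theta0]
    for _ in range(horizon):
        history.append(fqe_step(history[-1], gamma, d1, d2))
    return history


if __name__ == "__main__":
    # Data only from s1: a large shift away from where the chain spends its time.
    shifted = run_fqe(theta0=0.01, gamma=0.9, d1=1.0, d2=0.0, horizon=20)
    # Data only from s2: matches the long-run state distribution of the chain.
    matched = run_fqe(theta0=0.01, gamma=0.9, d1=0.0, d2=1.0, horizon=20)
    print("final error, shifted data:", abs(shifted[-1]))
    print("final error, matched data:", abs(matched[-1]))
```

In this toy setting each step multiplies the error by 2 * gamma when the data covers only s1, so for gamma above 0.5 a small initial error grows geometrically with the number of steps. When the data covers s2, each step multiplies the error by gamma and the error shrinks.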
Taxonomy
Topics: Formal Methods in Verification · Software Reliability and Analysis Research · Software Testing and Debugging Techniques
