Optimality Inductive Biases and Agnostic Guidelines for Offline Reinforcement Learning
Lionel Blondé, Alexandros Kalousis, Stéphane Marchand-Maillet

TL;DR
This paper investigates how different levels of inductive bias affect offline reinforcement learning performance across various dataset qualities, proposing a modular framework to optimize bias injection and improve robustness.
Contribution
It introduces a modular importance-weighted regression framework that adapts inductive biases based on dataset quality, enhancing offline RL performance across diverse data scenarios.
Findings
Careless optimality biases hinder performance on sub-optimal data.
Balanced bias injection improves results across dataset qualities.
A modular framework enables flexible bias design for offline RL.
Abstract
The performance of state-of-the-art offline RL methods varies widely over the spectrum of dataset qualities, ranging from far-from-optimal random data to close-to-optimal expert demonstrations. We re-implement these methods to test their reproducibility, and show that when a given method outperforms the others on one end of the spectrum, it never does on the other end. This prevents us from naming a victor across the board. We attribute the asymmetry to the amount of inductive bias injected into the agent to entice it to posit that the behavior underlying the offline dataset is optimal for the task. Our investigations confirm that careless injections of such optimality inductive biases make dominant agents subpar as soon as the offline policy is sub-optimal. To bridge this gap, we generalize importance-weighted regression methods that have proved the most versatile across the spectrum…
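As a rough illustration of the importance-weighted regression family mentioned in the abstract, the sketch below uses one common instance: policy log-likelihoods weighted by clipped, exponentiated advantages. This is an assumed, simplified form and not necessarily the paper's formulation; the function names, the `temperature` and `max_weight` parameters, and the toy numbers are all hypothetical.

```python
import math


def importance_weights(advantages, temperature=1.0, max_weight=20.0):
    """Exponentiated-advantage weights, clipped for stability (assumed form)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    return [min(math.exp(a / temperature), max_weight) for a in advantages]


def weighted_regression_loss(log_probs, advantages, temperature=1.0, max_weight=20.0):
    """Batch mean of -w_i * log pi(a_i | s_i), with w_i from importance_weights."""
    weights = importance_weights(advantages, temperature, max_weight)
    return -sum(w * lp for w, lp in zip(weights, log_probs)) / len(log_probs)


# Toy batch: log-probabilities the policy assigns to the dataset actions.
log_probs = [-0.5, -1.0, -2.0]

# Zero advantages give uniform weights of 1, i.e. plain behavioral cloning,
# which implicitly treats every dataset action as worth imitating.
bc_loss = weighted_regression_loss(log_probs, [0.0, 0.0, 0.0])

# Non-zero advantages up-weight actions estimated to be better than average
# and down-weight the rest, softening that optimality assumption.
weighted_loss = weighted_regression_loss(log_probs, [1.0, 0.0, -1.0])
```

In this sketch, the choice of weights is the knob that sets how strongly the agent assumes the dataset behavior is optimal: uniform weights imitate everything, while sharper weights (smaller `temperature`) imitate only the actions with high estimated advantage.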
Taxonomy
Topics: Reinforcement Learning in Robotics · Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI)
