Comparing Model-free and Model-based Algorithms for Offline Reinforcement Learning
Phillip Swazinna, Steffen Udluft, Daniel Hein, Thomas Runkler

TL;DR
This paper compares model-free, model-based, and hybrid offline reinforcement learning algorithms on industrial benchmark (IB) datasets, finding that simpler methods outperform hybrids in noisy, partially observable settings that are closer to real-world problems.
Contribution
It provides a comprehensive comparison of different offline RL approaches on realistic industrial datasets, highlighting the challenges faced by hybrid methods.
Findings
Hybrid approaches face severe difficulties on the noisy, partially observable IB datasets
Simpler algorithms, such as rollout-based methods, perform best on these datasets
Model-free algorithms with simpler regularizers are also among the best performers
Abstract
Offline reinforcement learning (RL) algorithms are often designed with environments such as MuJoCo in mind, in which the planning horizon is extremely long and no noise exists. We compare model-free, model-based, as well as hybrid offline RL approaches on various industrial benchmark (IB) datasets to test the algorithms in settings closer to real-world problems, including complex noise and partially observable states. We find that on the IB, hybrid approaches face severe difficulties and that simpler algorithms, such as rollout-based algorithms or model-free algorithms with simpler regularizers, perform best on the datasets.
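For readers unfamiliar with the term, the idea behind a rollout-based approach can be sketched as follows: a dynamics model is fitted to the offline dataset, and a candidate policy is then scored by simulating it through that model for a fixed horizon. The sketch below is only an illustration under stated assumptions, not the authors' implementation. The names rollout_return, toy_model, and toy_reward are hypothetical, and the toy dynamics stand in for a model that would normally be learned from data.

```python
from typing import Callable

State = float
Action = float


def rollout_return(
    model: Callable[[State, Action], State],
    reward: Callable[[State, Action], float],
    policy: Callable[[State], Action],
    start_state: State,
    horizon: int,
    gamma: float = 0.99,
) -> float:
    """Discounted return of `policy` over a finite rollout through `model`."""
    state, total, discount = start_state, 0.0, 1.0
    for _ in range(horizon):
        action = policy(state)
        total += discount * reward(state, action)
        # The (assumed, pre-trained) model predicts the next state.
        state = model(state, action)
        discount *= gamma
    return total


# Hypothetical stand-ins for a learned dynamics model and a known reward.
def toy_model(state: State, action: Action) -> State:
    return 0.5 * state + action


def toy_reward(state: State, action: Action) -> float:
    return -(state * state)  # the state should stay close to zero


# Two candidate policies, compared purely by simulated rollouts.
damping = rollout_return(toy_model, toy_reward, lambda s: -0.5 * s, 1.0, 3, gamma=1.0)
passive = rollout_return(toy_model, toy_reward, lambda s: 0.0, 1.0, 3, gamma=1.0)
print(damping > passive)
```

In this toy setup the damping policy drives the simulated state to zero after one step, so it receives the higher rollout return. With a learned model, such estimates are only as reliable as the model is on the states visited during the rollout.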
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Advanced Multi-Objective Optimization Algorithms
