An Investigation of Offline Reinforcement Learning in Factorisable Action Spaces
Alex Beeson, David Ireland, Giovanni Montana

TL;DR
This paper explores offline reinforcement learning in factorisable discrete action spaces, a setting that existing research has largely overlooked. It evaluates adapted offline RL techniques and provides new benchmarks and datasets for the community.
Contribution
The paper presents the case for a factorised approach to offline RL, conducts an extensive empirical evaluation, and provides publicly available datasets and code for future research.
Findings
Factorised action spaces can be effectively handled with adapted offline RL techniques.
The introduced datasets vary in quality and complexity, supporting diverse research needs.
Empirical results highlight the strengths and limitations of current methods in this setting.
Abstract
Expanding reinforcement learning (RL) to offline domains generates promising prospects, particularly in sectors where data collection poses substantial challenges or risks. Pivotal to the success of transferring RL offline is mitigating overestimation bias in value estimates for state-action pairs absent from data. Whilst numerous approaches have been proposed in recent years, these tend to focus primarily on continuous or small-scale discrete action spaces. Factorised discrete action spaces, on the other hand, have received relatively little attention, despite many real-world problems naturally having factorisable actions. In this work, we undertake a formative investigation into offline reinforcement learning in factorisable action spaces. Using value-decomposition as formulated in DecQN as a foundation, we present the case for a factorised approach and conduct an extensive empirical…
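The value-decomposition the abstract refers to can be illustrated with a minimal sketch. It assumes the DecQN-style form in which the joint value of a factorised action is the mean of per-dimension utilities, Q(s, a) = (1/N) Σ_i U_i(s, a_i). The function names, array shapes, and numbers below are hypothetical and chosen only for illustration; this is not the authors' code.

```python
import itertools
import numpy as np

def decomposed_q(utilities, action):
    """Mean of the selected per-dimension utilities (assumed DecQN-style form).

    utilities: array of shape (N, n_bins), one row of utilities U_i(s, .) per
               sub-action dimension, all evaluated at the same state s.
    action:    length-N sequence of integer sub-action indices.
    """
    n_dims = utilities.shape[0]
    return utilities[np.arange(n_dims), action].mean()

def greedy_action(utilities):
    """Because the joint value is a mean, each dimension can be maximised on its own."""
    return utilities.argmax(axis=1)

# Hypothetical utilities for one state: N = 2 sub-actions, 3 bins each.
utilities = np.array([[0.1, 0.5, 0.2],
                      [0.4, 0.0, 0.3]])

best = greedy_action(utilities)
best_q = decomposed_q(utilities, best)

# Brute-force check over all 3**2 joint actions.
brute = max(itertools.product(range(3), repeat=2),
            key=lambda a: decomposed_q(utilities, np.array(a)))
print(best, best_q, brute)
```

Under this assumption the network outputs N × n_bins utilities rather than one value per joint action, so the greedy action is found with N independent argmax operations instead of a search over n_bins^N combinations.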
Taxonomy
Topics: Reinforcement Learning in Robotics
Methods: Focus
