Market-Based Reinforcement Learning in Partially Observable Worlds
Ivo Kwee, Marcus Hutter, Juergen Schmidhuber

TL;DR
This paper applies market-based reinforcement learning to partially observable environments: it reimplements a recent approach and evaluates it in a toy POMDP setting.
Contribution
By the authors' account, this is the first evaluation of market-based RL in a POMDP setting; most previous work focused on reactive (MDP) settings.
Findings
Market-based RL is, in principle, applicable to POMDPs, unlike traditional RL.
Reimplementation of a recent approach demonstrates feasibility.
Initial evaluation shows promising results in toy settings.
Abstract
Unlike traditional reinforcement learning (RL), market-based RL is in principle applicable to worlds described by partially observable Markov Decision Processes (POMDPs), where an agent needs to learn short-term memories of relevant previous events in order to execute optimal actions. Most previous work, however, has focused on reactive settings (MDPs) instead of POMDPs. Here we reimplement a recent approach to market-based RL and for the first time evaluate it in a toy POMDP setting.
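To illustrate why a POMDP can require short-term memory, here is a minimal sketch of a two-step cue task. It is not the paper's environment and it does not implement market-based RL; the task, the `BLANK` observation, and every function name are assumptions made up for this illustration. The agent first observes a cue (0 or 1), then sees the same blank observation in both cases and must choose the action that matches the earlier cue.

```python
from typing import Callable, Optional, Tuple

# Hypothetical toy task (not from the paper): the observation at decision
# time is identical for both cues, so the state is only partially observable.
BLANK = 2

Policy = Callable[[int, Optional[int]], Tuple[int, Optional[int]]]


def run_episode(cue: int, policy: Policy) -> float:
    """Run one two-step episode and return the reward (1.0 or 0.0)."""
    memory: Optional[int] = None
    # Step 0: the agent observes the cue; its action here is ignored.
    _, memory = policy(cue, memory)
    # Step 1: the agent observes BLANK and must act to match the cue.
    action, memory = policy(BLANK, memory)
    return 1.0 if action == cue else 0.0


def make_reactive_policy(fixed_action: int) -> Policy:
    """Reactive policy: depends on the current observation only, no memory."""
    def policy(obs: int, memory: Optional[int]) -> Tuple[int, Optional[int]]:
        return fixed_action, None
    return policy


def memory_policy(obs: int, memory: Optional[int]) -> Tuple[int, Optional[int]]:
    """Memory policy: stores the cue, then replays it at decision time."""
    if obs != BLANK:
        return 0, obs          # remember the cue; action is irrelevant here
    return int(memory), memory  # act on the remembered cue


def average_return(policy: Policy) -> float:
    """Average reward over both equally likely cues."""
    return sum(run_episode(cue, policy) for cue in (0, 1)) / 2


if __name__ == "__main__":
    for a in (0, 1):
        print("reactive, always", a, "->", average_return(make_reactive_policy(a)))
    print("with memory ->", average_return(memory_policy))
```

In this sketch, any deterministic reactive policy must pick one fixed action at the blank observation and is therefore right for only one of the two cues, while the policy that carries one remembered value across the step is right for both.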
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Advanced Bandit Algorithms Research
