Hindsight is Only 50/50: Unsuitability of MDP based Approximate POMDP Solvers for Multi-resolution Information Gathering

Sankalp Arora; Sanjiban Choudhury; Sebastian Scherer

arXiv:1804.02573·cs.AI·April 10, 2018·5 cites

Open Access

TL;DR

This paper demonstrates that MDP-based approximate POMDP solvers are often unsuitable for multi-resolution information gathering tasks: they give the solver no incentive to gain information, which leads to sub-optimal solutions under certain conditions.

Contribution

The paper derives conditions under which MDP-based POMDP solvers are provably sub-optimal and illustrates their limitations using the tiger problem, guiding better solver design.

Findings

01. MDP-based POMDP solvers can fail in information gathering scenarios.

02. Multi-resolution, budgeted information gathering cannot be effectively addressed by these solvers.

03. The paper provides criteria to identify when MDP-based approaches are inappropriate.

Abstract

Partially Observable Markov Decision Processes (POMDPs) offer an elegant framework to model sequential decision making in uncertain environments. Solving POMDPs online is an active area of research, and given the size of real-world problems, approximate solvers are used. Recently, a few approaches have been suggested for solving POMDPs by using MDP solvers in conjunction with imitation learning. MDP based POMDP solvers work well in some cases while failing catastrophically in others. The main failure point of such solvers is that MDP solvers have no motivation to gain information, since under their assumption either the environment is already known as well as it can be or the uncertainty will disappear after the next step. However, when solving POMDP problems, gaining information can lead to efficient solutions. In this paper we derive a set of conditions where MDP based POMDP solvers…
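As a rough illustration of why an MDP solver has no motivation to gain information, the sketch below runs value iteration on the fully observable version of the tiger problem and then forms QMDP-style belief-weighted scores. QMDP is used here only as one familiar MDP-based approximation; the paper may analyze a different solver. The rewards, discount factor, and all names are assumptions (commonly used textbook values), not taken from the paper.

```python
import numpy as np

# Assumed tiger-problem parameters (common textbook values, not from the paper).
STATES = ["tiger-left", "tiger-right"]
ACTIONS = ["listen", "open-left", "open-right"]
GAMMA = 0.95

# REWARD[s, a]: listening costs 1, the tiger's door costs 100, the other door pays 10.
REWARD = np.array([
    [-1.0, -100.0, 10.0],   # tiger-left
    [-1.0, 10.0, -100.0],   # tiger-right
])

# TRANSITION[a, s, s']: listening keeps the state; opening a door resets it uniformly.
TRANSITION = np.array([
    [[1.0, 0.0], [0.0, 1.0]],   # listen
    [[0.5, 0.5], [0.5, 0.5]],   # open-left
    [[0.5, 0.5], [0.5, 0.5]],   # open-right
])


def mdp_q_values(tol=1e-9):
    """Value iteration on the underlying MDP, where the state is fully known."""
    q = np.zeros((len(STATES), len(ACTIONS)))
    while True:
        v = q.max(axis=1)
        # (TRANSITION @ v)[a, s] is the expected next-state value; transpose to [s, a].
        new_q = REWARD + GAMMA * (TRANSITION @ v).T
        if np.abs(new_q - q).max() < tol:
            return new_q
        q = new_q


def qmdp_scores(belief, q):
    """QMDP-style score per action: the belief-weighted average of MDP Q-values."""
    return np.asarray(belief) @ q


q = mdp_q_values()
uniform_scores = qmdp_scores([0.5, 0.5], q)
for state, row in zip(STATES, q):
    print(state, dict(zip(ACTIONS, row.round(2))))
print("uniform belief", dict(zip(ACTIONS, uniform_scores.round(2))))
```

Two things are worth noticing in this sketch. In every known state, listening scores strictly below opening the safe door, so the MDP solver never wants it for its own sake. The observation model also never enters the computation, so the score assigned to listening would be identical for a perfect sensor and a useless one.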

Taxonomy

Topics: Reinforcement Learning in Robotics · Optimization and Search Problems · Machine Learning and Algorithms