New Sufficient Conditions for Lower Bounding the Optimal Policy of a POMDP using Lehmann Precision

Vikram Krishnamurthy

arXiv:1806.08733·cs.SY·October 23, 2018

Open Access

TL;DR

This paper introduces two new sufficient conditions, Lehmann precision and copositive dominance, under which the optimal policy of a POMDP is lower bounded by a myopic policy, with particular relevance to controlled sensing.

Contribution

The paper proposes Lehmann precision and copositive dominance as new conditions that fix the problems with two crucial assumptions in Lovejoy (1987) and Rieder (1991) for lower bounding the optimal POMDP policy.

Findings

01

For controlled sensing POMDPs, Lehmann precision exploits both convexity and monotonicity of the value function, whereas classical Blackwell dominance exploits only convexity.

02

Numerical examples show Lehmann precision holds where Blackwell dominance does not.

03

The main result gives a myopic lower bound on the optimal policy in controlled sensing applications, including cases where Blackwell dominance fails.

Abstract

This paper provides new sufficient conditions so that the optimal policy of a partially observed Markov decision process (POMDP) can be lower bounded by a myopic policy. The two new proposed conditions, namely, Lehmann precision and copositive dominance, completely fix the problems with two crucial assumptions in the well-known papers of Lovejoy 1987 and Rieder 1991. For controlled sensing POMDPs, Lehmann precision exploits both convexity and monotonicity of the value function, whereas the classical Blackwell dominance only exploits convexity. Numerical examples are presented where Lehmann precision holds but Blackwell dominance does not hold, thereby illustrating the usefulness of the main result in controlled sensing applications.
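For readers unfamiliar with the classical condition the abstract contrasts against, the following is a minimal sketch of a Blackwell dominance check. It assumes the standard textbook definition: an observation matrix `b_fine` Blackwell dominates `b_coarse` if `b_coarse = b_fine @ r` for some row-stochastic matrix `r`. The function names, the matrices, and the tolerance are hypothetical and chosen for illustration; they are not taken from the paper. The sketch only verifies a supplied candidate `r` (finding one in general is a linear feasibility problem), and it does not test Lehmann precision.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def is_row_stochastic(m, tol=1e-9):
    """Check that every entry is nonnegative and every row sums to 1."""
    return all(all(x >= -tol for x in row) and abs(sum(row) - 1.0) <= tol
               for row in m)


def certifies_blackwell_dominance(b_fine, b_coarse, r, tol=1e-9):
    """Return True if r is row-stochastic and b_coarse equals b_fine @ r.

    Assumed definition (not from the paper): b_fine Blackwell dominates
    b_coarse when such an r exists; this checks one candidate only.
    """
    if not is_row_stochastic(r, tol):
        return False
    prod = matmul(b_fine, r)
    return all(abs(prod[i][j] - b_coarse[i][j]) <= tol
               for i in range(len(b_coarse))
               for j in range(len(b_coarse[0])))


# Hypothetical 2-state, 2-observation example.
b_fine = [[0.9, 0.1], [0.2, 0.8]]        # more informative sensor
r = [[0.8, 0.2], [0.3, 0.7]]             # garbling (row-stochastic)
b_coarse = [[0.75, 0.25], [0.4, 0.6]]    # equals b_fine @ r

print(certifies_blackwell_dominance(b_fine, b_coarse, r))
```

When no such `r` exists, Blackwell dominance fails; the abstract's numerical examples concern cases of that kind in which Lehmann precision still holds.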

Taxonomy

Topics: Distributed Sensor Networks and Detection Algorithms · Optimization and Search Problems · Advanced Bandit Algorithms Research