Myopic Policy Bounds for Information Acquisition POMDPs

Mikko Lauri; Nikolay Atanasov; George J. Pappas; Risto Ritala

arXiv:1601.07279·cs.SY·January 28, 2016·2 cites

Open Access

TL;DR

This paper develops an efficient myopic method for computing upper and lower bounds on the optimal policy of information-gathering POMDPs, whose rewards are nonlinear in the belief state, with the aim of speeding up decision-making in robotic sensing tasks.

Contribution

Rather than proposing a new approximation algorithm, the paper shows that for POMDPs with certain structural properties, upper and lower bounds on the optimal policy (often tight) can be derived through an efficient myopic computation.

Findings

01 Policy bounds are often tight and efficient to compute.

02 Branch-and-bound significantly accelerates optimal policy computation.

03 The method outperforms traditional value iteration in a target tracking domain.

Abstract

This paper addresses the problem of optimal control of robotic sensing systems aimed at autonomous information gathering in scenarios such as environmental monitoring, search and rescue, and surveillance and reconnaissance. The information gathering problem is formulated as a partially observable Markov decision process (POMDP) with a reward function that captures uncertainty reduction. Unlike the classical POMDP formulation, the resulting reward structure is nonlinear in the belief state and the traditional approaches do not apply directly. Instead of developing a new approximation algorithm, we show that if attention is restricted to a class of problems with certain structural properties, one can derive (often tight) upper and lower bounds on the optimal policy via an efficient myopic computation. These policy bounds can be applied in conjunction with an online branch-and-bound…
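To make the "nonlinear in the belief state" point concrete, here is a minimal sketch of a belief update, an uncertainty-reduction reward, and a greedy one-step (myopic) action choice. Everything in it is an assumption made for illustration: the reward is taken to be negative Shannon entropy, the state and observation spaces are small and discrete, and the names `belief_update`, `neg_entropy_reward`, and `myopic_action` are hypothetical. It is not the paper's bound computation, which relies on structural properties not stated in this excerpt.

```python
import numpy as np

def belief_update(belief, transition, obs_likelihood):
    """Bayes filter step: predict with the transition model, then correct."""
    predicted = transition.T @ belief          # predicted[s'] = sum_s T[s, s'] * b[s]
    unnormalized = obs_likelihood * predicted  # weight by P(z | s')
    return unnormalized / unnormalized.sum()

def neg_entropy_reward(belief):
    """Assumed reward: negative Shannon entropy in nats (higher = less uncertain)."""
    p = belief[belief > 0]                     # skip zeros to avoid log(0)
    return float(np.sum(p * np.log(p)))

def myopic_action(belief, transition, obs_models):
    """Greedy choice: maximize the expected reward of the next belief only."""
    predicted = transition.T @ belief
    best_action, best_value = None, -np.inf
    for action, obs_model in enumerate(obs_models):
        # obs_model[z, s'] = P(z | s', action); hypothetical layout
        value = 0.0
        for z in range(obs_model.shape[0]):
            p_z = float(obs_model[z] @ predicted)   # probability of observing z
            if p_z > 0:
                posterior = belief_update(belief, transition, obs_model[z])
                value += p_z * neg_entropy_reward(posterior)
        if value > best_value:
            best_action, best_value = action, value
    return best_action, best_value

# Toy problem: two static states, one informative and one uninformative sensor.
transition = np.eye(2)
informative = np.array([[0.9, 0.1], [0.1, 0.9]])
uninformative = np.array([[0.5, 0.5], [0.5, 0.5]])
uniform = np.array([0.5, 0.5])
action, value = myopic_action(uniform, transition, [informative, uninformative])
```

The assumed reward is zero at a certain belief such as `[1, 0]` and negative at the uniform belief, so it is not a linear function of the belief. That is the property that keeps classical POMDP solvers, which assume rewards linear in the belief, from applying directly. In the toy problem the greedy rule selects the informative sensor.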

Taxonomy

Topics: Reinforcement Learning in Robotics · Optimization and Search Problems · Age of Information Optimization