A Scalable Method for Solving High-Dimensional Continuous POMDPs Using Local Approximation

Tom Erez; William D. Smart

arXiv:1203.3477·cs.AI·March 19, 2012·1 cites

Open Access

TL;DR

This paper introduces a local-approximation planning method for high-dimensional continuous POMDPs. It parameterizes the belief as a Gaussian mixture and approximates belief updates with the Extended Kalman Filter (EKF), which lets it handle larger domains than previous methods.

Contribution

The paper presents a planning algorithm for high-dimensional continuous POMDPs that combines local optimization with an approximate belief representation, scaling to domains an order of magnitude larger than those handled by previous methods.

Findings

01 Applied to a domain with a 16-dimensional state space and a 6-dimensional action space

02 Scales to domains an order of magnitude larger than those handled by previous methods

03 Demonstrated on a simulated hand-eye coordination task

Abstract

Partially-Observable Markov Decision Processes (POMDPs) are typically solved by finding an approximate global solution to a corresponding belief-MDP. In this paper, we offer a new planning algorithm for POMDPs with continuous state, action and observation spaces. Since such domains have an inherent notion of locality, we can find an approximate solution using local optimization methods. We parameterize the belief distribution as a Gaussian mixture, and use the Extended Kalman Filter (EKF) to approximate the belief update. Since the EKF is a first-order filter, we can marginalize over the observations analytically. By using feedback control and state estimation during policy execution, we recover a behavior that is effectively conditioned on incoming observations despite the unconditioned planning. Local optimization provides no guarantees of global optimality, but it allows us to tackle…
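To make the EKF remark concrete, here is a minimal sketch of one EKF belief update for a single Gaussian component with a scalar state. It is not the authors' implementation: the models `f` and `h`, their derivatives `df` and `dh`, and the noise variances `q` and `r` are hypothetical toy choices made only for illustration. The sketch shows the standard EKF property that the posterior variance does not depend on the observed value `z`, and that the posterior mean equals the predicted mean when `z` matches the expected observation. This is one way to read why planning can proceed without conditioning on specific observations.

```python
import math


def ekf_step(mean, var, u, z, f, df, h, dh, q, r):
    """One scalar EKF step. Returns (mean_pred, mean_post, var_post)."""
    # Predict: push the mean through the dynamics, and the variance
    # through the first-order linearization of the dynamics.
    a = df(mean, u)
    mean_pred = f(mean, u)
    var_pred = a * var * a + q

    # Update: linearize the observation model about the predicted mean.
    c = dh(mean_pred)
    s = c * var_pred * c + r          # innovation variance
    k = var_pred * c / s              # Kalman gain
    mean_post = mean_pred + k * (z - h(mean_pred))
    var_post = (1.0 - k * c) * var_pred   # note: no dependence on z
    return mean_pred, mean_post, var_post


# Hypothetical toy models (assumptions, not taken from the paper).
def f(x, u):
    return x + 0.1 * math.sin(x) + u


def df(x, u):
    return 1.0 + 0.1 * math.cos(x)


def h(x):
    return x * x


def dh(x):
    return 2.0 * x


Q, R = 0.01, 0.1  # assumed process and observation noise variances

# Same prior and action, two different observations.
pred_a, mean_a, var_a = ekf_step(1.0, 0.5, 0.2, 1.0, f, df, h, dh, Q, R)
pred_b, mean_b, var_b = ekf_step(1.0, 0.5, 0.2, 3.0, f, df, h, dh, Q, R)

# Observation equal to the expected observation h(mean_pred).
z_expected = h(f(1.0, 0.2))
pred_c, mean_c, var_c = ekf_step(1.0, 0.5, 0.2, z_expected, f, df, h, dh, Q, R)

if __name__ == "__main__":
    print("posterior variances:", var_a, var_b)
    print("posterior means:", mean_a, mean_b, mean_c)
```

In this sketch the two runs with different observations produce the same posterior variance but different posterior means, while the run with the expected observation leaves the mean at its predicted value. A Gaussian-mixture belief would apply a step like this to each component, which is omitted here.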

Taxonomy

Topics: Reinforcement Learning in Robotics · Formal Methods in Verification · Robotic Path Planning Algorithms