A Scalable Method for Solving High-Dimensional Continuous POMDPs Using Local Approximation
Tom Erez, William D. Smart

TL;DR
This paper introduces a scalable local-approximation method for solving high-dimensional continuous POMDPs. It parameterizes the belief as a Gaussian mixture and approximates belief updates with the Extended Kalman Filter (EKF), which allows it to handle larger domains.
Contribution
The paper presents a novel planning algorithm for high-dimensional continuous POMDPs that uses local optimization and belief approximation, significantly increasing scalability over existing methods.
Findings
Successfully applied to a domain with a 16-dimensional state space and a 6-dimensional action space
Achieved scalable solutions for domains an order of magnitude larger than those handled by previous methods
Demonstrated effectiveness through simulated hand-eye coordination tasks
Abstract
Partially-Observable Markov Decision Processes (POMDPs) are typically solved by finding an approximate global solution to a corresponding belief-MDP. In this paper, we offer a new planning algorithm for POMDPs with continuous state, action and observation spaces. Since such domains have an inherent notion of locality, we can find an approximate solution using local optimization methods. We parameterize the belief distribution as a Gaussian mixture, and use the Extended Kalman Filter (EKF) to approximate the belief update. Since the EKF is a first-order filter, we can marginalize over the observations analytically. By using feedback control and state estimation during policy execution, we recover a behavior that is effectively conditioned on incoming observations despite the unconditioned planning. Local optimization provides no guarantees of global optimality, but it allows us to tackle…
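The EKF belief update mentioned in the abstract can be sketched as follows. This is a minimal illustration for a single Gaussian component, not the authors' implementation: the toy dynamics `f`, the observation model `h`, their Jacobians, the noise covariances `Q` and `R`, and the function name `ekf_step` are all hypothetical choices made for the example. It also illustrates one plausible reading of the remark about marginalizing over observations: under the EKF's linearization, the updated covariance does not depend on the observed value, and an observation equal to its prediction leaves the mean at the predicted mean.

```python
import numpy as np

DT = 0.1  # time step of the toy model (assumed)

def f(x, u):
    """Toy dynamics (assumed): position/velocity driven by an acceleration u."""
    return np.array([x[0] + DT * x[1], x[1] + DT * u[0]])

def F_jac(x, u):
    """Jacobian of f with respect to the state."""
    return np.array([[1.0, DT], [0.0, 1.0]])

def h(x):
    """Toy nonlinear observation (assumed): range to a sensor at unit offset."""
    return np.array([np.sqrt(1.0 + x[0] ** 2)])

def H_jac(x):
    """Jacobian of h with respect to the state."""
    return np.array([[x[0] / np.sqrt(1.0 + x[0] ** 2), 0.0]])

Q = 0.01 * np.eye(2)   # process noise covariance (assumed)
R = np.array([[0.05]])  # observation noise covariance (assumed)

def ekf_step(mu, Sigma, u, z):
    """One EKF predict/update step for a Gaussian belief N(mu, Sigma)."""
    # Predict: push the mean through the dynamics, the covariance through
    # the linearized dynamics.
    mu_pred = f(mu, u)
    F = F_jac(mu, u)
    Sigma_pred = F @ Sigma @ F.T + Q

    # Update: linearize the observation model around the predicted mean.
    H = H_jac(mu_pred)
    S = H @ Sigma_pred @ H.T + R             # innovation covariance
    K = Sigma_pred @ H.T @ np.linalg.inv(S)  # Kalman gain
    mu_new = mu_pred + K @ (z - h(mu_pred))
    Sigma_new = (np.eye(len(mu)) - K @ H) @ Sigma_pred
    return mu_new, Sigma_new

mu0 = np.array([1.0, 0.5])
Sigma0 = 0.2 * np.eye(2)
u0 = np.array([0.3])

# Two different observations move the mean differently but yield the
# same posterior covariance.
mu_a, Sigma_a = ekf_step(mu0, Sigma0, u0, np.array([1.2]))
mu_b, Sigma_b = ekf_step(mu0, Sigma0, u0, np.array([2.0]))
```

Because the covariance update is independent of `z` in this linearized form, a planner can propagate the belief's uncertainty forward without knowing which observations will arrive; the observations then matter only at execution time, when state estimation and feedback control are applied.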
Taxonomy
Topics: Reinforcement Learning in Robotics · Formal Methods in Verification · Robotic Path Planning Algorithms
