Robustness Analysis of POMDP Policies to Observation Perturbations
Benjamin Kraske, Qi Heng Ho, Federico Rossi, Morteza Lahijanian, Zachary Sunberg

TL;DR
This paper introduces a method to compute the maximum deviation in the observation model that a POMDP policy can tolerate while its value stays above a specified threshold, with algorithms validated on large-scale problems.
Contribution
It formulates the Policy Observation Robustness Problem as a bi-level optimization and proposes Robust Interval Search algorithms with proven guarantees.
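One way to write the bi-level structure is sketched below. The notation is an assumption made for illustration and is not taken from the paper: $\delta$ is the size of the deviation, $\mathcal{Z}_\delta$ is the set of observation models within $\delta$ of the nominal model, $V^{\pi}_{\tilde{Z}}(b_0)$ is the value of the fixed policy $\pi$ from the initial belief $b_0$ under observation model $\tilde{Z}$, and $\tau$ is the value threshold.

```latex
\begin{aligned}
\delta^{*} \;=\; \max_{\delta \ge 0} \;\; & \delta \\
\text{subject to} \;\; & \min_{\tilde{Z} \in \mathcal{Z}_{\delta}} V^{\pi}_{\tilde{Z}}(b_0) \;\ge\; \tau
\end{aligned}
```

The outer problem searches over the deviation size, while the inner problem finds the worst-case value of the policy for a given deviation size.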
Findings
Efficient algorithms for robustness analysis in large POMDPs.
Polynomial time complexity for non-sticky observation deviations.
Validated scalability on problems with tens of thousands of states.
Abstract
Policies for Partially Observable Markov Decision Processes (POMDPs) are often designed using a nominal system model. In practice, this model can deviate from the true system during deployment due to factors such as calibration drift or sensor degradation, leading to unexpected performance degradation. This work studies policy robustness against deviations in the POMDP observation model. We introduce the Policy Observation Robustness Problem: to determine the maximum tolerable deviation in a POMDP's observation model that guarantees the policy's value remains above a specified threshold. We analyze two variants: the sticky variant, where deviations are dependent on state and actions, and the non-sticky variant, where they can be history-dependent. We show that the Policy Observation Robustness Problem can be formulated as a bi-level optimization problem in which the inner optimization…
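The search for a maximum tolerable deviation can be illustrated with a small sketch. This is plain bisection on a toy problem, not the paper's Robust Interval Search. It assumes the worst-case value is non-increasing in the deviation size, which holds by construction in the toy but is an assumption here because the abstract above is truncated. The names `worst_case_value` and `max_tolerable_deviation`, the two-state toy model, and all numbers are hypothetical.

```python
def worst_case_value(delta, accuracy=0.9):
    """Toy inner problem (hypothetical): two equally likely states, a sensor that
    reports the true state with probability `accuracy`, and a policy that acts on
    the observation and earns reward 1 when the action matches the state.
    An adversary may lower the sensor accuracy by at most `delta`."""
    # The worst case is the largest allowed drop, clipped at probability 0.
    return max(accuracy - delta, 0.0)


def max_tolerable_deviation(inner, threshold, delta_max=1.0, tol=1e-6):
    """Bisection for the largest delta in [0, delta_max] with inner(delta) >= threshold.
    Assumes `inner` is non-increasing in delta."""
    if inner(0.0) < threshold:
        return None  # the nominal policy already misses the threshold
    if inner(delta_max) >= threshold:
        return delta_max  # every deviation in the range is tolerated
    lo, hi = 0.0, delta_max  # invariant: inner(lo) >= threshold > inner(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if inner(mid) >= threshold:
            lo = mid
        else:
            hi = mid
    return lo  # lower end of the bracket, so the threshold still holds here


if __name__ == "__main__":
    print(max_tolerable_deviation(worst_case_value, threshold=0.75))
```

Returning the lower end of the bracket keeps the answer conservative: the value at the returned deviation was checked against the threshold, at the cost of being up to `tol` below the true maximum.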
