Point-Based Value Iteration for POMDPs with Neural Perception Mechanisms

Rui Yan; Gabriel Santos; Gethin Norman; David Parker; Marta Kwiatkowska

arXiv:2306.17639 · eess.SY · August 8, 2024

Open Access

TL;DR

This paper introduces a novel approach for solving continuous-state POMDPs with neural perception, using a piecewise linear convex representation and two value iteration algorithms, enabling formal policy synthesis in neural-symbolic systems.

Contribution

The paper proposes a new piecewise linear and convex representation (P-PWLC) for NS-POMDPs, extends Bellman backups to this representation, and presents two value iteration algorithms, one of them approximate, for continuous-state models with neural perception.
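To make the notion of a Bellman backup over a piecewise linear convex value function concrete, here is a minimal sketch of the classical point-based backup for a finite-state POMDP. It is not the paper's continuous-state P-PWLC algorithm; it only illustrates the textbook operation that the paper generalises. The function name `point_based_backup` and the array layouts `T`, `O`, `R` are assumptions made for this illustration.

```python
import numpy as np

def point_based_backup(b, alphas, T, O, R, gamma):
    """One point-based Bellman backup at belief b (finite POMDP, illustrative).

    T[a, s, s2] : transition probability P(s2 | s, a)
    O[a, s2, o] : observation probability P(o | s2, a)
    R[a, s]     : immediate reward for action a in state s
    alphas      : list of alpha-vectors (length-|S| arrays) whose pointwise
                  maximum represents the current value function
    Returns the alpha-vector that is optimal at b after one backup.
    """
    n_actions = T.shape[0]
    n_obs = O.shape[2]
    best_value, best_alpha = -np.inf, None
    for a in range(n_actions):
        g_a = R[a].astype(float)  # astype returns a fresh array
        for o in range(n_obs):
            # Project every alpha-vector back through action a and observation o:
            # g[s] = sum_{s2} T[a, s, s2] * O[a, s2, o] * alpha[s2]
            candidates = [T[a] @ (O[a, :, o] * alpha) for alpha in alphas]
            # Keep the projection that is best at this particular belief point
            g_a = g_a + gamma * max(candidates, key=lambda g: float(b @ g))
        value = float(b @ g_a)
        if value > best_value:
            best_value, best_alpha = value, g_a
    return best_alpha
```

Repeating this backup over a finite set of belief points yields a set of alpha-vectors whose maximum is a lower-bound, piecewise linear and convex approximation of the optimal value function in the finite-state setting.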

Findings

01. Demonstrates the approach on case studies with trained neural perception networks.

02. Proves convexity and continuity of the value functions in the proposed model.

03. Shows the practical applicability of the algorithms in synthesizing near-optimal policies.

Abstract

The increasing trend to integrate neural networks and conventional software components in safety-critical settings calls for methodologies for their formal modelling, verification and correct-by-construction policy synthesis. We introduce neuro-symbolic partially observable Markov decision processes (NS-POMDPs), a variant of continuous-state POMDPs with discrete observations and actions, in which the agent perceives a continuous-state environment using a neural perception mechanism and makes decisions symbolically. The perception mechanism classifies inputs such as images and sensor values into symbolic percepts, which are used in decision making. We study the problem of optimising discounted cumulative rewards for NS-POMDPs. Working directly with the continuous state space, we exploit the underlying structure of the model and the neural perception mechanism to propose a…
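As a rough illustration of a perception mechanism that maps a continuous input to a symbolic percept, the sketch below uses a tiny ReLU classifier over a one-dimensional state. The weights, the percept names, and the function `perceive` are all hypothetical and chosen only so the behaviour can be checked by hand; they are not taken from the paper's case studies.

```python
import numpy as np

# Hypothetical weights for a one-hidden-layer ReLU classifier (illustration only)
W1 = np.array([[1.0], [-1.0]])            # hidden layer: h = relu(W1 x + b1)
b1 = np.array([0.0, 0.0])
W2 = np.array([[1.0, 0.0], [0.0, 1.0]])   # output layer: logits = W2 h + b2
b2 = np.array([0.0, 0.0])
PERCEPTS = ["right", "left"]              # hypothetical symbolic percepts

def perceive(x):
    """Classify a continuous state x into a symbolic percept."""
    h = np.maximum(W1 @ np.atleast_1d(x) + b1, 0.0)
    logits = W2 @ h + b2
    # argmax picks the first index on ties, so x = 0 maps to "right"
    return PERCEPTS[int(np.argmax(logits))]

print(perceive(2.0))    # state to the right of the origin
print(perceive(-3.0))   # state to the left of the origin
```

With these weights every non-negative state is perceived as "right" and every negative state as "left", so each percept corresponds to an interval of the continuous state space. A decision-making component then only needs the percept, not the raw state.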

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Bayesian Modeling and Causal Inference