Finite Memory Belief Approximation for Optimal Control in Partially Observable Markov Decision Processes

Mintae Kim

arXiv:2601.03132 · eess.SY · January 7, 2026

TL;DR

This paper develops a metric-based theory for finite memory belief approximation in partially observable stochastic control, quantifying how information loss affects control performance and validating the theory with LQG systems.

Contribution

It introduces a Wasserstein metric-based framework to analyze the impact of finite memory belief approximations on control performance in POMDPs, with explicit bounds and empirical validation.

Findings

01. Belief mismatch decays exponentially with memory length.

02. Performance degradation scales with belief mismatch.

03. The framework provides a metric-aware characterization of finite memory effects.
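
The first finding can be illustrated numerically. The sketch below is not the paper's construction: it assumes a hypothetical scalar linear-Gaussian system with no control input, with illustrative parameters `A`, `C`, `Q`, `R` that do not come from the source. The full-history belief is a Kalman filter run over all outputs. The finite memory belief is the same filter restarted from the stationary prior and fed only the last `memory` outputs. Both beliefs are Gaussian, so the 2-Wasserstein distance between them has a closed form in one dimension. All function names are made up for this example.

```python
import math
import random

# Hypothetical scalar linear-Gaussian model (illustrative values, not from the paper):
#   x[t+1] = A * x[t] + w[t],  w ~ N(0, Q)
#   y[t]   = C * x[t] + v[t],  v ~ N(0, R)
A, C, Q, R = 0.9, 1.0, 1.0, 1.0
P_STATIONARY = Q / (1.0 - A * A)  # stationary state variance, requires |A| < 1


def simulate_outputs(steps, seed=0):
    """Simulate `steps` outputs of the model, starting from the stationary law."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, math.sqrt(P_STATIONARY))
    ys = []
    for _ in range(steps):
        ys.append(C * x + rng.gauss(0.0, math.sqrt(R)))
        x = A * x + rng.gauss(0.0, math.sqrt(Q))
    return ys


def gaussian_belief(ys):
    """Kalman filter over `ys`, started from the stationary prior.

    Returns (mean, variance) of the state at the time of the last output.
    """
    mean, var = 0.0, P_STATIONARY
    posterior = (mean, var)
    for y in ys:
        # Measurement update with the current output.
        gain = var * C / (C * C * var + R)
        post_mean = mean + gain * (y - C * mean)
        post_var = (1.0 - gain * C) * var
        posterior = (post_mean, post_var)
        # Time update: predict the state at the next step.
        mean, var = A * post_mean, A * A * post_var + Q
    return posterior


def w2_gaussian(belief_a, belief_b):
    """Closed-form 2-Wasserstein distance between two 1-D Gaussians."""
    (m1, v1), (m2, v2) = belief_a, belief_b
    return math.sqrt((m1 - m2) ** 2 + (math.sqrt(v1) - math.sqrt(v2)) ** 2)


def belief_mismatch(ys, memory):
    """W2 distance between the full-history belief and the belief built
    from only the last `memory` outputs (memory must be >= 1)."""
    return w2_gaussian(gaussian_belief(ys), gaussian_belief(ys[-memory:]))


ys = simulate_outputs(200)
for memory in (1, 2, 4, 8, 16, 32):
    print(memory, belief_mismatch(ys, memory))
```

Under these assumptions the printed mismatch should shrink rapidly as `memory` grows, because the truncated filter forgets its restart prior at the rate set by the closed-loop filter dynamics. The stationary restart prior is a design choice for this sketch; a different prior changes the constants but not the qualitative decay.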

Abstract

We study finite memory belief approximation for partially observable (PO) stochastic optimal control (SOC) problems. While belief states are sufficient for SOC in partially observable Markov decision processes (POMDPs), they are generally infinite-dimensional and impractical. We interpret truncated input-output (IO) histories as inducing a belief approximation and develop a metric-based theory that directly relates information loss to control performance. Using the Wasserstein metric, we derive policy-conditional performance bounds that quantify value degradation induced by finite memory along typical closed-loop trajectories. Our analysis proceeds via a fixed-policy comparison: we evaluate two cost functionals under the same closed-loop execution and isolate the effect of replacing the true belief by its finite memory approximation inside the belief-level cost. For linear quadratic…

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Control Systems Optimization · Advanced Bandit Algorithms Research