Solving Truly Massive Budgeted Monotonic POMDPs with Oracle-Guided Meta-Reinforcement Learning

Manav Vora; Jonas Liang; Michael N. Grussing; Melkior Ornik

arXiv:2408.07192·cs.LG·September 17, 2025

Open Access

TL;DR

This paper introduces a scalable two-step approach combining approximate budget allocation and oracle-guided meta-reinforcement learning to solve large-scale, budget-constrained monotonic POMDPs efficiently, demonstrated on a maintenance scenario.

Contribution

It presents a novel scalable method for solving massive monotonic POMDPs by integrating approximate budget allocation with oracle-guided meta-RL, enabling practical solutions for complex real-world problems.

Findings

01 The method scales well as the number of components increases.

02 The method is effective in a real-world maintenance scenario.

03 A computational complexity analysis confirms the method's scalability.

Abstract

Monotonic Partially Observable Markov Decision Processes (POMDPs), where the system state progressively decreases until a restorative action is performed, can be used to model sequential repair problems effectively. This paper considers the problem of solving budget-constrained multi-component monotonic POMDPs, where a finite budget limits the maximal number of restorative actions. For a large number of components, solving such a POMDP using current methods is computationally intractable due to the exponential growth in the state space with an increasing number of components. To address this challenge, we propose a two-step approach. Since the individual components of a budget-constrained multi-component monotonic POMDP are only connected via the shared budget, we first approximate the optimal budget allocation among these components using an approximation of each component POMDP's…
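To make the problem setting concrete, the following is a minimal sketch of a single budget-constrained monotonic component, together with the joint state count that makes the multi-component problem intractable. It is illustrative only and is not the paper's algorithm: the threshold-based repair rule, the noise model, and every name and parameter (`simulate_component`, `repair_threshold`, `decay_prob`, `obs_noise`, `joint_state_count`) are assumptions introduced here.

```python
import random


def joint_state_count(n_components, states_per_component):
    """Size of the joint state space, assuming every component has the
    same number of states (an assumption made for this illustration)."""
    return states_per_component ** n_components


def simulate_component(horizon, budget, repair_threshold,
                       max_state=10, decay_prob=0.5, obs_noise=1, seed=0):
    """Simulate one monotonic component under a hypothetical threshold policy.

    The hidden state never increases unless a restorative action is taken,
    and at most `budget` restorative actions are allowed.
    """
    rng = random.Random(seed)
    state = max_state
    repairs_left = budget
    trajectory = []
    for _ in range(horizon):
        # Partial observability: the agent sees only a noisy, clipped reading.
        reading = state + rng.randint(-obs_noise, obs_noise)
        observation = min(max_state, max(0, reading))
        if observation <= repair_threshold and repairs_left > 0:
            # Restorative action: reset the state and spend one unit of budget.
            state = max_state
            repairs_left -= 1
            repaired = True
        else:
            # No repair: the state can only stay the same or decrease.
            repaired = False
            if rng.random() < decay_prob:
                state = max(0, state - 1)
        trajectory.append((state, repaired))
    return trajectory, budget - repairs_left


if __name__ == "__main__":
    trajectory, repairs_used = simulate_component(
        horizon=50, budget=3, repair_threshold=4)
    print("repairs used:", repairs_used)
    print("joint states for 20 components:", joint_state_count(20, 11))
```

The sketch shows why the components can be treated separately once a budget is assigned to each: inside `simulate_component`, nothing depends on any other component except the `budget` argument.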

Taxonomy

Topics: Smart Parking Systems Research