Solving Truly Massive Budgeted Monotonic POMDPs with Oracle-Guided Meta-Reinforcement Learning
Manav Vora, Jonas Liang, Michael N. Grussing, Melkior Ornik

TL;DR
This paper introduces a scalable two-step approach combining approximate budget allocation and oracle-guided meta-reinforcement learning to solve large-scale, budget-constrained monotonic POMDPs efficiently, demonstrated on a maintenance scenario.
Contribution
The paper presents a scalable method for solving massive budget-constrained monotonic POMDPs. It integrates approximate budget allocation with oracle-guided meta-RL, which makes solutions practical for complex real-world problems.
Findings
The method scales well as the number of components increases.
It is effective in a real-world maintenance scenario.
A computational complexity analysis supports the scalability claim.
Abstract
Monotonic Partially Observable Markov Decision Processes (POMDPs), where the system state progressively decreases until a restorative action is performed, can be used to model sequential repair problems effectively. This paper considers the problem of solving budget-constrained multi-component monotonic POMDPs, where a finite budget limits the maximal number of restorative actions. For a large number of components, solving such a POMDP using current methods is computationally intractable due to the exponential growth in the state space with an increasing number of components. To address this challenge, we propose a two-step approach. Since the individual components of a budget-constrained multi-component monotonic POMDP are only connected via the shared budget, we first approximate the optimal budget allocation among these components using an approximation of each component POMDP's…
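To make the problem setting concrete, the following is a minimal Python sketch of a single component of a budget-constrained monotonic POMDP: the hidden state only decreases until a restorative action resets it, observations are noisy, and the budget caps the number of restorative actions. The function name, all parameters (`decay_prob`, `obs_noise`, `repair_threshold`), and the simple threshold policy are hypothetical choices for illustration only. They are not taken from the paper, and the threshold rule is a stand-in, not the oracle-guided meta-RL policy.

```python
import random


def simulate_component(horizon, budget, repair_threshold,
                       max_state=10, decay_prob=0.5, obs_noise=0.2, seed=0):
    """Simulate one component of a monotonic POMDP (illustrative assumptions).

    Returns a list of (state, observation, repaired) tuples, one per time step.
    """
    rng = random.Random(seed)
    state = max_state          # hidden health level, starts fully healthy
    repairs_left = budget      # budget = maximal number of restorative actions
    trajectory = []

    for _ in range(horizon):
        # Noisy observation: off by one level with probability obs_noise.
        obs = state
        if rng.random() < obs_noise:
            obs = min(max_state, max(0, state + rng.choice([-1, 1])))

        # Hypothetical threshold policy: repair when the observed level is
        # low and budget remains. This is NOT the paper's learned policy.
        repaired = repairs_left > 0 and obs <= repair_threshold
        if repaired:
            state = max_state  # restorative action resets the state
            repairs_left -= 1
        elif rng.random() < decay_prob:
            state = max(0, state - 1)  # monotonic decay without repair

        trajectory.append((state, obs, repaired))

    return trajectory


if __name__ == "__main__":
    traj = simulate_component(horizon=50, budget=3, repair_threshold=4)
    print("repairs used:", sum(r for _, _, r in traj))
    print("final state:", traj[-1][0])
```

In a multi-component setting, each component would run a process like this with its own share of the total budget, which is why the components interact only through how that budget is split.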
Taxonomy
Topics: Smart Parking Systems Research
