A Tractable POMDP for a Class of Sequencing Problems
Paat Rusmevichientong, Benjamin van Roy

TL;DR
This paper introduces a POMDP model for a class of sequencing problems that stays tractable by replacing the high-dimensional belief space with a lower-dimensional state space, where grid-based dynamic programming is effective. It also gives an error bound for the resulting approximation and discusses an application to targeted advertising.
Contribution
It presents a tractable POMDP formulation for a class of sequencing problems, obtained through state-space reduction, that permits grid-based dynamic programming and an error analysis of the approximation.
Findings
The reduced state space enables practical dynamic programming solutions.
An error bound for the approximation is established.
An application of the model to targeted advertising is discussed.
Abstract
We consider a partially observable Markov decision problem (POMDP) that models a class of sequencing problems. Although POMDPs are typically intractable, our formulation admits a tractable solution. Instead of maintaining a value function over a high-dimensional set of belief states, we reduce the state space to one of smaller dimension, in which grid-based dynamic programming techniques are effective. We develop an error bound for the resulting approximation, and discuss an application of the model to a problem in targeted advertising.
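To make "grid-based dynamic programming" on a low-dimensional state concrete, here is a minimal Python sketch. It is not the paper's model: it assumes a toy setting with one hidden binary user type, a scalar belief b = P(type = 1), and a hypothetical table `click_prob[a, s]` of click probabilities per ad and type. The function name `grid_value_iteration` and all parameters are illustrative. The value function is stored on a uniform grid over [0, 1] and linearly interpolated at the Bayes-updated beliefs.

```python
import numpy as np

def grid_value_iteration(click_prob, n_grid=101, discount=0.9, n_iter=200):
    """Value iteration on a uniform belief grid (toy model, assumed here).

    click_prob[a, s] = P(click | ad a, hidden type s), with s in {0, 1}.
    Reward is 1 per click; the hidden type is assumed not to change.
    """
    click_prob = np.asarray(click_prob, dtype=float)
    grid = np.linspace(0.0, 1.0, n_grid)      # belief b = P(type = 1)
    values = np.zeros(n_grid)
    for _ in range(n_iter):
        q = np.empty((click_prob.shape[0], n_grid))
        for a, (p0, p1) in enumerate(click_prob):
            p_click = (1.0 - grid) * p0 + grid * p1
            p_no = 1.0 - p_click
            # Bayes posterior of type 1 after a click / after no click.
            b_click = np.divide(grid * p1, p_click,
                                out=grid.copy(), where=p_click > 0)
            b_no = np.divide(grid * (1.0 - p1), p_no,
                             out=grid.copy(), where=p_no > 0)
            # Off-grid posteriors are handled by linear interpolation.
            v_click = np.interp(b_click, grid, values)
            v_no = np.interp(b_no, grid, values)
            q[a] = p_click + discount * (p_click * v_click + p_no * v_no)
        values = q.max(axis=0)
    return grid, values, q.argmax(axis=0)

# Hypothetical numbers: ad 0 suits type 1, ad 1 is a type-independent baseline.
grid, values, policy = grid_value_iteration([[0.1, 0.6], [0.3, 0.3]])
```

The interpolation step is where the approximation error enters; a finer grid (larger `n_grid`) reduces it at a proportional cost per sweep, which is affordable only because the state is low-dimensional.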
Taxonomy
Topics: Reinforcement Learning in Robotics · Bayesian Modeling and Causal Inference · Auction Theory and Applications
