ChronosPerseus: Randomized Point-based Value Iteration with Importance Sampling for POSMDPs
Richard Kohar, François Rivest, Alain Gosselin

TL;DR
ChronosPerseus introduces a novel randomized value iteration algorithm for POSMDPs that incorporates importance sampling and continuous sojourn times, enabling efficient decision-making in complex, temporally uncertain environments.
Contribution
It extends the Perseus algorithm to POSMDPs by integrating continuous sojourn times and importance sampling, reducing complexity and handling temporal state information effectively.
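As a rough illustration of how importance sampling can interact with continuous sojourn times, the sketch below estimates an expected discount factor E[exp(-beta * tau)] over a random sojourn time tau. It is not taken from the paper: the exponential discounting form, the Gamma sojourn-time density, the exponential proposal density, and every function and parameter name are assumptions chosen only to make the example self-contained.

```python
import math
import random


def gamma_pdf(t, shape, scale):
    """Density of the (assumed) Gamma sojourn-time distribution."""
    return t ** (shape - 1) * math.exp(-t / scale) / (math.gamma(shape) * scale ** shape)


def exp_pdf(t, rate):
    """Density of the (assumed) exponential proposal distribution."""
    return rate * math.exp(-rate * t)


def expected_discount_is(beta, shape, scale, proposal_rate, n_samples, rng):
    """Importance-sampling estimate of E[exp(-beta * tau)], tau ~ Gamma(shape, scale).

    Samples are drawn from the proposal and reweighted by target / proposal.
    """
    total = 0.0
    for _ in range(n_samples):
        tau = rng.expovariate(proposal_rate)  # draw from the proposal
        weight = gamma_pdf(tau, shape, scale) / exp_pdf(tau, proposal_rate)
        total += weight * math.exp(-beta * tau)
    return total / n_samples


def expected_discount_exact(beta, shape, scale):
    """Closed form for the Gamma case, used only to sanity-check the estimate."""
    return (1.0 + beta * scale) ** (-shape)


if __name__ == "__main__":
    rng = random.Random(0)
    estimate = expected_discount_is(0.1, 2.0, 1.5, 0.5, 20000, rng)
    print(estimate, expected_discount_exact(0.1, 2.0, 1.5))
```

The proposal here has a heavier tail than the target, which keeps the importance weights bounded; a proposal with a lighter tail than the target could give an estimate with very high variance.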
Findings
Successfully applied to an episodic bus problem
Effectively handled a non-episodic maintenance problem
Reduced computational complexity through importance sampling
Abstract
In reinforcement learning, agents have successfully used environments modeled with Markov decision processes (MDPs). However, in many problem domains, an agent may suffer from noisy observations or random times until its subsequent decision. While partially observable Markov decision processes (POMDPs) have dealt with noisy observations, they have yet to deal with the unknown time aspect. Of course, one could discretize time, but this leads to Bellman's Curse of Dimensionality. To incorporate continuous sojourn-time distributions in the agent's decision making, we propose that partially observable semi-Markov decision processes (POSMDPs) can be helpful in this regard. We extend the randomized point-based value iteration (PBVI) Perseus algorithm of Spaan and Vlassis (2005), developed for POMDPs, to POSMDPs by incorporating continuous sojourn-time distributions and using importance sampling to…
Taxonomy
Topics: Reinforcement Learning in Robotics · Bayesian Modeling and Causal Inference
