POMDP inference and robust solution via deep reinforcement learning: An application to railway optimal maintenance

Giacomo Arcieri; Cyprien Hoelzl; Oliver Schwery; Daniel Straub; Konstantinos G. Papakonstantinou; Eleni Chatzi

arXiv:2307.08082·cs.LG·July 18, 2023

TL;DR

This paper introduces a framework that combines Bayesian inference with deep reinforcement learning to solve POMDPs whose model parameters are uncertain, and demonstrates it on railway maintenance planning.

Contribution

It presents a novel approach integrating Bayesian inference with deep RL for POMDPs, including a hybrid model-based/model-free method and application to railway maintenance.

Findings

01

The framework effectively infers POMDP parameters from data.

02

Deep RL solutions are robust to model uncertainty via domain randomization.

03

Hybrid approaches outperform purely model-free methods in the application.
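
The second finding relies on domain randomization. As a minimal sketch of that idea, assuming a small discrete POMDP and a list of posterior parameter draws (the class `RandomizedPOMDP`, its methods, and the dictionary keys are hypothetical and not taken from the paper's repository), an environment can redraw its model parameters at every episode reset so the agent never trains against a single point estimate. Rewards are omitted for brevity.

```python
import random


class RandomizedPOMDP:
    """Toy discrete POMDP that redraws its parameters at every reset.

    Hypothetical illustration only: each element of posterior_draws is a dict
    with "init" (initial state probabilities), "trans" (trans[a][s][s2]) and
    "emit" (emit[s][o]).
    """

    def __init__(self, posterior_draws, seed=0):
        self.posterior_draws = posterior_draws
        self.rng = random.Random(seed)
        self.params = None
        self.state = None

    def _sample(self, probs):
        # Draw an index according to the given probability vector.
        return self.rng.choices(range(len(probs)), weights=probs, k=1)[0]

    def reset(self):
        # Domain randomization: pick one posterior draw for the whole episode.
        self.params = self.rng.choice(self.posterior_draws)
        self.state = self._sample(self.params["init"])
        return self._sample(self.params["emit"][self.state])

    def step(self, action):
        # Hidden state evolves under the sampled model; only an observation is returned.
        self.state = self._sample(self.params["trans"][action][self.state])
        return self._sample(self.params["emit"][self.state])
```

A policy trained across many such episodes sees the spread of plausible models rather than one fixed model, which is the intuition behind robustness to parameter uncertainty.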

Abstract

Partially Observable Markov Decision Processes (POMDPs) can model complex sequential decision-making problems under stochastic and uncertain environments. A main reason hindering their broad adoption in real-world applications is the lack of availability of a suitable POMDP model or a simulator thereof. Available solution algorithms, such as Reinforcement Learning (RL), require the knowledge of the transition dynamics and the observation generating process, which are often unknown and non-trivial to infer. In this work, we propose a combined framework for inference and robust solution of POMDPs via deep RL. First, all transition and observation model parameters are jointly inferred via Markov Chain Monte Carlo sampling of a hidden Markov model, which is conditioned on actions, in order to recover full posterior distributions from the available data. The POMDP with uncertain parameters…
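
The inference step described in the abstract targets a hidden Markov model whose transitions are conditioned on actions. As a rough sketch of the likelihood an MCMC sampler would evaluate for one candidate parameter set, the forward algorithm below assumes discrete states and observations and emissions that depend only on the hidden state. Both are simplifying assumptions not confirmed by the abstract, and the function name and argument layout are hypothetical.

```python
import math


def forward_loglik(actions, observations, init, trans, emit):
    """Log-likelihood of observations under an action-conditioned HMM.

    init[s]         : P(s_0 = s)
    trans[a][s][s2] : P(s_{t+1} = s2 | s_t = s, a_t = a)
    emit[s][o]      : P(o_t = o | s_t = s)   (assumed action-independent)
    actions[t] is the action taken between observations[t] and observations[t + 1].
    """
    n_states = len(init)
    # Filtered belief after the first observation, normalized for stability.
    alpha = [init[s] * emit[s][observations[0]] for s in range(n_states)]
    norm = sum(alpha)
    loglik = math.log(norm)
    alpha = [x / norm for x in alpha]
    for t in range(1, len(observations)):
        act, obs = actions[t - 1], observations[t]
        # Predict with the transition matrix selected by the action, then
        # weight by the probability of the new observation.
        alpha = [
            emit[s2][obs] * sum(alpha[s] * trans[act][s][s2] for s in range(n_states))
            for s2 in range(n_states)
        ]
        norm = sum(alpha)
        loglik += math.log(norm)
        alpha = [x / norm for x in alpha]
    return loglik


# Toy example: two states (0 = good, 1 = degraded), two actions (0 = wait, 1 = repair).
init = [1.0, 0.0]
trans = [[[0.9, 0.1], [0.0, 1.0]], [[1.0, 0.0], [1.0, 0.0]]]
emit = [[0.8, 0.2], [0.3, 0.7]]
print(forward_loglik([0], [0, 1], init, trans, emit))  # log(0.8 * 0.25) = log(0.2)
```

A sampler would add log-priors over `init`, `trans`, and `emit` to this value to obtain an unnormalized log-posterior.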

Code & Models

Repositories

giarcieri/robust-optimal-maintenance-planning-through-reinforcement-learning-and-rllib
Official
