COOL-MC: Verifying and Explaining RL Policies for Multi-bridge Network Maintenance
Dennis Gross

TL;DR
This paper introduces COOL-MC, a tool that verifies and explains reinforcement learning policies for multi-bridge network maintenance, providing formal safety guarantees and interpretability for infrastructure management.
Contribution
We extend a single-bridge MDP to a multi-bridge network, applying probabilistic model checking and explainability to verify safety and interpret policies.
Findings
The trained policy has a 3.5% probability of violating the safety property
The learned policy shows a bias towards bridge 1
COOL-MC proves effective for verifying and explaining the policy
Abstract
Aging bridge networks require proactive, verifiable, and interpretable maintenance strategies, yet reinforcement learning (RL) policies trained solely on reward signals provide no formal safety guarantees and remain opaque to infrastructure managers. We demonstrate COOL-MC as a tool for verifying and explaining RL policies for multi-bridge network maintenance, building on a single-bridge Markov decision process (MDP) from the literature and extending it to a parallel network of three heterogeneous bridges with a shared periodic budget constraint, encoded in the PRISM modeling language. We train an RL agent on this MDP and apply probabilistic model checking and explainability methods to the induced discrete-time Markov chain (DTMC) that arises from the interaction between the learned policy and the underlying MDP. Probabilistic model checking reveals that the trained policy has a…
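To make the idea of an induced DTMC concrete, the following is a minimal sketch in plain Python, not COOL-MC's API and not the paper's PRISM model. It fixes a deterministic policy on a toy single-bridge MDP, which leaves a Markov chain with no remaining choices, and then computes the probability of reaching a failure state within a bounded number of steps. The state names, actions, transition probabilities, and the functions `induce_dtmc` and `bounded_reach` are all hypothetical and chosen only for illustration.

```python
# Toy MDP (hypothetical numbers): state -> action -> {next_state: probability}
mdp = {
    "good": {
        "none": {"good": 0.9, "poor": 0.1},
        "repair": {"good": 1.0},
    },
    "poor": {
        "none": {"poor": 0.8, "failed": 0.2},
        "repair": {"good": 0.9, "poor": 0.05, "failed": 0.05},
    },
    "failed": {
        "none": {"failed": 1.0},
        "repair": {"failed": 1.0},
    },
}

# A deterministic policy picks exactly one action per state
# (standing in for a trained RL agent).
policy = {"good": "none", "poor": "repair", "failed": "none"}


def induce_dtmc(mdp, policy):
    """Keep only the transitions of the action the policy selects in each state."""
    return {state: dict(mdp[state][policy[state]]) for state in mdp}


def bounded_reach(dtmc, targets, steps):
    """Probability, per start state, of reaching any target state within `steps` steps."""
    # After 0 steps, only target states have already reached the target.
    prob = {s: 1.0 if s in targets else 0.0 for s in dtmc}
    for _ in range(steps):
        # One backward step: weight successor probabilities by transition probabilities.
        prob = {
            s: 1.0 if s in targets
            else sum(p * prob[nxt] for nxt, p in dtmc[s].items())
            for s in dtmc
        }
    return prob


dtmc = induce_dtmc(mdp, policy)
violation = bounded_reach(dtmc, {"failed"}, steps=2)
print(f"P(reach 'failed' within 2 steps from 'good') = {violation['good']:.4f}")
```

In this toy chain the only path from "good" to "failed" within two steps goes through "poor", so the result is 0.1 × 0.05 = 0.005. This value belongs to the toy model only and is unrelated to the violation probability reported for the paper's three-bridge network. A probabilistic model checker answers the same kind of query on much larger state spaces and for richer temporal properties.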
Taxonomy
Topics: Infrastructure Maintenance and Monitoring · Occupational Health and Safety Research · Concrete Corrosion and Durability
