Corrected: On Confident Policy Evaluation for Factored Markov Decision Processes with Node Dropouts
Carmel Fiscko, Soummya Kar, Bruno Sinopoli

TL;DR
This paper develops a confidence-based importance sampling method for evaluating policies in factored MDPs that undergo node dropouts, enabling safe policy assessment without observing the new system.
Contribution
It introduces a robust importance sampling approach for structurally changing factored MDPs, allowing policy evaluation with high-confidence bounds before system modifications.
Findings
The method provides high-confidence estimates of a policy's value after node dropout.
The approach outperforms Monte Carlo simulation in accuracy and efficiency.
It enables safe policy evaluation without direct observations of the modified system.
Abstract
In this work we investigate an importance sampling approach for evaluating policies for a structurally time-varying factored Markov decision process (MDP), i.e. the policy's value is estimated with a high-probability confidence interval. In particular, we begin with a multi-agent MDP controlled by a known policy but with unknown transition dynamics. One agent is then removed from the system - i.e. the system experiences node dropout - forming a new MDP of the remaining agents, with a new state space, action space, and new transition dynamics. We assume that the effect of removing an agent corresponds to the marginalization of its factor in the transition dynamics. The reward function may likewise be marginalized, or it may be entirely redefined for the new system. Robust policy importance sampling is then used to evaluate candidate policies for the new system, and estimated values are…
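As a rough illustration of the kind of estimator the abstract refers to, the sketch below implements ordinary trajectory-wise importance sampling with a Hoeffding confidence interval on a tabular policy. It is not the paper's robust method and it ignores the node-dropout step entirely. The function names (`is_weighted_returns`, `hoeffding_interval`), the dictionary layout of the policies, and the bound `upper` are all assumptions made for this example. The transition dynamics never appear in the code because, within a single MDP, they cancel in the likelihood ratio between the two policies.

```python
import math


def is_weighted_returns(trajectories, behavior, target, gamma=0.95):
    """Return one importance-weighted discounted return per trajectory.

    trajectories: list of trajectories, each a list of (state, action, reward).
    behavior, target: dicts mapping state -> {action: probability}
    (a hypothetical layout chosen for this sketch).
    """
    weighted = []
    for traj in trajectories:
        rho = 1.0  # cumulative likelihood ratio target / behavior
        ret = 0.0  # discounted return of this trajectory
        for t, (s, a, r) in enumerate(traj):
            rho *= target[s][a] / behavior[s][a]
            ret += (gamma ** t) * r
        weighted.append(rho * ret)
    return weighted


def hoeffding_interval(samples, upper, delta=0.05):
    """Two-sided Hoeffding interval for the mean of samples in [0, upper].

    Holds with probability at least 1 - delta, assuming the samples are
    i.i.d. and every sample really lies in [0, upper].
    """
    n = len(samples)
    mean = sum(samples) / n
    half_width = upper * math.sqrt(math.log(2.0 / delta) / (2.0 * n))
    return mean - half_width, mean + half_width


if __name__ == "__main__":
    # Two logged trajectories from a one-state, two-action toy problem.
    behavior = {0: {0: 0.5, 1: 0.5}}
    target = {0: {0: 1.0, 1: 0.0}}
    logged = [
        [(0, 0, 1.0), (0, 0, 1.0)],
        [(0, 0, 1.0), (0, 1, 0.0)],
    ]
    samples = is_weighted_returns(logged, behavior, target, gamma=0.5)
    # Largest possible weighted return here: ratio 2 per step over 2 steps,
    # times the maximum discounted return 1 + 0.5.
    upper = (2.0 ** 2) * 1.5
    print(samples, hoeffding_interval(samples, upper, delta=0.05))
```

The Hoeffding bound is only valid when `upper` truly bounds every weighted return, and it widens quickly as the likelihood ratios grow with the horizon. That behavior is one reason more careful confidence bounds are studied for this problem.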
Taxonomy
Topics: Probability and Risk Models · Simulation Techniques and Applications
Methods: Dropout
