Partial Counterfactual Identification for Infinite Horizon Partially Observable Markov Decision Process
Aditya Kelvianto Sidharta

TL;DR
This paper extends counterfactual bounds to infinite-horizon POMDPs by modifying Q-learning, demonstrating improved performance over existing algorithms through simulations.
Contribution
It introduces a modified Q-learning algorithm for bounding counterfactuals in infinite-horizon causal diagrams, expanding beyond finite-horizon assumptions.
Findings
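The modified Q-learning algorithm yields informative bounds on counterfactual queries in the infinite-horizon setting.
In simulations, it performs better than the existing algorithm it is compared against.
The approach removes the finite-horizon assumption made by earlier bounding methods.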
Abstract
This paper investigates the problem of bounding the possible outcomes of a counterfactual query given a set of observational data. While prior work has described efficient algorithms that provide optimal bounds for counterfactual queries, all of it assumes a finite-horizon causal diagram. This paper extends that work by modifying the Q-learning algorithm to provide informative bounds on a causal query given an infinite-horizon causal diagram. In simulations, our algorithms are shown to perform better than the existing algorithm.
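To make the idea of bounding an infinite-horizon value concrete, here is a minimal sketch, not the paper's algorithm. It assumes a small tabular model with known transition probabilities and rewards that are known only up to an interval, and it iterates two discounted Bellman updates to obtain a lower and an upper Q-table. The function name `bound_q_values`, the arrays `P`, `r_lo`, `r_hi`, and the discount `gamma` are all hypothetical names introduced for illustration.

```python
import numpy as np


def bound_q_values(P, r_lo, r_hi, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Iterate lower/upper Bellman updates until both Q-tables converge.

    P     : (S, A, S) array of transition probabilities (assumed known).
    r_lo  : (S, A) array of lower bounds on the reward (assumed given).
    r_hi  : (S, A) array of upper bounds on the reward (assumed given).
    gamma : discount factor in [0, 1), which makes the infinite sum finite.
    """
    n_states, n_actions, _ = P.shape
    q_lo = np.zeros((n_states, n_actions))
    q_hi = np.zeros((n_states, n_actions))
    for _ in range(max_iter):
        # Greedy state values under the current lower and upper Q-tables.
        v_lo = q_lo.max(axis=1)
        v_hi = q_hi.max(axis=1)
        # Bellman backup: (S, A, S) @ (S,) gives an (S, A) expectation.
        new_lo = r_lo + gamma * (P @ v_lo)
        new_hi = r_hi + gamma * (P @ v_hi)
        delta = max(np.abs(new_lo - q_lo).max(), np.abs(new_hi - q_hi).max())
        q_lo, q_hi = new_lo, new_hi
        if delta < tol:
            break
    return q_lo, q_hi


# Toy model with 2 states and 2 actions (illustrative numbers only).
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.0, 1.0]],
])
r_lo = np.array([[0.0, 0.2], [0.1, 0.5]])
r_hi = np.array([[0.3, 0.6], [0.4, 1.0]])

q_lo, q_hi = bound_q_values(P, r_lo, r_hi, gamma=0.9)
print("lower Q:\n", q_lo)
print("upper Q:\n", q_hi)
```

Because the Bellman backup is monotone in both the reward and the next-state value, the lower table never exceeds the upper table, and the gap between them measures how informative the bound is for each state-action pair. A sample-based Q-learning variant would replace the exact expectation `P @ v` with updates from observed transitions.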
Taxonomy
Topics: Machine Learning and Algorithms · Distributed Sensor Networks and Detection Algorithms · Bayesian Modeling and Causal Inference
Methods: Q-Learning
