Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding

Alizée Pace; Hugo Yèche; Bernhard Schölkopf; Gunnar Rätsch; Guy Tennenholtz

arXiv:2306.01157·cs.LG·June 5, 2023·2 cites

TL;DR

This paper addresses hidden confounding in offline reinforcement learning by defining delphic uncertainty, proposing a method to estimate it, and developing a pessimistic algorithm that improves decision-making despite unobserved confounders.

Contribution

It introduces the concept of delphic uncertainty for nonidentifiable hidden confounding and develops a practical estimation method and a robust offline RL algorithm that mitigates confounding bias.

Findings

01. Effective in reducing confounding bias in experiments

02. Improves offline RL performance on health-related benchmarks

03. Demonstrates robustness to unobserved confounders

Abstract

A prominent challenge of offline reinforcement learning (RL) is the issue of hidden confounding: unobserved variables may influence both the actions taken by the agent and the observed outcomes. Hidden confounding can compromise the validity of any causal conclusion drawn from data and presents a major obstacle to effective offline RL. In the present paper, we tackle the problem of hidden confounding in the nonidentifiable setting. We propose a definition of uncertainty due to hidden confounding bias, termed delphic uncertainty, which uses variation over world models compatible with the observations, and differentiate it from the well-known epistemic and aleatoric uncertainties. We derive a practical method for estimating the three types of uncertainties, and construct a pessimistic offline RL algorithm to account for them. Our method does not assume identifiability of the unobserved…
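The three-way split described in the abstract can be pictured with a nested variance decomposition. The sketch below is only an illustration under stated assumptions, not the authors' estimator: it assumes value estimates are available as a balanced nested list `q[w][e][s]` (world model `w` compatible with the data, ensemble member `e`, sample `s`), and the names `decompose_uncertainty`, `pessimistic_value`, and the penalty weight `beta` are hypothetical.

```python
import math
from statistics import fmean, pvariance


def decompose_uncertainty(q):
    """Split the variance of value estimates into three parts.

    q[w][e][s] is sample s from ensemble member e of world model w.
    Assumes a balanced layout (same number of members and samples everywhere),
    so that the three terms add up to the total population variance.
    """
    member_means = [[fmean(samples) for samples in world] for world in q]
    world_means = [fmean(means) for means in member_means]

    # Aleatoric: spread of samples within a single ensemble member.
    aleatoric = fmean([pvariance(s) for world in q for s in world])
    # Epistemic: disagreement between members of the same world model.
    epistemic = fmean([pvariance(means) for means in member_means])
    # Delphic (illustrative reading): disagreement between world models
    # that are all compatible with the observed data.
    delphic = pvariance(world_means)

    return {"aleatoric": aleatoric, "epistemic": epistemic, "delphic": delphic}


def pessimistic_value(q, beta=1.0):
    """Mean value minus a penalty on the across-world-model spread.

    The square-root penalty and beta are assumptions made for this sketch.
    """
    parts = decompose_uncertainty(q)
    mean_value = fmean([x for world in q for member in world for x in member])
    return mean_value - beta * math.sqrt(parts["delphic"])
```

With identical world models the delphic term is zero and the pessimistic value equals the plain mean; the penalty grows only when compatible world models disagree about the value of an action.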

Videos

Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding · SlidesLive

Taxonomy

Topics: Sepsis Diagnosis and Treatment · Hemodynamic Monitoring and Therapy · Advanced Causal Inference Techniques