Post Reinforcement Learning Inference

Vasilis Syrgkanis; Ruohan Zhan

arXiv:2302.08854 · stat.ML · October 6, 2025 · 1 citation

TL;DR

This paper develops a weighted GMM method for valid inference on structural parameters from data collected by reinforcement learning algorithms, addressing the challenges posed by adaptive data collection and nonstationary behavior policies.

Contribution

It introduces a weighted GMM approach with adaptive weights that restores asymptotic normality for inference on RL data, which standard moment-based estimators fail to achieve because of time-varying variance.

Findings

01. Proposes a weighted GMM estimator for RL data.
02. Ensures asymptotic normality under adaptive data collection.
03. Enables valid hypothesis testing and confidence intervals.

Abstract

We study estimation and inference using data collected by reinforcement learning (RL) algorithms. These algorithms adaptively experiment by interacting with individual units over multiple stages, updating their strategies based on past outcomes. Our goal is to evaluate a counterfactual policy after data collection and estimate structural parameters, such as dynamic treatment effects, that support credit assignment and quantify the impact of early actions on final outcomes. These parameters can often be defined as solutions to moment equations, motivating moment-based estimation methods developed for static data. In RL settings, however, data are often collected adaptively under nonstationary behavior policies. As a result, standard estimators fail to achieve asymptotic normality due to time-varying variance. We propose a weighted generalized method of moments (GMM) approach that uses…
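The abstract is cut off before it says how the weights are built, so the sketch below only illustrates the general idea of weighting moment equations on adaptively collected data. Everything in it is an assumption made for illustration and is not taken from the paper: the two-arm bandit simulator, the inverse-propensity score used as the moment, and the variance-stabilizing weight choice w_t = sqrt(e_t), where e_t is the logged probability of the target arm at step t. The estimator solves the weighted moment equation sum_t w_t (score_t - theta) = 0 for a scalar theta.

```python
import numpy as np


def simulate_two_arm_bandit(T, rng, arm_means=(0.0, 0.5), floor=0.05):
    """Hypothetical greedy-with-floor bandit; logs actions, rewards, and P(arm 1)."""
    sums, counts = np.zeros(2), np.zeros(2)
    actions = np.empty(T, dtype=int)
    rewards = np.empty(T)
    p_arm1 = np.empty(T)
    for t in range(T):
        if counts.min() == 0:
            # Explore uniformly until both arms have been tried once.
            p1 = 0.5
        else:
            # Favor the arm with the higher running mean; the policy
            # therefore depends on past outcomes (adaptive collection).
            arm1_better = sums[1] / counts[1] > sums[0] / counts[0]
            p1 = 1.0 - floor if arm1_better else floor
        a = int(rng.random() < p1)
        y = arm_means[a] + rng.normal()
        actions[t], rewards[t], p_arm1[t] = a, y, p1
        sums[a] += y
        counts[a] += 1
    return actions, rewards, p_arm1


def weighted_moment_estimate(scores, weights):
    """Solve sum_t w_t * (score_t - theta) = 0; return (theta, std_error)."""
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    theta = np.sum(weights * scores) / np.sum(weights)
    # Plug-in standard error for the self-normalized weighted average.
    se = np.sqrt(np.sum(weights**2 * (scores - theta) ** 2)) / np.sum(weights)
    return theta, se


rng = np.random.default_rng(0)
actions, rewards, p_arm1 = simulate_two_arm_bandit(5000, rng)

# Inverse-propensity score for the mean reward of arm 1 (assumed moment).
scores = (actions == 1) * rewards / p_arm1

# Uniform weights reproduce the plain inverse-propensity average.
theta_unif, se_unif = weighted_moment_estimate(scores, np.ones_like(scores))

# Assumed variance-stabilizing weights; they depend only on the logged
# propensity, which is fixed before the action at step t is drawn.
theta_w, se_w = weighted_moment_estimate(scores, np.sqrt(p_arm1))

print(f"uniform weights: {theta_unif:.3f} (se {se_unif:.3f})")
print(f"sqrt-propensity weights: {theta_w:.3f} (se {se_w:.3f})")
```

The weights are computed from information available before each action is taken, which is what lets a weighted sum of this kind be treated as a martingale; how the paper actually constructs its adaptive weights should be checked against the full text.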

Code & Models

Repositories

ruohanzhan/rl_inference (Official)

Taxonomy

Topics: Advanced Causal Inference Techniques · Statistical Methods in Clinical Trials