On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation
Xiaohong Chen, Zhengling Qi

TL;DR
This paper establishes the well-posedness and minimax optimal convergence rates for nonparametric Q-function estimation in off-policy evaluation by recasting the problem as nonparametric instrumental variables (NPIV) estimation, which avoids the strong discount-factor assumption imposed in recent work.
Contribution
It introduces a new well-posedness result for Q-function estimation via NPIV, derives minimax lower bounds, and proposes a rate-optimal sieve two-stage least squares estimator.
Findings
Well-posedness of Q-function estimation without strong discount factor assumptions.
Minimax lower bounds matching classical nonparametric regression rates.
A sieve two-stage least squares estimator achieving these rates.
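To illustrate the last finding, here is a minimal sketch of a sieve two-stage least squares fit of a Q-function. It is not the authors' implementation. The toy MDP, the polynomial sieve, the deterministic target policy, and every function name below are assumptions made for illustration. The sketch uses the Bellman moment condition E[R - Q(S,A) + gamma * Q(S', pi(S')) | S, A] = 0: the regressors are basis functions of (S, A) minus gamma times the same basis evaluated at (S', pi(S')), and the instruments are a richer basis in (S, A).

```python
import numpy as np


def poly_basis(s, a, degree):
    """Polynomial sieve: all monomials s^i * a^j with i + j <= degree."""
    cols = [s**i * a**j
            for i in range(degree + 1)
            for j in range(degree + 1 - i)]
    return np.column_stack(cols)


def sieve_2sls_q(s, a, r, s_next, policy, gamma, q_degree=1, iv_degree=2):
    """Sieve 2SLS coefficients for Q under a deterministic target policy.

    Hypothetical helper: the basis choice and degrees are illustrative.
    """
    psi = poly_basis(s, a, q_degree)                          # psi(S, A)
    psi_next = poly_basis(s_next, policy(s_next), q_degree)   # psi(S', pi(S'))
    regressors = psi - gamma * psi_next                       # Bellman residual basis
    instruments = poly_basis(s, a, iv_degree)                 # functions of (S, A) only

    # First stage: project the regressors onto the instrument space.
    first_stage = np.linalg.lstsq(instruments, regressors, rcond=None)[0]
    fitted = instruments @ first_stage

    # Second stage: regress the reward on the projected regressors.
    coef = np.linalg.lstsq(fitted, r, rcond=None)[0]
    return coef


def simulate(n, noise_sd, seed=0):
    """Toy MDP (an assumption): r = s + a, s' = (s + a) / 2 + noise."""
    rng = np.random.default_rng(seed)
    s = rng.uniform(0.0, 1.0, n)
    a = rng.uniform(0.0, 1.0, n)          # behavior policy: uniform actions
    r = s + a
    s_next = 0.5 * s + 0.5 * a + noise_sd * rng.normal(size=n)
    return s, a, r, s_next


def target_policy(s):
    """Deterministic target policy pi(s) = s (an assumption)."""
    return s


gamma = 0.5
s, a, r, s_next = simulate(n=5000, noise_sd=0.1)
coef = sieve_2sls_q(s, a, r, s_next, target_policy, gamma)
print(coef)  # coefficients on the basis [1, a, s]
```

In this toy setup the true Q-function under the target policy is (s + a) / (1 - gamma), so the coefficients on [1, a, s] should land near [0, 2, 2] when gamma = 0.5. When the instrument basis equals the Q basis, the estimator reduces to a least-squares temporal-difference fit.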
Abstract
We study the off-policy evaluation (OPE) problem in an infinite-horizon Markov decision process with continuous states and actions. We recast the Q-function estimation into a special form of the nonparametric instrumental variables (NPIV) estimation problem. We first show that under one mild condition the NPIV formulation of Q-function estimation is well-posed in the sense of L^2-measure of ill-posedness with respect to the data generating distribution, bypassing a strong assumption on the discount factor imposed in the recent literature for obtaining the convergence rates of various Q-function estimators. Thanks to this new well-posed property, we derive the first minimax lower bounds for the convergence rates of nonparametric estimation of Q-function and its derivatives in both sup-norm and L^2-norm, which are shown to be the same as those for the classical…
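As a rough guide to the NPIV recasting described in the abstract, the standard Bellman equation for a target policy can be written as a conditional moment restriction. The notation below (reward R, current state-action pair (S, A), next state S', target policy \pi, discount factor \gamma) is assumed for illustration and may differ from the paper's own.

```latex
% Bellman equation for the target policy \pi
Q^{\pi}(s,a) = \mathbb{E}\!\left[ R + \gamma \int Q^{\pi}(S', a')\, \pi(da' \mid S') \;\middle|\; S = s,\, A = a \right]

% Equivalent conditional moment restriction (NPIV form):
% the pair (S, A) acts as the instrument, R as the outcome
\mathbb{E}\!\left[ R - Q^{\pi}(S,A) + \gamma \int Q^{\pi}(S', a')\, \pi(da' \mid S') \;\middle|\; S,\, A \right] = 0
```

In this reading, the unknown function Q^{\pi} enters through both the current pair (S, A) and the next state S', which is what makes the problem an instrumental-variables problem rather than a plain regression.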
Taxonomy
Topics: Advanced Causal Inference Techniques · Health Systems, Economic Evaluations, Quality of Life · Statistical Methods and Inference
