$K$-Nearest-Neighbor Resampling for Off-Policy Evaluation in Stochastic Control
Michael Giegrich, Roel Oomen, Christoph Reisinger

TL;DR
This paper introduces a $K$-nearest neighbor resampling method for off-policy evaluation in stochastic control, providing statistical guarantees without assuming i.i.d. data, and demonstrating effectiveness across various complex environments.
Contribution
The paper generalizes Stone's Theorem to episodic data, which yields consistent off-policy evaluation without parametric models or optimization and makes the method suitable for stochastic control applications.
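For orientation, the classical i.i.d. form of local averaging that Stone's Theorem addresses can be written as below. This is the textbook setting, not the paper's episodic generalization; the symbols $X_i$, $Y_i$, $W_{n,i}$ and $m_n$ are standard notation assumed here for illustration and do not appear in the text above.

```latex
% Local averaging estimate of m(x) = E[Y | X = x] from samples (X_1, Y_1), ..., (X_n, Y_n)
m_n(x) = \sum_{i=1}^{n} W_{n,i}(x)\, Y_i ,
\qquad
W_{n,i}(x) =
\begin{cases}
  \dfrac{1}{K} & \text{if } X_i \text{ is among the } K \text{ nearest neighbors of } x, \\[4pt]
  0 & \text{otherwise.}
\end{cases}
% Classical K-NN consistency regime: K \to \infty and K / n \to 0 as n \to \infty.
```

In the classical result the samples are assumed i.i.d.; the contribution stated above is to relax this so that whole episodes can be sampled.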
Findings
The method is statistically consistent under weak conditions, without assuming i.i.d. transitions and rewards.
Efficient implementation via tree-based nearest neighbor search.
Demonstrated effectiveness in diverse stochastic control scenarios.
Abstract
In this paper, we propose a novel $K$-nearest neighbor resampling procedure for estimating the performance of a policy from historical data containing realized episodes of a decision process generated under a different policy. We provide statistical consistency results under weak conditions. In particular, we avoid the common assumption of identically and independently distributed transitions and rewards. Instead, our analysis allows for the sampling of entire episodes, as is common practice in most applications. To establish the consistency in this setting, we generalize Stone's Theorem, a well-known result in nonparametric statistics on local averaging, to include episodic data and the counterfactual estimation underlying off-policy evaluation (OPE). By focusing on feedback policies that depend deterministically on the current state in environments with continuous state-action spaces…
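The following is a minimal sketch of how a $K$-nearest-neighbor resampling estimate of a policy's value might look, not the authors' implementation. It assumes logged transitions stored as `(state, action, reward, next_state)` tuples, a Euclidean metric on the concatenated state-action vector, uniform sampling among the $K$ neighbors, and a fixed horizon; the function name `knn_resampling_value` and all parameter names are hypothetical. The brute-force neighbor search is used only to keep the example self-contained.

```python
import math
import random


def knn_resampling_value(transitions, policy, initial_states, k, horizon,
                         n_rollouts, gamma=1.0, seed=0):
    """Estimate the value of `policy` by resampling logged transitions.

    transitions: list of (state, action, reward, next_state), with states
        and actions given as tuples of floats (hypothetical data layout).
    policy: deterministic feedback policy mapping a state to an action.
    """
    rng = random.Random(seed)
    # Assumption: neighbors are measured in the joint state-action space.
    keys = [state + action for (state, action, _, _) in transitions]
    total = 0.0
    for _ in range(n_rollouts):
        state = rng.choice(initial_states)
        ret, discount = 0.0, 1.0
        for _ in range(horizon):
            query = state + policy(state)
            # Brute-force search for the k logged pairs closest to the query.
            nearest = sorted(range(len(keys)),
                             key=lambda i: math.dist(keys[i], query))[:k]
            # Resample one neighbor and reuse its reward and next state.
            _, _, reward, next_state = transitions[rng.choice(nearest)]
            ret += discount * reward
            discount *= gamma
            state = next_state
        total += ret
    return total / n_rollouts


# Toy logged data: three states, two actions, reward equal to the action,
# and the state left unchanged by every transition.
logged = [((s,), (a,), a, (s,)) for s in (0.0, 1.0, 2.0) for a in (-1.0, 1.0)]


def always_up(state):
    """Hypothetical target policy that always picks action +1."""
    return (1.0,)


estimate = knn_resampling_value(logged, always_up, [(0.0,), (1.0,), (2.0,)],
                                k=1, horizon=2, n_rollouts=5)
print(estimate)
```

On larger datasets the `sorted` call would be the bottleneck, since it scans every logged transition at every step; a tree-based nearest-neighbor index would replace it.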
Taxonomy
Topics: Simulation Techniques and Applications · Energy, Environment, and Transportation Policies
