A Principled Path to Fitted Distributional Evaluation

Sungee Hong; Jiayi Wang; Zhengling Qi; Raymond K. W. Wong

arXiv:2506.20048·stat.ML·October 21, 2025

TL;DR

This paper extends fitted Q-evaluation to distributional off-policy evaluation in reinforcement learning. It provides a unified framework, new methods, and theoretical analysis, and validates them empirically on linear quadratic regulator (LQR) and Atari environments.

Contribution

The paper introduces fitted distributional evaluation (FDE), a principled framework for distributional off-policy evaluation (OPE), along with new FDE methods and convergence guarantees.

Findings

1. FDE methods outperform existing approaches in experiments.

2. Theoretical convergence guarantees are established for FDE.

3. FDE demonstrates superior performance in Atari and LQR environments.

Abstract

In reinforcement learning, distributional off-policy evaluation (OPE) focuses on estimating the return distribution of a target policy using offline data collected under a different policy. This work focuses on extending the widely used fitted Q-evaluation -- developed for expectation-based reinforcement learning -- to the distributional OPE setting. We refer to this extension as fitted distributional evaluation (FDE). While only a few related approaches exist, there remains no unified framework for designing FDE methods. To fill this gap, we present a set of guiding principles for constructing theoretically grounded FDE methods. Building on these principles, we develop several new FDE methods with convergence analysis and provide theoretical justification for existing methods, even in non-tabular environments. Extensive experiments, including simulations on linear quadratic regulators…
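
The idea of carrying fitted Q-evaluation over to return distributions can be illustrated with a small tabular sketch. The code below is not one of the paper's methods. It assumes a finite state and action space, a deterministic target policy, and a categorical return distribution on a fixed, evenly spaced support, and it uses the standard categorical projection from distributional reinforcement learning. Each iteration refits the distribution at every observed state-action pair to the average projected one-step target, which is the tabular minimizer of a cross-entropy loss. All names here (`project_onto_support`, `fitted_categorical_evaluation`, the toy dataset) are hypothetical.

```python
import numpy as np

def project_onto_support(atoms, probs, support):
    """Project a discrete distribution onto a fixed, evenly spaced support.

    Each atom's mass is split linearly between its two neighbouring grid points.
    """
    n = len(support)
    delta = support[1] - support[0]
    # Fractional grid position of each atom, clipped to the support range.
    pos = (np.clip(atoms, support[0], support[-1]) - support[0]) / delta
    lo = np.minimum(np.floor(pos).astype(int), n - 1)
    hi = np.minimum(lo + 1, n - 1)
    w_hi = pos - lo
    out = np.zeros(n)
    np.add.at(out, lo, probs * (1.0 - w_hi))
    np.add.at(out, hi, probs * w_hi)
    return out

def fitted_categorical_evaluation(data, pi, n_states, n_actions,
                                  support, gamma, n_iters):
    """Iteratively fit categorical return distributions from offline transitions.

    data: list of (s, a, r, s_next, done) tuples collected by any behaviour policy.
    pi:   array mapping each state to the target policy's action (assumed deterministic).
    """
    n_atoms = len(support)
    eta = np.full((n_states, n_actions, n_atoms), 1.0 / n_atoms)
    for _ in range(n_iters):
        target_sum = np.zeros_like(eta)
        counts = np.zeros((n_states, n_actions))
        for s, a, r, s_next, done in data:
            if done:
                # Terminal transition: the return is just the reward.
                atoms, probs = np.array([r]), np.array([1.0])
            else:
                # Distributional Bellman target: shift and scale the support,
                # reuse the current estimate at (s_next, pi[s_next]).
                atoms = r + gamma * support
                probs = eta[s_next, pi[s_next]]
            target_sum[s, a] += project_onto_support(atoms, probs, support)
            counts[s, a] += 1
        seen = counts > 0
        # Tabular "regression" step: average the projected targets per (s, a);
        # pairs never observed in the data keep their previous estimate.
        averaged = target_sum / np.maximum(counts, 1)[..., None]
        eta = np.where(seen[..., None], averaged, eta)
    return eta

# Toy two-state chain (hypothetical data): state 0 -> state 1 with reward 0,
# then state 1 terminates with reward 1.
support = np.linspace(0.0, 1.0, 5)
data = [(0, 0, 0.0, 1, False), (1, 0, 1.0, 0, True)]
pi = np.array([0, 0])
eta = fitted_categorical_evaluation(data, pi, n_states=2, n_actions=1,
                                    support=support, gamma=0.5, n_iters=3)
mean_returns = eta @ support  # expected return for each (state, action)
```

Taking the mean of each fitted distribution recovers the quantity that ordinary fitted Q-evaluation would estimate. Replacing the per-pair average with a regression over a function class gives a non-tabular variant, though the loss and function class appropriate for that setting are what the paper's guiding principles address and are not reproduced here.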

Taxonomy

Topics: Evaluation and Performance Assessment

Methods: Sparse Evolutionary Training