Krylov-Bellman boosting: Super-linear policy evaluation in general state spaces

Eric Xia; Martin J. Wainwright

arXiv:2210.11377·stat.ML·March 28, 2023·1 cites

Open Access

TL;DR

The paper introduces Krylov-Bellman Boosting (KBB), an algorithm for policy evaluation in general state spaces that alternates between fitting the Bellman residual and estimating the value function, achieving super-linear convergence and reduced sample complexity.

Contribution

The paper proposes the KBB algorithm, integrating Krylov methods with boosting and LSTD, providing new convergence guarantees and improved sample efficiency for policy evaluation.

Findings

01. KBB achieves super-linear convergence rates.

02. KBB reduces sample complexity compared to standard methods.

03. Numerical experiments validate theoretical guarantees.

Abstract

We present and analyze the Krylov-Bellman Boosting (KBB) algorithm for policy evaluation in general state spaces. It alternates between fitting the Bellman residual using non-parametric regression (as in boosting), and estimating the value function via the least-squares temporal difference (LSTD) procedure applied with a feature set that grows adaptively over time. By exploiting the connection to Krylov methods, we equip this method with two attractive guarantees. First, we provide a general convergence bound that allows for separate estimation errors in residual fitting and LSTD computation. Consistent with our numerical experiments, this bound shows that convergence rates depend on the restricted spectral structure, and are typically super-linear. Second, by combining this meta-result with sample-size dependent guarantees for residual fitting and LSTD computation, we obtain concrete…
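
The alternation described in the abstract can be illustrated with a minimal, noise-free sketch on a small finite state space. This is not the authors' implementation. It makes several simplifying assumptions: the Bellman residual is computed exactly rather than fitted by non-parametric regression, the LSTD step is solved in population form with uniform state weighting rather than from samples, and the transition matrix is a symmetric random walk on a ring chosen only for convenience. The function name `kbb_tabular` and all parameters are hypothetical.

```python
import numpy as np


def kbb_tabular(P, r, gamma, num_rounds, tol=1e-12):
    """Idealized Krylov-Bellman-style iteration on a finite state space.

    Assumptions (not from the paper): exact residuals, population LSTD
    with uniform weighting, and a symmetric transition matrix P.
    """
    n = r.shape[0]
    A = np.eye(n) - gamma * P           # Bellman equation reads A v = r
    v = np.zeros(n)
    basis = []                          # feature set that grows each round
    history = []
    for _ in range(num_rounds):
        # Step 1: Bellman residual of the current estimate, r + gamma P v - v.
        residual = r - A @ v
        norm = np.linalg.norm(residual)
        if norm < tol:                  # already solved to working precision
            break
        basis.append(residual / norm)   # add the residual as a new feature
        Phi = np.column_stack(basis)
        # Step 2: LSTD (projected fixed point) on the enlarged feature set:
        # solve Phi^T A Phi w = Phi^T r, then set v = Phi w.
        w = np.linalg.solve(Phi.T @ A @ Phi, Phi.T @ r)
        v = Phi @ w
        history.append(v)
    return v, history


# Toy problem: lazy symmetric random walk on a ring of n states.
n, gamma = 20, 0.9
shift = np.roll(np.eye(n), 1, axis=1)
P = 0.5 * np.eye(n) + 0.25 * (shift + shift.T)
rng = np.random.default_rng(0)
r = rng.standard_normal(n)

v_true = np.linalg.solve(np.eye(n) - gamma * P, r)
v_hat, history = kbb_tabular(P, r, gamma, num_rounds=12)
errors = [np.linalg.norm(v - v_true) for v in history]
print(errors)
```

Because each new feature is a Bellman residual, the features span a Krylov subspace generated by the reward and the transition operator, which is the connection the abstract refers to. In this symmetric toy setting the iteration terminates once the subspace captures every distinct eigenvalue of `P` that the reward excites. With estimation error in either step, the behavior would instead be governed by the kind of bounds the abstract describes.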

Taxonomy

Topics: Nuclear reactor physics and engineering · Probabilistic and Robust Engineering Design · Bayesian Modeling and Causal Inference