Krylov-Bellman boosting: Super-linear policy evaluation in general state spaces
Eric Xia, Martin J. Wainwright

TL;DR
The paper introduces Krylov-Bellman Boosting (KBB), an algorithm for policy evaluation in general state spaces. KBB alternates between fitting the Bellman residual and estimating the value function via LSTD on an adaptively growing feature set, achieving typically super-linear convergence and reduced sample complexity.
Contribution
The paper proposes the KBB algorithm, integrating Krylov methods with boosting and LSTD, providing new convergence guarantees and improved sample efficiency for policy evaluation.
Findings
KBB convergence rates depend on the restricted spectral structure and are typically super-linear.
KBB reduces sample complexity compared to standard methods.
Numerical experiments are consistent with the theoretical guarantees.
Abstract
We present and analyze the Krylov-Bellman Boosting (KBB) algorithm for policy evaluation in general state spaces. It alternates between fitting the Bellman residual using non-parametric regression (as in boosting), and estimating the value function via the least-squares temporal difference (LSTD) procedure applied with a feature set that grows adaptively over time. By exploiting the connection to Krylov methods, we equip this method with two attractive guarantees. First, we provide a general convergence bound that allows for separate estimation errors in residual fitting and LSTD computation. Consistent with our numerical experiments, this bound shows that convergence rates depend on the restricted spectral structure, and are typically super-linear. Second, by combining this meta-result with sample-size dependent guarantees for residual fitting and LSTD computation, we obtain concrete…
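The alternation described in the abstract can be illustrated with a minimal sketch. It is an idealized, noise-free version on a small finite-state Markov reward process, not the authors' implementation: the Bellman residual is computed exactly rather than fitted by non-parametric regression, the LSTD step uses the exact transition matrix rather than samples, and states are weighted uniformly. The function name `kbb_exact` and the example chain (a lazy random walk on a cycle) are assumptions made for illustration only.

```python
import numpy as np

def kbb_exact(P, r, gamma, num_iters, tol=1e-12):
    """Noise-free KBB-style iteration on a finite-state chain (illustrative).

    P: (n, n) transition matrix, r: (n,) reward vector, gamma: discount.
    Returns the final value estimate and the list of intermediate estimates.
    """
    n = len(r)
    A = np.eye(n) - gamma * P          # the true value function solves A v = r
    v = np.zeros(n)
    features, history = [], []
    for _ in range(num_iters):
        # Step 1: Bellman residual of the current estimate (exact here;
        # the paper fits it by regression from samples).
        resid = r + gamma * (P @ v) - v
        norm = np.linalg.norm(resid)
        if norm < tol:
            break
        # Step 2: grow the feature set with the normalized residual.
        features.append(resid / norm)
        Phi = np.column_stack(features)
        # Step 3: LSTD-style projected fixed point on span(Phi):
        # solve Phi^T A Phi w = Phi^T r (uniform state weighting assumed).
        w = np.linalg.lstsq(Phi.T @ A @ Phi, Phi.T @ r, rcond=None)[0]
        v = Phi @ w
        history.append(v.copy())
    return v, history

# Example chain (assumed for illustration): lazy random walk on a cycle.
n, gamma = 8, 0.9
P = np.zeros((n, n))
for i in range(n):
    P[i, i] = 0.5
    P[i, (i + 1) % n] = 0.25
    P[i, (i - 1) % n] = 0.25
r = np.random.default_rng(0).normal(size=n)

v_star = np.linalg.solve(np.eye(n) - gamma * P, r)   # exact value function
v, history = kbb_exact(P, r, gamma, num_iters=n)
print("final error:", np.linalg.norm(v - v_star))
```

In this idealized setting the features span the Krylov subspace generated by r and P, so the estimate becomes exact after at most n iterations; with estimation error in either step, as the abstract notes, the guarantees must account for both sources of error separately.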
Taxonomy
Topics: Nuclear reactor physics and engineering · Probabilistic and Robust Engineering Design · Bayesian Modeling and Causal Inference
