A Generalized Bias-Variance Decomposition for Bregman Divergences
David Pfau

TL;DR
This paper generalizes the bias-variance decomposition from squared error to Bregman divergences and gives a clear, standalone derivation. The result is relevant to maximum likelihood estimation with exponential families.
Contribution
It offers a standalone, pedagogical derivation of the bias-variance decomposition for Bregman divergences, extending the classical squared error case.
Findings
Provides a clear derivation of the generalized bias-variance decomposition
Connects the decomposition to maximum likelihood estimation in exponential families
Adds discussion and references to the prior literature in which the result already appears
Abstract
The bias-variance decomposition is a central result in statistics and machine learning, but is typically presented only for the squared error. We present a generalization of the bias-variance decomposition where the prediction error is a Bregman divergence, which is relevant to maximum likelihood estimation with exponential families. While the result is already known, there was not previously a clear, standalone derivation, so we provide one for pedagogical purposes. A version of this note previously appeared on the author's personal website without context. Here we provide additional discussion and references to the relevant prior literature.
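As a rough numerical illustration of the kind of identity the abstract refers to, the sketch below checks a three-term split of expected Bregman error into noise, bias, and variance. The abstract does not state the formula, so the form used here is an assumption: the label Y and the prediction Y_hat are taken to be independent, the divergence is D(y, y_hat) = F(y) - F(y_hat) - F'(y_hat)(y - y_hat), and the "central" prediction y_star is the dual mean defined by F'(y_star) = E[F'(Y_hat)]. The choice F(x) = x log x - x (generalized KL divergence on positive scalars), the sampling distributions, and all variable names are likewise illustrative choices, not taken from the paper.

```python
import numpy as np


def gen_kl(y, y_hat):
    """Bregman divergence generated by F(x) = x*log(x) - x (generalized KL)."""
    return y * np.log(y / y_hat) - y + y_hat


rng = np.random.default_rng(0)

# Hypothetical positive-valued labels and predictions, drawn independently.
y = rng.gamma(shape=2.0, scale=1.5, size=200)
y_hat = rng.lognormal(mean=0.5, sigma=0.4, size=300)

# Expected error under the empirical product distribution of (Y, Y_hat):
# average the divergence over every (label, prediction) pair.
total = gen_kl(y[:, None], y_hat[None, :]).mean()

# Primal mean of the labels.
y_bar = y.mean()

# Dual mean of the predictions: F'(x) = log(x), so y_star is the
# geometric mean of y_hat.
y_star = np.exp(np.log(y_hat).mean())

noise = gen_kl(y, y_bar).mean()          # E[D(Y, E[Y])]
bias = gen_kl(y_bar, y_star)             # D(E[Y], y_star)
variance = gen_kl(y_star, y_hat).mean()  # E[D(y_star, Y_hat)]

print(f"total error             : {total:.6f}")
print(f"noise + bias + variance : {noise + bias + variance:.6f}")
```

Because the expectation is taken over all sample pairs, the two printed values should agree up to floating-point error rather than only approximately. With F(x) = x^2 the dual mean reduces to the ordinary mean and the same check recovers the classical squared-error decomposition.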
Taxonomy
Topics: Statistical Mechanics and Entropy · Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference
