On the Convergence of Multicalibration Gradient Boosting
Daniel Haimovich, Fridolin Linder, Lorenzo Perini, Niek Tax, Milan Vojnovic

TL;DR
This paper provides theoretical convergence guarantees for multicalibration gradient boosting in regression, showing decay rates of prediction errors and conditions for linear and quadratic convergence, supported by empirical experiments.
Contribution
It offers the first convergence analysis for multicalibration gradient boosting, including decay rates, smoothness-based improvements, and adaptive variant analysis.
Findings
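The magnitude of successive prediction updates decays at an O(1/√T) rate, and the multicalibration error over rounds obeys the same bound.
Under additional smoothness assumptions on the weak learners, the rate improves to linear convergence.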
Experiments confirm theoretical convergence regimes.
Abstract
Multicalibration gradient boosting has recently emerged as a scalable method that empirically produces approximately multicalibrated predictors and has been deployed at web scale. Despite this empirical success, its convergence properties are not well understood. In this paper, we bridge the gap by providing convergence guarantees for multicalibration gradient boosting in regression with squared-error loss. We show that the magnitude of successive prediction updates decays at O(1/√T), which implies the same convergence rate bound for the multicalibration error over rounds. Under additional smoothness assumptions on the weak learners, this rate improves to linear convergence. We further analyze adaptive variants, showing local quadratic convergence of the training loss, and we study rescaling schemes that preserve convergence. Experiments on real-world datasets support our theory…
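To make the O(1/√T) statement concrete, the following is a toy sketch and not the algorithm analyzed in the paper. It assumes a deliberately simple weak learner: in each round it shifts the predictions in one cell, defined by a group and a bin of the current prediction, by that cell's mean residual. The names `boost`, `bin_index`, `memberships`, and the synthetic data are all hypothetical. With squared-error loss, each round lowers the training MSE by exactly the squared RMS size of the update, so the squared updates sum to at most the initial loss and the smallest update over T rounds is at most √(initial loss / T).

```python
import math
import random


def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


def bin_index(p, n_bins):
    # Clamp so predictions outside [0, 1) still land in a valid bin.
    return min(n_bins - 1, max(0, int(p * n_bins)))


def boost(ys, memberships, rounds=30, n_bins=5):
    """Toy multicalibration-style boosting with squared-error loss.

    Assumption (not from the paper): the weak learner is a single cell
    (group, prediction bin) shifted by its mean residual.
    """
    n = len(ys)
    preds = [sum(ys) / n] * n  # start from the global mean
    losses = [mse(preds, ys)]
    update_norms = []  # RMS change in predictions per round
    for _ in range(rounds):
        # Collect sample indices for every (group, prediction-bin) cell.
        cells = {}
        for i in range(n):
            b = bin_index(preds[i], n_bins)
            for g in memberships[i]:
                cells.setdefault((g, b), []).append(i)
        # Pick the cell whose mean-residual shift lowers the MSE the most.
        best_key, best_gain, best_shift = None, 0.0, 0.0
        for key, idx in cells.items():
            shift = sum(ys[i] - preds[i] for i in idx) / len(idx)
            gain = len(idx) * shift ** 2 / n  # exact MSE decrease
            if gain > best_gain:
                best_key, best_gain, best_shift = key, gain, shift
        if best_key is None:  # every cell is already calibrated
            break
        for i in cells[best_key]:
            preds[i] += best_shift
        update_norms.append(math.sqrt(best_gain))
        losses.append(mse(preds, ys))
    return preds, losses, update_norms


# Synthetic data with two overlapping group attributes (illustrative only).
random.seed(0)
ys, memberships = [], []
for _ in range(400):
    a, b = random.randint(0, 1), random.randint(0, 2)
    y = 0.2 + 0.3 * a + 0.15 * b + 0.2 * a * (b == 2) + random.gauss(0, 0.1)
    ys.append(y)
    memberships.append([("a", a), ("b", b)])

preds, losses, update_norms = boost(ys, memberships)
T = len(update_norms)
print(f"rounds={T}  smallest update={min(update_norms):.4f}  "
      f"bound sqrt(L0/T)={math.sqrt(losses[0] / T):.4f}")
```

The telescoping identity used here only bounds the smallest update seen so far. The linear and quadratic regimes described in the abstract rely on the paper's additional assumptions and are not reproduced by this sketch.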
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
