Discretization-free Multicalibration through Loss Minimization over Tree Ensembles

Hongyi Henry Jin; Zijun Ding; Dung Daniel Ngo; Zhiwei Steven Wu

arXiv:2505.17435·cs.LG·May 26, 2025

TL;DR

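This paper introduces a discretization-free multicalibration method that directly minimizes an empirical risk objective over an ensemble of depth-two decision trees. It avoids the rounding error and the extra sensitive hyperparameter that output discretization introduces, and it is reported to outperform existing multicalibration methods empirically.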

Contribution

The authors propose an ERM-based multicalibration approach over ensembles of depth-two decision trees. It removes the need to discretize the predictor's output, and with it the discretization hyperparameter, while retaining a provable multicalibration guarantee and working with off-the-shelf tree ensemble learners.

Findings

01. The method achieves multicalibration under a loss saturation condition.

02. Empirical results show the method consistently outperforming existing approaches.

03. The approach can be implemented with standard tree ensemble libraries such as LightGBM.
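
The LightGBM integration can be illustrated with a minimal sketch. Only the depth limit comes from the paper's abstract (depth-two trees). The squared-loss objective, learning rate, number of boosting rounds, and the names `PARAMS` and `fit_depth_two_ensemble` are assumptions made for illustration, and the feature construction the authors use is not shown here.

```python
# Hypothetical settings: only max_depth=2 is taken from the abstract
# ("depth-two decision trees"); every other value is an illustrative guess.
PARAMS = {
    "objective": "regression",  # squared loss; assumed, the abstract names no loss
    "max_depth": 2,             # depth-two trees, as stated in the abstract
    "num_leaves": 4,            # a depth-two binary tree has at most 4 leaves
    "learning_rate": 0.1,       # assumed
    "verbose": -1,              # silence training logs
}


def fit_depth_two_ensemble(features, labels, num_rounds=200):
    """Fit a boosted ensemble of depth-two trees by plain loss minimization."""
    import lightgbm as lgb  # imported lazily so PARAMS can be inspected without it

    train_set = lgb.Dataset(features, label=labels)
    return lgb.train(PARAMS, train_set, num_boost_round=num_rounds)
```

Calling `fit_depth_two_ensemble(X, y).predict(X)` would return real-valued predictions directly, with no rounding to a grid of output values. Setting `num_leaves` to 4 keeps LightGBM's leaf-wise growth consistent with the depth cap.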

Abstract

In recent years, multicalibration has emerged as a desirable learning objective for ensuring that a predictor is calibrated across a rich collection of overlapping subpopulations. Existing approaches typically achieve multicalibration by discretizing the predictor's output space and iteratively adjusting its output values. However, this discretization approach departs from the standard empirical risk minimization (ERM) pipeline, introduces rounding error and an additional sensitive hyperparameter, and may distort the predictor's outputs in ways that hinder downstream decision-making. In this work, we propose a discretization-free multicalibration method that directly optimizes an empirical risk objective over an ensemble of depth-two decision trees. Our ERM approach can be implemented using off-the-shelf tree ensemble learning methods such as LightGBM. Our algorithm provably achieves…
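
To make "calibrated across overlapping subpopulations" concrete, here is a small diagnostic sketch. It follows one common formulation from the multicalibration literature, the largest weighted gap between labels and predictions over every (group, prediction bin) cell, and is not necessarily the metric used in the paper. The function name, the bin count, and the toy data are hypothetical. The binning here is for measurement only; it is itself a discretization of the output space, which is the step the proposed training method avoids.

```python
from collections import defaultdict


def max_group_calibration_gap(preds, labels, groups, num_bins=10):
    """Largest |sum(label - pred)| / n over all (group, prediction-bin) cells.

    `groups` maps a group name to a list of booleans marking membership.
    Groups may overlap: one example can belong to several of them.
    """
    n = len(preds)
    worst = 0.0
    for members in groups.values():
        residual_sums = defaultdict(float)
        for pred, label, is_member in zip(preds, labels, members):
            if is_member:
                # Clamp so that a prediction of exactly 1.0 falls in the top bin.
                bin_index = min(int(pred * num_bins), num_bins - 1)
                residual_sums[bin_index] += label - pred
        for total in residual_sums.values():
            worst = max(worst, abs(total) / n)
    return worst


# Toy data: four examples and three overlapping groups (all values made up).
preds = [0.25, 0.25, 0.75, 0.75]
labels = [0, 1, 1, 1]
groups = {
    "all": [True, True, True, True],
    "a": [True, True, False, False],
    "b": [False, True, True, False],
}
gap = max_group_calibration_gap(preds, labels, groups)
```

A predictor can have a small gap on the full population while a particular subgroup still shows a larger one, which is why the maximum is taken over every group rather than over the population alone.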
