Tweedie Gradient Boosting for Extremely Unbalanced Zero-inflated Data
He Zhou, Yi Yang, Wei Qian

TL;DR
This paper introduces EMTboost, a novel gradient boosting method for zero-inflated Tweedie models, effectively handling extremely unbalanced insurance claim data with complex predictor interactions.
Contribution
It develops a zero-inflated Tweedie boosting model with a nonparametric component and a specialized EM algorithm, improving modeling of zero-inflated, skewed data.
Findings
Enhanced prediction accuracy on synthetic auto-insurance data
Effectively captures nonlinearities and complex interactions
Outperforms traditional Tweedie models in zero-inflated scenarios
Abstract
Tweedie's compound Poisson model is a popular method for modeling insurance claims, whose distribution has a probability mass at zero and is otherwise nonnegative and highly right-skewed. In particular, it is not uncommon to have extremely unbalanced data with an excessively large proportion of zero claims, for which even the traditional Tweedie model may not fit satisfactorily. In this paper, we propose a boosting-assisted zero-inflated Tweedie model, called EMTboost, that allows the zero probability mass to exceed that of a traditional model. We make a nonparametric assumption on its Tweedie model component that, unlike a linear model, is able to capture nonlinearities, discontinuities, and complex higher-order interactions among predictors. A specialized Expectation-Maximization algorithm is developed that integrates a blockwise coordinate descent strategy and a gradient tree-boosting algorithm to estimate key model…
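As a rough illustration of the zero-inflation idea in the abstract, the Python sketch below assumes a mixture in which a response is drawn from a Tweedie compound Poisson distribution with probability q and is a structural zero otherwise. The symbols q, mu, phi, rho and all function names are illustrative assumptions, not notation or code confirmed by this page. The sketch does not implement EMTboost itself. It only shows the zero probability, the E-step-style posterior weight for an observed zero, and a simple sampler.

```python
import math
import random


def tweedie_zero_prob(mu, phi, rho):
    """P(Y = 0) for a Tweedie compound Poisson model with 1 < rho < 2."""
    lam = mu ** (2.0 - rho) / (phi * (2.0 - rho))  # Poisson rate of the claim count
    return math.exp(-lam)


def zi_tweedie_zero_prob(mu, phi, rho, q):
    """P(Y = 0) when Y is Tweedie with probability q, else a structural zero (assumed form)."""
    return (1.0 - q) + q * tweedie_zero_prob(mu, phi, rho)


def posterior_tweedie_given_zero(mu, phi, rho, q):
    """Posterior probability that an observed zero came from the Tweedie component."""
    p0 = tweedie_zero_prob(mu, phi, rho)
    return q * p0 / ((1.0 - q) + q * p0)


def _poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_zi_tweedie(mu, phi, rho, q, rng):
    """Draw one response from the assumed zero-inflated Tweedie mixture."""
    if rng.random() >= q:
        return 0.0  # structural zero
    lam = mu ** (2.0 - rho) / (phi * (2.0 - rho))
    shape = (2.0 - rho) / (rho - 1.0)              # gamma shape of each claim size
    scale = phi * (rho - 1.0) * mu ** (rho - 1.0)  # gamma scale of each claim size
    n_claims = _poisson(lam, rng)
    return sum(rng.gammavariate(shape, scale) for _ in range(n_claims))
```

For example, with mu = 1, phi = 1, and rho = 1.5, a plain Tweedie model puts probability exp(-2) ≈ 0.135 on zero, whereas q = 0.2 raises the zero mass to about 0.827 under this assumed parameterization.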
Taxonomy
Topics: Probability and Risk Models · Statistical Methods and Bayesian Inference · Bayesian Methods and Mixture Models
