Zero-Inflated Tweedie Boosted Trees with CatBoost for Insurance Loss Analytics
Banghee So, Emiliano A. Valdez

TL;DR
This paper introduces a zero-inflated Tweedie regression model fitted with CatBoost to improve insurance claim predictions, especially for datasets with many zero claims, and reports improved performance over traditional Tweedie models.
Contribution
The paper's main contribution is a zero-inflated Tweedie model integrated with CatBoost, tailored for insurance claim data with many zeros.
Findings
Enhanced predictive accuracy on insurance telematics data
Effective handling of zero-inflation in claim modeling
Improved model performance over traditional Tweedie models
Abstract
In this paper, we explore modifications to the Tweedie regression model that address its limitations in modeling aggregate claims for types of insurance such as automobile, health, and liability. Traditional Tweedie models capture both the probability and the magnitude of claims, but they usually fall short in representing the large incidence of zero claims. Our recommended approach refines the modeling of the zero-claim process and integrates boosting methods, which use an iterative process to enhance predictive accuracy. Although this iteration inherently slows learning, several efficient implementations that also support precise parameter tuning, such as XGBoost, LightGBM, and CatBoost, have emerged. We chose CatBoost, an efficient boosting approach…
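To make the zero-inflation idea concrete, here is a minimal sketch of a zero-inflated Tweedie distribution in plain Python. It is not the authors' implementation and involves no boosting. It assumes the standard compound Poisson-gamma form of the Tweedie distribution with power 1 < p < 2, and it adds an extra point mass q at zero. The symbols q, mu, phi, and p and all function names are illustrative assumptions; in a boosted model, quantities such as mu and q would be learned as functions of covariates rather than fixed.

```python
import math
import random


def tweedie_cp_params(mu, phi, p):
    """Map Tweedie (mu, phi, p) with 1 < p < 2 to compound Poisson-gamma parameters."""
    lam = mu ** (2 - p) / (phi * (2 - p))    # Poisson rate for the number of claims
    shape = (2 - p) / (p - 1)                # gamma shape of each claim size
    scale = phi * (p - 1) * mu ** (p - 1)    # gamma scale of each claim size
    return lam, shape, scale


def zi_tweedie_zero_prob(q, mu, phi, p):
    """P(Y = 0): extra zero mass q plus the Tweedie's own zero mass exp(-lam)."""
    lam, _, _ = tweedie_cp_params(mu, phi, p)
    return q + (1 - q) * math.exp(-lam)


def zi_tweedie_mean(q, mu):
    """E[Y] for the zero-inflated mixture (assumed parameterization)."""
    return (1 - q) * mu


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small rates."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_zi_tweedie(q, mu, phi, p, rng):
    """Draw one aggregate claim amount from the zero-inflated Tweedie."""
    if rng.random() < q:                     # structural zero
        return 0.0
    lam, shape, scale = tweedie_cp_params(mu, phi, p)
    n_claims = sample_poisson(lam, rng)
    return sum(rng.gammavariate(shape, scale) for _ in range(n_claims))


# Illustrative parameter values (assumptions, not taken from the paper).
Q, MU, PHI, P = 0.3, 1.0, 1.0, 1.5
rng = random.Random(0)
draws = [sample_zi_tweedie(Q, MU, PHI, P, rng) for _ in range(20000)]
zero_frac = sum(d == 0.0 for d in draws) / len(draws)
sample_mean = sum(draws) / len(draws)
print("simulated zero fraction:", zero_frac)
print("theoretical zero probability:", zi_tweedie_zero_prob(Q, MU, PHI, P))
print("simulated mean:", sample_mean)
print("theoretical mean:", zi_tweedie_mean(Q, MU))
```

The simulation shows why a plain Tweedie can understate zeros: its zero mass is tied to mu and phi through exp(-lam), whereas the mixture weight q lets the zero probability move independently of the claim-size behavior.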
Taxonomy
Topics: Data Mining Algorithms and Applications · Imbalanced Data Classification Techniques · Machine Learning and Data Classification
