Hybrid Tree-based Models for Insurance Claims
Zhiyu Quan, Zhiguo Wang, Guojun Gan, Emiliano A. Valdez

TL;DR
This paper introduces a hybrid tree-based modeling approach for insurance claims that improves prediction accuracy over traditional models by combining classification trees and elastic net regression, effectively handling zero-inflated data.
Contribution
The paper presents a novel hybrid two-step tree-based model that enhances prediction accuracy for insurance claims by addressing zero-inflation and allowing targeted hyperparameter tuning.
Findings
Hybrid models outperform traditional Tweedie models in prediction accuracy.
The approach maintains interpretability while improving performance.
Models are effective on both real and synthetic datasets.
Abstract
Two-part models and Tweedie generalized linear models (GLMs) have been used to model loss costs for short-term insurance contracts. For most portfolios of insurance claims, there is typically a large proportion of zero claims, which leads to imbalances that reduce the prediction accuracy of these traditional approaches. This article proposes the use of tree-based models with a hybrid structure that involves a two-step algorithm as an alternative approach to these traditional models. The first step is the construction of a classification tree to build the probability model for frequency. In the second step, we employ elastic net regression models at each terminal node from the classification tree to build the distribution model for severity. This hybrid structure captures the benefits of tuning hyperparameters at each step of the algorithm; this allows for improved prediction accuracy…
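The two-step structure described in the abstract can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation. The function names (`make_synthetic_claims`, `fit_hybrid`, `predict_hybrid`), the synthetic data, the hyperparameter values, the choice to fit each elastic net on positive claims only, and the fallback to a leaf mean when a terminal node has too few claims are all assumptions made for this example.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.tree import DecisionTreeClassifier


def make_synthetic_claims(n=2000, seed=0):
    """Hypothetical zero-inflated data: most policies have a zero claim."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 4))
    p_claim = 1.0 / (1.0 + np.exp(-(X[:, 0] - 1.5)))
    has_claim = rng.random(n) < p_claim
    amount = np.exp(1.0 + 0.5 * X[:, 1] + rng.normal(scale=0.3, size=n))
    return X, np.where(has_claim, amount, 0.0)


def fit_hybrid(X, y, max_depth=3, min_samples_leaf=50,
               alpha=0.1, l1_ratio=0.5, min_claims=10):
    # Step 1: classification tree on the claim indicator (claim vs. no claim).
    tree = DecisionTreeClassifier(max_depth=max_depth,
                                  min_samples_leaf=min_samples_leaf,
                                  random_state=0)
    tree.fit(X, (y > 0).astype(int))

    # Step 2: one elastic net per terminal node, fit on that node's
    # positive claims (an assumption of this sketch).
    leaves = tree.apply(X)
    severity = {}
    for leaf in np.unique(leaves):
        mask = (leaves == leaf) & (y > 0)
        if mask.sum() >= min_claims:
            model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio)
            severity[leaf] = model.fit(X[mask], y[mask])
        else:
            # Too few claims for a regression: fall back to the leaf mean,
            # or 0.0 when the leaf has no positive claims at all.
            severity[leaf] = float(y[mask].mean()) if mask.any() else 0.0
    return tree, severity


def predict_hybrid(tree, severity, X):
    # Claim probability from the tree; assumes both classes were seen in training.
    col = list(tree.classes_).index(1)
    p_claim = tree.predict_proba(X)[:, col]

    # Severity from the model attached to each observation's terminal node.
    leaves = tree.apply(X)
    sev = np.zeros(len(X))
    for leaf in np.unique(leaves):
        mask = leaves == leaf
        model = severity[leaf]
        if isinstance(model, ElasticNet):
            sev[mask] = model.predict(X[mask])
        else:
            sev[mask] = model
    return p_claim, sev


if __name__ == "__main__":
    X, y = make_synthetic_claims()
    tree, severity = fit_hybrid(X, y)
    p_claim, sev = predict_hybrid(tree, severity, X)
    # Multiplying the two parts is the usual two-part convention for loss
    # cost; the abstract does not state how the paper combines them.
    loss_cost = p_claim * sev
    print("share of zero claims:", np.mean(y == 0))
    print("mean predicted loss cost:", loss_cost.mean())
```

Because the tree and the per-node regressions are fit separately, `max_depth` and `min_samples_leaf` can be tuned independently of `alpha` and `l1_ratio`, which is the kind of step-wise tuning the abstract refers to. The sketch does not constrain the elastic net predictions to be non-negative.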
Taxonomy
Topics: Probability and Risk Models · Insurance, Mortality, Demography, Risk Management · Statistical Methods and Bayesian Inference
