From Point to probabilistic gradient boosting for claim frequency and severity prediction
Dominik Chevalier, Marie-Pier Côté

TL;DR
This paper compares point and probabilistic gradient boosting algorithms for claim frequency and severity prediction, assessing their predictive performance, computational efficiency, and interpretability on actuarial datasets.
Contribution
It provides a unified comparison of existing gradient boosting methods, including new insights on handling exposure-to-risk and model interpretability in actuarial applications.
Findings
LightGBM and XGBoostLSS are the most computationally efficient.
CatBoost performs well with high-cardinality categorical variables.
EGBM is fully interpretable while remaining competitive in predictive accuracy.
Abstract
Gradient boosting for decision tree algorithms are increasingly used in actuarial applications as they show superior predictive performance over traditional generalised linear models. Many enhancements to the first gradient boosting machine algorithm exist. We present in a unified notation, and contrast, all the existing point and probabilistic gradient boosting for decision tree algorithms: GBM, XGBoost, DART, LightGBM, CatBoost, EGBM, PGBM, XGBoostLSS, cyclic GBM, and NGBoost. In this comprehensive numerical study, we compare their performance on five publicly available datasets for claim frequency and severity, of various sizes and comprising different numbers of (high cardinality) categorical variables. We explain how varying exposure-to-risk can be handled with boosting in frequency models. We compare the algorithms on the basis of computational efficiency, predictive performance,…
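The abstract mentions handling varying exposure-to-risk in frequency models. One common way to do this, sketched here as an illustration and not necessarily the treatment the paper adopts, is to add log(exposure) as an offset under a Poisson loss with log link, so that the boosted score models the claim rate per unit of exposure. The function names, the toy data, and the restriction to depth-1 stumps on a single feature are all assumptions made for this sketch.

```python
import math

def fit_poisson_boosting(x, y, exposure, n_rounds=200, learning_rate=0.1):
    """Boost depth-1 stumps on one numeric feature under a Poisson loss.

    The expected count is exposure_i * exp(F(x_i)), i.e. log(exposure_i)
    enters as a fixed offset and F models the log claim rate.
    """
    # Initial constant score: log of the overall claim rate per unit exposure.
    f0 = math.log(sum(y) / sum(exposure))
    scores = [f0] * len(y)
    stumps = []
    thresholds = sorted(set(x))[:-1]  # candidate split points
    for _ in range(n_rounds):
        # Expected counts, then gradient and Hessian of the Poisson
        # negative log-likelihood with respect to the score F.
        mu = [e * math.exp(s) for e, s in zip(exposure, scores)]
        grad = [m - yi for m, yi in zip(mu, y)]
        hess = mu
        g_total, h_total = sum(grad), sum(hess)
        best = None
        for t in thresholds:
            g_left = sum(g for g, xi in zip(grad, x) if xi <= t)
            h_left = sum(h for h, xi in zip(hess, x) if xi <= t)
            g_right, h_right = g_total - g_left, h_total - h_left
            gain = g_left ** 2 / h_left + g_right ** 2 / h_right
            if best is None or gain > best[0]:
                # Newton leaf values: -G / H in each child.
                best = (gain, t, -g_left / h_left, -g_right / h_right)
        if best is None:  # feature is constant, nothing to split on
            break
        _, t, w_left, w_right = best
        step_left, step_right = learning_rate * w_left, learning_rate * w_right
        stumps.append((t, step_left, step_right))
        scores = [s + (step_left if xi <= t else step_right)
                  for s, xi in zip(scores, x)]
    return f0, stumps

def predict_rate(f0, stumps, xi):
    """Predicted claim rate per unit of exposure for feature value xi."""
    score = f0 + sum(left if xi <= t else right for t, left, right in stumps)
    return math.exp(score)

# Hypothetical toy portfolio: one binary feature, unequal exposures.
x = [0, 0, 0, 0, 1, 1, 1, 1]
exposure = [1.0, 0.5, 0.25, 1.0, 1.0, 0.5, 0.5, 0.25]
y = [0, 0, 0, 1, 1, 1, 0, 1]

f0, stumps = fit_poisson_boosting(x, y, exposure)
rate_group0 = predict_rate(f0, stumps, 0)
rate_group1 = predict_rate(f0, stumps, 1)
```

To predict a claim count for a policy, multiply the predicted rate by that policy's exposure. On this toy data the fitted rates should approach the claims-per-exposure ratio within each group, which is the behaviour an offset is meant to produce.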
Taxonomy
Topics: Probability and Risk Models
Methods: DART (Dropouts meet Multiple Additive Regression Trees)
