Prediction of motor insurance claims occurrence as an imbalanced machine learning problem

Sebastian Baran; Przemysław Rola

arXiv:2204.06109·q-fin.ST·April 14, 2022·6 cites

Open Access

TL;DR

This paper investigates various machine learning techniques to predict car insurance claim occurrences, addressing the challenge of imbalanced datasets where claims are rare compared to the total driver population.

Contribution

It applies and compares multiple imbalance handling methods across different algorithms for claim prediction in insurance, providing insights into their effectiveness.

Findings

01 Imbalanced data significantly affects prediction accuracy.

02 Certain techniques improve model performance on rare claim events.

03 Comparison of algorithms highlights best practices for insurance claim prediction.
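
The first finding can be illustrated with a minimal sketch. It is not taken from the paper: the labels are synthetic, and the 5% claim rate is an assumption chosen only for illustration. It shows why plain accuracy can be misleading when claims are rare: a baseline that never predicts a claim still scores high accuracy while detecting no claims at all.

```python
# Synthetic, hypothetical labels (not data from the paper):
# 1 = policy with a claim, 0 = policy without a claim.
# The 5% claim rate is an assumption made for this illustration.
labels = [1] * 50 + [0] * 950

# A trivial baseline that always predicts "no claim".
predictions = [0] * len(labels)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted as positive."""
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0


baseline_accuracy = accuracy(labels, predictions)
baseline_recall = recall(labels, predictions)

print(f"accuracy: {baseline_accuracy:.2f}")
print(f"recall on claims: {baseline_recall:.2f}")
```

Under these assumptions the baseline is right on every no-claim policy and wrong on every claim, which is why evaluation on imbalanced data usually looks at per-class metrics such as recall rather than accuracy alone.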

Abstract

The insurance industry, with its large datasets, is a natural place to apply big data solutions. However, it must be stressed that a significant number of machine learning applications in the insurance industry, such as fraud detection or claim prediction, involve learning from an imbalanced dataset. This is because frauds and claims are rare events compared with the entire population of drivers. The problem of imbalanced learning is often hard to overcome. Therefore, the main goal of this work is to present and apply various methods of dealing with an imbalanced dataset in the context of claim occurrence prediction in car insurance. In addition, these techniques are used to compare the results of machine learning algorithms in the same setting. Our study covers the following techniques:…

Taxonomy

Topics: Imbalanced Data Classification Techniques