An engine to simulate insurance fraud network data

Bavo D.C. Campo; Katrien Antonio

arXiv:2308.11659·cs.LG·October 8, 2024

TL;DR

This paper introduces a simulation engine that generates synthetic insurance fraud network data, addressing data scarcity and class imbalance issues to facilitate the development and testing of fraud detection models.

Contribution

The authors develop a customizable simulation tool that creates realistic synthetic insurance fraud data with controllable parameters for research and model validation.

Findings

01 Enables generation of large, realistic synthetic datasets

02 Facilitates testing of fraud detection methods under various scenarios

03 Addresses data scarcity and class imbalance challenges

Abstract

Traditionally, the detection of fraudulent insurance claims relies on business rules and expert judgement, which makes it a time-consuming and expensive process (Óskarsdóttir et al., 2022). Consequently, researchers have been examining ways to develop efficient and accurate analytic strategies to flag suspicious claims. Feeding learning methods with features engineered from the social network of parties involved in a claim is a particularly promising strategy (see, for example, Van Vlasselaer et al. (2016); Tumminello et al. (2023)). When developing a fraud detection model, however, we are confronted with several challenges. The uncommon nature of fraud, for example, creates a high class imbalance, which complicates the development of well-performing analytic classification models. In addition, only a small number of claims are investigated and get a label, which results in a large…

Code & Models

Repositories

BavoDC/iFraudSimulator (official)

Taxonomy

Topics: Imbalanced Data Classification Techniques · Artificial Intelligence in Law · Machine Learning in Healthcare