Cost-sensitive Semi-supervised Classification for Fraud Applications
Sulaf Elshaar, Samira Sadaoui

TL;DR
This paper integrates Cost-Sensitive Learning (CSL) with Semi-Supervised Classification (SSC) to improve fraud detection, particularly for shill bidding, by handling imbalanced data and unequal misclassification costs.
Contribution
The paper is the first to combine CSL with SSC specifically for fraud detection, and it reports improved accuracy and lower misclassification cost in identifying fraudulent bidders.
Findings
Achieved 99% fraud detection rate
Reduced total misclassification cost
Validated with real shill bidding dataset
Abstract
This research explores Cost-Sensitive Learning (CSL) in the fraud detection domain to decrease the fraud class's incorrect predictions and increase its accuracy. Notably, we concentrate on shill bidding fraud, which is challenging to detect because the behaviors of shill and legitimate bidders are similar. We investigate CSL within the Semi-Supervised Classification (SSC) framework to address the scarcity of labeled fraud data. Our paper is the first attempt to integrate CSL with SSC for fraud detection. We adopt a meta-CSL approach to manage the costs of misclassification errors, while SSC algorithms are trained with imbalanced data. Using an actual shill bidding dataset, we assess the performance of several hybrid models of CSL and SSC and then compare their misclassification error and accuracy rates statistically. The most efficient CSL+SSC model was able to detect 99% of fraudsters and…
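To make the meta-CSL idea in the abstract concrete, the sketch below shows one common way a cost-sensitive wrapper can sit on top of any probabilistic classifier: it picks the label with the minimum expected cost instead of the most probable one. This is an illustration only, not the authors' implementation. The cost matrix values, the function names, and the example probabilities are all assumptions; in practice the probabilities would come from a trained SSC model.

```python
from typing import List, Sequence

# Hypothetical cost matrix, indexed as COST[actual][predicted],
# with 0 = legitimate bidder and 1 = fraudster (shill bidder).
# The values are illustrative assumptions, not taken from the paper.
COST: List[List[float]] = [
    [0.0, 1.0],   # actual legitimate: correct, false alarm
    [10.0, 0.0],  # actual fraud: missed fraud, correct
]


def min_expected_cost_label(p_fraud: float,
                            cost: Sequence[Sequence[float]] = COST) -> int:
    """Return the label (0 or 1) with the lowest expected cost given P(fraud)."""
    p = [1.0 - p_fraud, p_fraud]
    expected = [
        sum(p[actual] * cost[actual][pred] for actual in (0, 1))
        for pred in (0, 1)
    ]
    return 0 if expected[0] <= expected[1] else 1


def total_cost(y_true: Sequence[int], y_pred: Sequence[int],
               cost: Sequence[Sequence[float]] = COST) -> float:
    """Sum the cost-matrix entries over all (actual, predicted) pairs."""
    return sum(cost[a][p] for a, p in zip(y_true, y_pred))


# Hypothetical fraud probabilities, standing in for the output of an SSC model.
probs = [0.05, 0.2, 0.6, 0.01]
y_true = [0, 1, 1, 0]

plain_pred = [1 if p >= 0.5 else 0 for p in probs]      # cost-blind threshold
cs_pred = [min_expected_cost_label(p) for p in probs]   # cost-sensitive rule
```

With these assumed costs, a missed fraudster is ten times as expensive as a false alarm, so the wrapper flags a bidder as fraudulent once the estimated fraud probability exceeds 1/11. The second bidder (probability 0.2) is missed by the 0.5 threshold but caught by the cost-sensitive rule.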
Taxonomy
Methods: Circular Smooth Label
