A Customer Level Fraudulent Activity Detection Benchmark for Enhancing Machine Learning Model Research and Evaluation
Phoebe Jing, Yijing Gao, Xianlong Zeng

TL;DR
This paper introduces a privacy-compliant customer-level fraud detection benchmark dataset to improve machine learning research and evaluation in detecting sophisticated fraud schemes.
Contribution
It provides a structured, privacy-aware dataset for customer-level fraud detection, enabling more comprehensive model evaluation and development.
Findings
The benchmark facilitates evaluation of diverse machine learning models.
Customer-level features improve fraud detection accuracy.
The dataset supports privacy-preserving research in fraud detection.
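To make the distinction between transaction-level and customer-level data concrete, the following is a minimal sketch of how per-transaction records might be rolled up into one feature row per customer. The record layout (customer_id, amount, is_declined) and the chosen aggregates are assumptions made for illustration; they are not the benchmark's actual schema or the authors' feature set.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical transaction records: (customer_id, amount, is_declined).
# Field names and values are illustrative, not taken from the benchmark.
transactions = [
    ("c1", 20.0, False),
    ("c1", 35.0, False),
    ("c2", 900.0, True),
    ("c2", 850.0, False),
    ("c2", 40.0, True),
]


def customer_level_features(rows):
    """Aggregate transaction rows into one feature dict per customer."""
    grouped = defaultdict(list)
    for customer_id, amount, is_declined in rows:
        grouped[customer_id].append((amount, is_declined))

    features = {}
    for customer_id, items in grouped.items():
        amounts = [amount for amount, _ in items]
        declined = sum(1 for _, is_declined in items if is_declined)
        features[customer_id] = {
            "txn_count": len(items),
            "mean_amount": mean(amounts),
            "max_amount": max(amounts),
            "decline_rate": declined / len(items),
        }
    return features


features = customer_level_features(transactions)
```

Each resulting row summarizes a customer's behavior across all of their transactions, which is the kind of context a single transaction record cannot carry.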
Abstract
In the field of fraud detection, the availability of comprehensive and privacy-compliant datasets is crucial for advancing machine learning research and developing effective anti-fraud systems. Traditional datasets often focus on transaction-level information, which, while useful, overlooks the broader context of customer behavior patterns that are essential for detecting sophisticated fraud schemes. The scarcity of such data, primarily due to privacy concerns, significantly hampers the development and testing of predictive models that can operate effectively at the customer level. Addressing this gap, our study introduces a benchmark that contains structured datasets specifically designed for customer-level fraud detection. The benchmark not only adheres to strict privacy guidelines to ensure user confidentiality but also provides a rich source of information by encapsulating…
Taxonomy
Topics: Imbalanced Data Classification Techniques
