A Customer Level Fraudulent Activity Detection Benchmark for Enhancing Machine Learning Model Research and Evaluation

Phoebe Jing; Yijing Gao; Xianlong Zeng

arXiv:2404.14746·cs.LG·April 24, 2024

Open Access

TL;DR

This paper introduces a privacy-compliant customer-level fraud detection benchmark dataset to improve machine learning research and evaluation in detecting sophisticated fraud schemes.

Contribution

The paper provides a structured, privacy-aware dataset for customer-level fraud detection, enabling more comprehensive model evaluation and development.

Findings

01. Benchmark facilitates evaluation of diverse machine learning models.

02. Customer-level features improve fraud detection accuracy.

03. Dataset supports privacy-preserving research in fraud detection.
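
To make the notion of customer-level features concrete, here is a minimal sketch of rolling transaction records up into one feature row per customer. It is an illustration only: the field names (`customer_id`, `amount`, `merchant`), the sample records, and the chosen aggregates are hypothetical and are not taken from the benchmark's schema or the authors' pipeline.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical transaction records; the field names are illustrative
# and do not come from the benchmark's actual schema.
transactions = [
    {"customer_id": "c1", "amount": 20.0, "merchant": "m1"},
    {"customer_id": "c1", "amount": 35.0, "merchant": "m2"},
    {"customer_id": "c2", "amount": 900.0, "merchant": "m3"},
    {"customer_id": "c1", "amount": 5.0, "merchant": "m1"},
]


def customer_features(rows):
    """Aggregate transaction rows into one feature dict per customer."""
    # Group transactions by the customer they belong to.
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["customer_id"]].append(row)

    # Summarize each customer's transactions with simple aggregates.
    features = {}
    for cid, txns in grouped.items():
        amounts = [t["amount"] for t in txns]
        features[cid] = {
            "txn_count": len(txns),
            "total_amount": sum(amounts),
            "mean_amount": mean(amounts),
            "max_amount": max(amounts),
            "distinct_merchants": len({t["merchant"] for t in txns}),
        }
    return features


features = customer_features(transactions)
```

Each resulting row describes a customer's behavior as a whole (volume, spend, merchant diversity), which is the kind of context a single transaction record cannot carry.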

Abstract

In the field of fraud detection, the availability of comprehensive and privacy-compliant datasets is crucial for advancing machine learning research and developing effective anti-fraud systems. Traditional datasets often focus on transaction-level information, which, while useful, overlooks the broader context of customer behavior patterns that are essential for detecting sophisticated fraud schemes. The scarcity of such data, primarily due to privacy concerns, significantly hampers the development and testing of predictive models that can operate effectively at the customer level. Addressing this gap, our study introduces a benchmark that contains structured datasets specifically designed for customer-level fraud detection. The benchmark not only adheres to strict privacy guidelines to ensure user confidentiality but also provides a rich source of information by encapsulating…

Taxonomy

Topics: Imbalanced Data Classification Techniques

Methods: Focus