SCARFF: a Scalable Framework for Streaming Credit Card Fraud Detection with Spark

Fabrizio Carcillo; Andrea Dal Pozzolo; Yann-Aël Le Borgne; Olivier Caelen; Yannis Mazzer; Gianluca Bontempi

arXiv:1709.08920 · cs.DC · September 27, 2017

TL;DR

This paper introduces SCARFF, a scalable real-time framework combining Big Data tools and machine learning to detect credit card fraud efficiently and accurately in streaming data environments.

Contribution

The paper presents a novel scalable framework integrating Kafka, Spark, and Cassandra with machine learning for real-time fraud detection on massive data streams.

Findings

01

Framework is scalable and handles large transaction streams.

02

Achieves high accuracy in fraud detection.

03

Effective in dealing with data imbalance and nonstationarity.

Abstract

The expansion of electronic commerce, together with customers' increasing confidence in electronic payments, makes fraud detection a critical factor. Detecting fraud in a (nearly) real-time setting demands the design and implementation of scalable learning techniques able to ingest and analyse massive amounts of streaming data. Recent advances in analytics and the availability of open-source solutions for Big Data storage and processing open new perspectives for the fraud detection field. In this paper we present a SCAlable Real-time Fraud Finder (SCARFF), which integrates Big Data tools (Kafka, Spark and Cassandra) with a machine learning approach that deals with imbalance, nonstationarity and feedback latency. Experimental results on a massive dataset of real credit card transactions show that this framework is scalable, efficient and accurate over a big stream of…
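
The abstract names three ingredients without detailing them here: ingesting a transaction stream, keeping per-card state for feature computation, and handling class imbalance. As a rough illustration only, the plain-Python sketch below mimics those ideas with in-memory stand-ins. It is not the authors' implementation and uses no Kafka, Spark, or Cassandra API; the names `micro_batches`, `CardHistoryStore`, and `undersample`, the window size, and the choice of random undersampling are all assumptions made for this example.

```python
import random
from collections import defaultdict, deque


def micro_batches(stream, batch_size):
    """Group an iterable of transactions into fixed-size micro-batches
    (a stand-in for what a streaming engine does with an incoming topic)."""
    batch = []
    for tx in stream:
        batch.append(tx)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the last, possibly shorter, batch
        yield batch


class CardHistoryStore:
    """Hypothetical in-memory stand-in for a persistent store:
    remembers the most recent amounts seen for each card."""

    def __init__(self, window=3):
        self._history = defaultdict(lambda: deque(maxlen=window))

    def features(self, card_id, amount):
        """Return simple aggregate features, then record the new amount."""
        past = self._history[card_id]
        # With no history yet, fall back to the current amount.
        mean_recent = sum(past) / len(past) if past else amount
        past.append(amount)
        return {"amount": amount, "mean_recent_amount": mean_recent}


def undersample(labelled, rng, ratio=1.0):
    """Keep every fraud and a random subset of genuine transactions,
    so that genuine examples number at most ratio * number of frauds.
    Random undersampling is an assumed technique for this sketch."""
    frauds = [x for x in labelled if x["fraud"]]
    genuine = [x for x in labelled if not x["fraud"]]
    keep = min(len(genuine), int(ratio * len(frauds)))
    return frauds + rng.sample(genuine, keep)


if __name__ == "__main__":
    store = CardHistoryStore(window=3)
    stream = [("card-a", 10.0), ("card-b", 5.0), ("card-a", 30.0)]
    for batch in micro_batches(stream, batch_size=2):
        for card_id, amount in batch:
            print(card_id, store.features(card_id, amount))
```

In a real deployment each stand-in would be replaced by the corresponding distributed component, and the feature set and rebalancing strategy would follow the paper rather than this sketch.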
