BadLink: Combining Graph and Information-Theoretical Features for Online Fraud Group Detection
Yikun Ban, Xin Liu, Tianyi Zhang, Ling Huang, Yitao Duan, Xue Liu, Wei Xu

TL;DR
BadLink is a scalable fraud detection framework that combines graph and information-theoretical features to identify online fraud groups effectively, outperforming existing solutions even against camouflaged traffic.
Contribution
It introduces a novel combination of graph and information-theoretical features into a scalable, extensible framework for online fraud group detection.
Findings
Achieves state-of-the-art detection accuracy
Effective against sophisticated camouflage traffic
Supports multimodal datasets with diverse data types
Abstract
Fraud severely hurts many kinds of Internet businesses. Group-based fraud detection is a popular methodology for catching fraudsters, who unavoidably exhibit synchronized behaviors. We combine both graph-based features (e.g., cluster density) and information-theoretical features (e.g., the probability of the similarity) of fraud groups into two intuitive metrics. Based on these metrics, we build an extensible fraud detection framework, BadLink, to support multimodal datasets with different data types and distributions in a scalable way. Experiments on a real production workload, as well as extensive comparisons with existing solutions, demonstrate the state-of-the-art performance of BadLink, even with sophisticated camouflage traffic.
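As a rough illustration of the two feature families the abstract names, the sketch below computes one generic example of each: the edge density of a candidate group (a graph feature) and the self-information of a shared attribute value (an information-theoretical feature). These are textbook quantities chosen for illustration, not BadLink's two metrics, which the abstract does not define. The function names `edge_density` and `self_information` and the toy data are hypothetical.

```python
import math
from collections import Counter


def edge_density(nodes, edges):
    """Fraction of possible undirected edges that exist among `nodes`.

    Illustrative only: a generic density measure, not the paper's metric.
    """
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    # Keep each undirected edge once, ignoring self-loops and outside nodes.
    present = {
        frozenset(e) for e in edges
        if len(set(e)) == 2 and set(e) <= nodes
    }
    return len(present) / (n * (n - 1) / 2)


def self_information(value, observations):
    """Surprise, in bits, of `value` under its empirical frequency.

    Rare values score high, so a group sharing a rare value is more
    suspicious than one sharing a common value.
    """
    p = Counter(observations)[value] / len(observations)
    if p == 0:
        return math.inf  # value never observed
    return -math.log2(p)


# Toy data (assumed): three accounts that are all pairwise linked.
group = ["a", "b", "c"]
links = [("a", "b"), ("b", "c"), ("a", "c")]
density = edge_density(group, links)

# Toy data (assumed): one rare IP prefix among four observations.
prefixes = ["10.0", "192.168", "192.168", "192.168"]
surprise = self_information("10.0", prefixes)
```

A dense group whose members also share a high-surprise value is the kind of synchronized behavior that group-based detection looks for. How BadLink actually combines such signals is described in the paper itself, not in this summary.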
Taxonomy
Topics: Spam and Phishing Detection · Data Stream Mining Techniques · Imbalanced Data Classification Techniques
