BABD: A Bitcoin Address Behavior Dataset for Pattern Analysis
Yuexin Xiang, Yuchen Lei, Ding Bao, Wei Ren, Tiantian Li, Qingqing Yang, Wenmao Liu, Tianqing Zhu, and Kim-Kwang Raymond Choo

TL;DR
This paper introduces BABD-13, which the authors describe as the largest publicly available labeled Bitcoin address behavior dataset to their knowledge, and demonstrates its effectiveness for machine learning-based pattern analysis and address classification.
Contribution
The paper presents a comprehensive, labeled Bitcoin address dataset and a novel k-hop subgraph extraction method for analyzing address behavior patterns.
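To make the idea of k-hop subgraph extraction concrete, here is a minimal sketch using a plain breadth-first search. It is not the authors' method, whose details are not given in this summary. It assumes the transaction graph is supplied as a list of (source, destination) pairs and treats edges as undirected when expanding the neighborhood. The function name `k_hop_subgraph` and the toy node names are hypothetical.

```python
from collections import deque


def k_hop_subgraph(edges, start, k):
    """Return (nodes, sub_edges) within k hops of `start`.

    Assumption: edges are followed in both directions when expanding
    the neighborhood; the paper's own extraction rule may differ.
    """
    # Build an undirected adjacency map from the edge list.
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, set()).add(dst)
        adjacency.setdefault(dst, set()).add(src)

    # Breadth-first search that stops expanding at depth k.
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == k:
            continue
        for neighbor in adjacency.get(node, ()):
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)

    # Keep only the original edges whose endpoints were both reached.
    nodes = set(depth)
    sub_edges = [(s, d) for s, d in edges if s in nodes and d in nodes]
    return nodes, sub_edges


# Hypothetical toy graph: addresses and transactions alternate along a chain.
toy_edges = [
    ("addr_A", "tx_1"),
    ("tx_1", "addr_B"),
    ("addr_B", "tx_2"),
    ("tx_2", "addr_C"),
    ("addr_C", "tx_3"),
]
nodes, sub_edges = k_hop_subgraph(toy_edges, "addr_A", 2)
```

With `k = 2` the search stops at `addr_B`, so the extracted subgraph contains only the first two edges of the chain. Bounding the depth this way keeps the neighborhood around an address small enough to compute features on.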
Findings
Machine learning models achieve over 93% accuracy on address classification
The dataset contains 544,462 labeled data points across 13 address types
Proposed features and subgraph extraction enhance pattern analysis
Abstract
Cryptocurrencies are no longer just the preferred option for cybercriminal activities on darknets, due to the increasing adoption in mainstream applications. This is partly due to the transparency associated with the underpinning ledgers, where any individual can access the record of a transaction on the public ledger. In this paper, we build a dataset comprising Bitcoin transactions between 12 July 2019 and 26 May 2021. This dataset (hereafter referred to as BABD-13) contains 13 types of Bitcoin addresses, 5 categories of indicators with 148 features, and 544,462 labeled data points, which is the largest labeled Bitcoin address behavior dataset publicly available to our knowledge. We then use our proposed dataset on common machine learning models, namely: k-nearest neighbors algorithm, decision tree, random forest, multilayer perceptron, and XGBoost. The results show that the accuracy…
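As an illustration of the first model named in the abstract, the sketch below implements a bare-bones k-nearest neighbors classifier in pure Python. It is not the authors' experimental setup: the two-dimensional feature vectors, the labels `type_a` and `type_b`, and the helper names are all hypothetical stand-ins for the dataset's 148 features and 13 address types.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    # Sort training points by Euclidean distance to the query.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    top_labels = [label for _, label in distances[:k]]
    return Counter(top_labels).most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    return correct / len(test_y)


# Hypothetical toy data: two well-separated clusters, one per address type.
train_X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
train_y = ["type_a", "type_a", "type_a", "type_b", "type_b", "type_b"]
test_X = [(0.1, 0.1), (5.1, 5.0)]
test_y = ["type_a", "type_b"]

score = accuracy(train_X, train_y, test_X, test_y, k=3)
```

On real data with many features, the features would normally be scaled before computing distances, and a library implementation would be preferable to this brute-force search.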
Taxonomy
Topics: Imbalanced Data Classification Techniques · Cybercrime and Law Enforcement Studies · Blockchain Technology Applications and Security
