Generative Adversarial Networks for Bitcoin Data Augmentation

Francesco Zola; Jan Lukas Bruse; Xabier Etxeberria Barrio; Mikel Galar; Raul Orduna Urrutia

arXiv:2005.13369 · cs.LG · May 28, 2020

TL;DR

This paper explores using GANs to generate synthetic Bitcoin address data that counters class imbalance in entity classification, and shows that a well-chosen GAN configuration improves the quality of the data generated for underrepresented classes.

Contribution

First study to apply GANs for synthetic Bitcoin address data augmentation to improve classification of underrepresented entity classes.

Findings

01 GAN parameters significantly influence synthetic data quality

02 Proper configuration yields high similarity between real and generated data

03 Synthetic data improves classification performance for minority classes
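To make the minority-class augmentation idea concrete, the sketch below estimates how many synthetic samples each class would need in order to match the largest class. It is only an illustration of the balancing arithmetic, not the authors' procedure: the function name `synthetic_samples_needed` is hypothetical, the class names are the ones mentioned in the abstract, and the label counts are invented for the example rather than taken from the paper. A GAN (or any other generator) would then be asked to produce that many samples per class.

```python
from collections import Counter


def synthetic_samples_needed(labels, target=None):
    """Return, per class, how many synthetic samples would raise it to `target`.

    `target` defaults to the size of the largest class (full balancing).
    This helper is hypothetical and only illustrates the arithmetic.
    """
    counts = Counter(labels)
    if not counts:
        return {}
    if target is None:
        target = max(counts.values())
    # Classes already at or above the target need no synthetic samples.
    return {cls: max(0, target - n) for cls, n in counts.items()}


# Invented label counts for illustration only (NOT figures from the paper).
labels = (
    ["Exchange"] * 500
    + ["Gambling"] * 300
    + ["Mixer"] * 40
    + ["Service"] * 60
)

needed = synthetic_samples_needed(labels)
print(needed)
```

Passing an explicit `target` smaller than the majority count would give partial balancing, which may be preferable when generated samples are of uncertain quality.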

Abstract

In Bitcoin entity classification, results are strongly conditioned by the ground-truth dataset, especially when applying supervised machine learning approaches. However, these ground-truth datasets are frequently affected by significant class imbalance because they generally contain much more information about legal services (Exchange, Gambling) than about services that may be related to illicit activities (Mixer, Service). Class imbalance increases the complexity of applying machine learning techniques and reduces the quality of classification results, especially for underrepresented but critical classes. In this paper, we propose to address this problem by using Generative Adversarial Networks (GANs) for Bitcoin data augmentation, as GANs have recently shown promising results in the domain of image classification. However, there is no "one-fits-all" GAN solution that works for…
