Generating synthetic transactional profiles
Hadrien Lautraite, Patrick Mesana

TL;DR
This paper presents a machine learning approach to generate synthetic transactional profiles that maintain data utility for banking insights while enhancing privacy, addressing challenges like data sparsity.
Contribution
It introduces neural network models for creating synthetic transaction data that balance utility and privacy, with an analysis of privacy-preserving techniques.
Findings
Neural networks can generate valuable synthetic transactional data.
Synthetic data preserves key banking insights.
Privacy techniques impact model performance.
Abstract
Financial institutions use clients' payment transactions in numerous banking applications. Transactions are very personal and rich in behavioural patterns, often unique to individuals, which makes them equivalent to personally identifiable information in some cases. In this paper, we generate synthetic transactional profiles using machine learning techniques with the goal of preserving both data utility and privacy. One challenge we faced was dealing with sparse vectors, since a client uses only a few of the available spending categories. We measured data utility by calculating insights commonly used by the banking industry on both the original and the synthetic datasets. Our approach shows that neural network models can generate valuable synthetic data in such a context. Finally, we tried privacy-preserving techniques and observed their effect on the models' performance.
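To make the sparsity challenge and the utility measurement concrete, here is a minimal sketch. It is not the authors' implementation. The category list, the helper names (`build_profile`, `sparsity`, `category_shares`, `max_share_gap`), and the choice of "share of total spending per category" as the insight are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical spending categories; the paper's actual category set is not given here.
CATEGORIES = ["groceries", "restaurants", "travel", "fuel", "utilities", "health"]


def build_profile(transactions, categories=CATEGORIES):
    """Aggregate (category, amount) pairs into one fixed-length spending vector."""
    totals = defaultdict(float)
    for category, amount in transactions:
        totals[category] += amount
    return [totals[c] for c in categories]


def sparsity(profile):
    """Fraction of categories in which the client spent nothing."""
    return sum(1 for value in profile if value == 0) / len(profile)


def category_shares(profiles):
    """Share of total spending per category across all profiles (an assumed insight)."""
    sums = [sum(column) for column in zip(*profiles)]
    total = sum(sums)
    return [s / total for s in sums]


def max_share_gap(original, synthetic):
    """Largest absolute difference in category shares between two sets of profiles."""
    pairs = zip(category_shares(original), category_shares(synthetic))
    return max(abs(a - b) for a, b in pairs)


# One client who uses only two of the six categories.
client = [("groceries", 40.0), ("groceries", 60.0), ("fuel", 50.0)]
profile = build_profile(client)
print(profile, sparsity(profile))

# Toy "original" and "synthetic" profile sets, invented for this example.
original = [[100.0, 0.0, 0.0, 50.0, 0.0, 0.0], [0.0, 30.0, 0.0, 0.0, 20.0, 0.0]]
synthetic = [[90.0, 0.0, 0.0, 60.0, 0.0, 0.0], [0.0, 25.0, 0.0, 0.0, 25.0, 0.0]]
print(max_share_gap(original, synthetic))
```

A small gap suggests the synthetic profiles reproduce that one insight. A fuller evaluation would compare several insights and would also assess privacy separately.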
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Blockchain Technology Applications and Security · Imbalanced Data Classification Techniques
