A Wasserstein distance-based spectral clustering method for transaction data analysis
Yingqiu Zhu, Danyang Huang, Bo Zhang

TL;DR
This paper introduces a spectral clustering method based on Wasserstein distance for analyzing merchant transaction data. It compares merchants by their full transaction distributions rather than by low-dimensional summary features, which avoids the information loss those features incur, and it uses subsampling to scale to large datasets.
Contribution
It proposes a Wasserstein-distance-based spectral clustering approach with a subsampling method for large-scale transaction data analysis, improving over traditional feature-based methods.
Findings
Outperforms feature-based methods in identifying merchant behavior patterns
Theoretical properties confirm the efficiency of the approach
Effective on large-scale datasets with limited computational resources
Abstract
With the rapid development of online payment platforms, it is now possible to record massive transaction data. Clustering on transaction data significantly contributes to analyzing merchants' behavior patterns. This enables payment platforms to provide differentiated services or implement risk management strategies. However, traditional methods exploit transactions by generating low-dimensional features, leading to inevitable information loss. In this study, we use the empirical cumulative distribution of transactions to characterize merchants. We adopt Wasserstein distance to measure the dissimilarity between any two merchants and propose the Wasserstein-distance-based spectral clustering (WSC) approach. Based on the similarities between merchants' transaction distributions, a graph of merchants is generated. Thus, we treat the clustering of merchants as a graph-cut problem and solve…
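The pipeline the abstract describes (empirical distributions, pairwise Wasserstein distances, a similarity graph, a graph cut) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the order of the Wasserstein distance, the similarity function, or the graph-cut solver, so the 1-Wasserstein distance, the Gaussian kernel with a median bandwidth, the normalized Laplacian, and the small k-means routine are all choices made here for illustration. The function names (`wasserstein_1d`, `kmeans`, `wsc_sketch`) and the synthetic log-normal transaction amounts are hypothetical, and the paper's subsampling step is not covered.

```python
import numpy as np

def wasserstein_1d(a, b):
    """1-Wasserstein distance between two 1-D empirical distributions,
    computed as the area between their empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.sort(np.concatenate([a, b]))
    widths = np.diff(grid)
    # Empirical CDF values on each interval [grid[i], grid[i+1])
    cdf_a = np.searchsorted(a, grid[:-1], side="right") / a.size
    cdf_b = np.searchsorted(b, grid[:-1], side="right") / b.size
    return float(np.sum(np.abs(cdf_a - cdf_b) * widths))

def kmeans(X, k, n_iter=50):
    """Small k-means with deterministic farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def wsc_sketch(samples, k):
    """Cluster merchants given one array of transaction amounts per merchant."""
    n = len(samples)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = wasserstein_1d(samples[i], samples[j])
    # Assumption: Gaussian kernel with the median pairwise distance as bandwidth
    sigma = np.median(D[np.triu_indices(n, k=1)])
    W = np.exp(-D ** 2 / (2 * sigma ** 2))
    # Assumption: symmetric normalized Laplacian, L = I - D^{-1/2} W D^{-1/2}
    deg = W.sum(axis=1)
    L = np.eye(n) - W / np.sqrt(np.outer(deg, deg))
    _, vecs = np.linalg.eigh(L)            # eigenvalues in ascending order
    U = vecs[:, :k]                        # eigenvectors of the k smallest
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    return kmeans(U, k)

# Synthetic data (hypothetical): two groups of merchants whose transaction
# amounts follow different log-normal distributions, with unequal sample sizes.
rng = np.random.default_rng(0)
small = [rng.lognormal(2.0, 0.5, size=rng.integers(150, 300)) for _ in range(5)]
large = [rng.lognormal(5.0, 0.5, size=rng.integers(150, 300)) for _ in range(5)]
labels = wsc_sketch(small + large, k=2)
print(labels)
```

Because the distance is computed on the full empirical distributions, merchants with different numbers of transactions can be compared directly, without first reducing each merchant to a fixed set of summary features. The pairwise loop is quadratic in the number of merchants, which is the cost a subsampling scheme would aim to reduce.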
Taxonomy
Topics: Complex Network Analysis Techniques
