Normalisation of SWIFT Message Counterparties with Feature Extraction and Clustering
Thanasis Schoinas, Benjamin Guinard, Diba Esbati, Richard Chalk

TL;DR
This paper introduces a hybrid clustering approach combining string similarity, topic modeling, and hierarchical clustering to improve the grouping of SWIFT message counterparties, aiding fraud detection and investigation.
Contribution
It presents a novel hybrid pipeline tailored for clustering bank transaction counterparties, addressing the limitations of natural language models and manual fuzzy matching techniques.
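A minimal sketch of two of the stages named above, string similarity and hierarchical clustering, may help make the idea concrete. It is not the authors' pipeline: the topic modeling stage is omitted, `difflib.SequenceMatcher` stands in for whatever similarity measure the paper uses, and the `normalise` helper, the `0.3` threshold, and the example names are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher


def normalise(text):
    """Upper-case, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^A-Za-z0-9 ]+", " ", text.upper())
    return " ".join(text.split())


def distance(a, b):
    """String distance in [0, 1]; 0 means identical after normalisation."""
    return 1.0 - SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def cluster(names, threshold=0.3):
    """Average-linkage agglomerative clustering over string distances.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than `threshold` (a hypothetical cut-off).
    """
    dist = [[distance(a, b) for b in names] for a in names]
    clusters = [[i] for i in range(len(names))]

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, p, q = min(
            (linkage(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        if d > threshold:
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [[names[i] for i in c] for c in clusters]


# Invented counterparty strings with typical manual-entry variation.
EXAMPLE_NAMES = [
    "ACME TRADING LTD",
    "Acme Trading Limited",
    "ACME TRADNG LTD.",
    "Globex Shipping SA",
    "GLOBEX SHIPPING S.A.",
]

if __name__ == "__main__":
    for group in cluster(EXAMPLE_NAMES):
        print(group)
```

Stopping at a distance threshold rather than a fixed cluster count suits this setting, since the number of distinct counterparties is not known in advance.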
Findings
Significantly outperforms baseline keyword methods
Reduces manual review workload
Enhances detection of entity variations in sanctions investigations
Abstract
Short text clustering is a known use case in the text analytics community. When the structure and content fall in the natural language domain, e.g. Twitter posts or instant messages, natural language techniques can be used, provided the texts are long enough for (pre)trained models to extract meaningful information such as part-of-speech or topic annotations. However, natural language models are not suitable for clustering transaction counterparties as found in bank payment messaging systems such as SWIFT. The manually typed tags are typically physical or legal entity details, which lack sentence structure while containing all the variations and noise that manual entry introduces. This leaves a gap in an investigator or counter-fraud professional's toolset when looking to augment their knowledge of payment flow originator and beneficiary entities…
Taxonomy
Topics: IPv6, Mobility, Handover, Networks, Security · Network Packet Processing and Optimization
