Normalisation of SWIFT Message Counterparties with Feature Extraction and Clustering

Thanasis Schoinas; Benjamin Guinard; Diba Esbati; Richard Chalk

arXiv:2508.21081 · cs.LG · September 1, 2025

Open Access

TL;DR

This paper introduces a hybrid clustering approach combining string similarity, topic modeling, and hierarchical clustering to improve the grouping of SWIFT message counterparties, aiding fraud detection and investigation.
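To make the combination of string similarity and hierarchical clustering concrete, here is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the topic modelling stage is omitted entirely, `difflib.SequenceMatcher` stands in for whatever similarity measure the paper uses, and the `normalise`, `distance`, and `cluster` helpers, the 0.3 threshold, and the counterparty names are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalise(name: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def distance(a: str, b: str) -> float:
    """1 - similarity ratio; 0.0 means identical after normalisation."""
    return 1.0 - SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def cluster(names, threshold=0.3):
    """Average-linkage agglomerative clustering, cut at `threshold`.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    clusters = [[n] for n in names]
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest average distance.
        for i, j in combinations(range(len(clusters)), 2):
            d = sum(distance(a, b) for a in clusters[i] for b in clusters[j])
            d /= len(clusters[i]) * len(clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break  # remaining clusters are too dissimilar to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Fictitious counterparty strings with typical manual-entry variation.
names = [
    "ACME Trading Ltd",
    "ACME TRADING LIMITED",
    "Acme Trading Ltd.",
    "Globex Corporation",
    "GLOBEX CORP",
]
groups = cluster(names)
for group in groups:
    print(group)
```

With these inputs the three ACME variants end up in one group and the two Globex variants in another. The pairwise loop is quadratic per merge, so this is only suitable for small samples; a real workload would need a library implementation and some form of blocking before comparison.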

Contribution

It presents a novel hybrid pipeline tailored for clustering bank transaction counterparties, addressing the limitations of natural language models and manual fuzzy matching techniques.

Findings

01

Significantly outperforms baseline keyword methods

02

Reduces manual review workload

03

Enhances detection of entity variations in sanctions investigations

Abstract

Short text clustering is a known use case in the text analytics community. When the structure and content fall in the natural language domain, e.g. Twitter posts or instant messages, natural language techniques can be used, provided the texts are long enough for (pre)trained models to extract meaningful information, such as part-of-speech or topic annotations. However, natural language models are not suitable for clustering transaction counterparties as found in bank payment messaging systems such as SWIFT. The manually typed tags are typically physical or legal entity details, which lack sentence structure while containing all the variations and noise that manual entry introduces. This leaves a gap in an investigator's or counter-fraud professional's toolset when looking to augment their knowledge of payment flow originator and beneficiary entities…

Taxonomy

Topics: IPv6, Mobility, Handover, Networks, Security · Network Packet Processing and Optimization