Enhancing Phishing Detection in Financial Systems through NLP

Novruz Amirov; Leminur Celik; Egemen Ali Caner; Emre Yurdakul; Fahri Anil Yerlikaya; Serif Bahtiyar

arXiv:2507.04426·cs.CR·July 8, 2025

TL;DR

This paper proposes an NLP-based method that combines TFIDF keyword analysis with semantic similarity to detect phishing emails in financial systems. It reports up to 79.8% accuracy and is positioned as an alternative to traditional blacklists and whitelists, which have inherent limitations.

Contribution

It presents an NLP approach that integrates TFIDF keyword extraction with semantic similarity analysis to detect phishing emails in financial cybersecurity.

Findings

1. Detection accuracy reaches 79.8% with TFIDF analysis.

2. Semantic analysis achieves 67.2% accuracy.

3. The method outperforms traditional blacklist/whitelist approaches.

Abstract

The threat of phishing attacks in financial systems is continuously growing. Therefore, protecting sensitive information from unauthorized access is paramount. This paper discusses the critical need for robust email phishing detection. Several existing methods, including blacklists and whitelists, play a crucial role in detecting phishing attempts. Nevertheless, these methods possess inherent limitations, emphasizing the need for the development of a more advanced solution. Our proposed solution presents a pioneering Natural Language Processing (NLP) approach for phishing email detection. Leveraging semantic similarity and TFIDF (Term Frequency-Inverse Document Frequency) analysis, our solution identifies keywords in phishing emails, subsequently evaluating the semantic similarities with a dedicated phishing dataset, ultimately contributing to the enhancement of cybersecurity and NLP…
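
The pipeline described in the abstract (extract keywords with TFIDF, then compare an email against a dedicated phishing dataset) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The tokenizer, the smoothed IDF formula, the two-line `PHISHING_REFERENCE` corpus, and the function names are all assumptions made for illustration. The abstract does not say how semantic similarity is computed, so cosine similarity over TFIDF vectors is swapped in here as a simple lexical stand-in; an embedding-based measure would be closer to a true semantic comparison.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using a smoothed IDF."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def top_keywords(vector, k=3):
    """Return the k highest-weighted terms, breaking ties alphabetically."""
    ranked = sorted(vector.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical stand-in for the "dedicated phishing dataset" mentioned in the abstract.
PHISHING_REFERENCE = [
    "verify your account password immediately or it will be suspended",
    "urgent confirm your bank login details to avoid suspension",
]


def phishing_score(email, reference=PHISHING_REFERENCE):
    """Highest similarity between the email and any reference phishing text."""
    vectors = tfidf_vectors(list(reference) + [email])
    email_vec = vectors[-1]
    return max(cosine(email_vec, ref_vec) for ref_vec in vectors[:-1])


if __name__ == "__main__":
    suspicious = "please verify your bank account password immediately"
    benign = "lunch meeting moved to friday afternoon"
    print("suspicious:", round(phishing_score(suspicious), 3))
    print("benign:", round(phishing_score(benign), 3))
```

In a sketch like this, an email would be flagged when `phishing_score` exceeds a threshold tuned on labeled data. No threshold is given in the text above, so none is assumed here.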
