PhishSnap: Image-Based Phishing Detection Using Perceptual Hashing
Md Abdul Ahad Minhaz, Zannatul Zahan Meem, Md. Shohrab Hossain

TL;DR
PhishSnap is a privacy-preserving browser extension that detects phishing websites by comparing perceptual hashes of webpage screenshots against legitimate templates, reaching 0.79 accuracy on a new 10,000-URL dataset.
Contribution
This work introduces PhishSnap, a novel on-device phishing detection system using perceptual hashing to identify visually similar malicious pages without compromising user privacy.
Findings
Achieved 0.79 accuracy, 0.76 precision, and 0.78 recall in phishing detection
Demonstrated effectiveness of visual similarity for anti-phishing
Built a new dataset of 10,000 URLs for evaluation
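For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy, precision, and recall are conventionally derived from confusion-matrix counts. The function name and the example counts are hypothetical and chosen only for illustration; they are not the paper's data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, recall) from confusion-matrix counts.

    tp: phishing pages flagged as phishing
    fp: legitimate pages flagged as phishing
    tn: legitimate pages passed as legitimate
    fn: phishing pages passed as legitimate
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts, for illustration only (not taken from the paper).
accuracy, precision, recall = classification_metrics(tp=8, fp=2, tn=7, fn=3)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Precision reflects how often a phishing warning is correct, while recall reflects how many phishing pages are caught, so the two are usually read together with accuracy.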
Abstract
Phishing remains one of the most prevalent online threats, exploiting human trust to harvest sensitive credentials. Existing URL- and HTML-based detection systems struggle against obfuscation and visual deception. This paper presents PhishSnap, a privacy-preserving, on-device phishing detection system leveraging perceptual hashing (pHash). Implemented as a browser extension, PhishSnap captures webpage screenshots, computes visual hashes, and compares them against legitimate templates to identify visually similar phishing attempts. A 2024 dataset of 10,000 URLs (70%/20%/10% train/validation/test) was collected from PhishTank and Netcraft. Due to security takedowns, a subset of phishing pages was unavailable, reducing dataset diversity. The system achieved 0.79 accuracy, 0.76 precision, and 0.78 recall, showing that visual similarity remains…
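The hash-and-compare step described in the abstract can be sketched as follows. This is an illustrative, pure-Python version of a common DCT-based pHash variant, not the authors' implementation: it assumes the screenshot has already been captured, converted to grayscale, and resized to a small square grid (32x32 here). The function names, the 8x8 low-frequency block, and the distance threshold of 10 bits are all assumptions made for this sketch.

```python
import math
import statistics


def dct_1d(values):
    """Unnormalised DCT-II of a 1D sequence."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def dct_2d(matrix):
    """2D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in matrix]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def phash(gray, hash_size=8):
    """Perceptual hash of a square grayscale grid (e.g. 32x32), as an int.

    Keeps the top-left hash_size x hash_size block of DCT coefficients
    (the lowest frequencies) and sets one bit per coefficient depending
    on whether it is above the block's median.
    """
    coeffs = dct_2d(gray)
    low = [coeffs[r][c] for r in range(hash_size) for c in range(hash_size)]
    median = statistics.median(low)
    bits = 0
    for value in low:
        bits = (bits << 1) | int(value > median)
    return bits


def hamming_distance(hash_a, hash_b):
    """Number of differing bits between two hashes."""
    return bin(hash_a ^ hash_b).count("1")


def looks_like_template(page_hash, template_hashes, max_distance=10):
    """True if the page hash is within max_distance bits of any template.

    max_distance is a hypothetical threshold; a real system would tune it
    on validation data.
    """
    return any(hamming_distance(page_hash, t) <= max_distance for t in template_hashes)
```

In a detector of this kind, a page that visually matches a known legitimate template while being served from an unrelated domain would be the suspicious case; the domain check itself is outside the scope of this sketch.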
Taxonomy
Topics: Spam and Phishing Detection · Advanced Malware Detection Techniques · User Authentication and Security Systems
