PhreshPhish: A Real-World, High-Quality, Large-Scale Phishing Website Dataset and Benchmark

Thomas Dalton; Hemanth Gowda; Girish Rao; Sachin Pargi; Alireza Hadj Khodabakhshi; Joseph Rombs; Stephan Jou; Manish Marwah

arXiv:2507.10854·cs.CR·February 13, 2026

TL;DR

This paper introduces PhreshPhish, a large, high-quality dataset and benchmark suite for phishing website detection, addressing limitations of existing datasets to enable more realistic and standardized evaluation of detection models.

Contribution

The paper presents a new large-scale, high-quality phishing dataset and a comprehensive benchmark suite designed for realistic model evaluation, reducing data leakage and increasing task difficulty.

Findings

01

Baseline models evaluated on the new benchmarks.

02

Significant improvements over existing datasets in data quality.

03

Benchmark results highlight challenges in phishing detection.

Abstract

Phishing remains a pervasive and growing threat, inflicting heavy economic and reputational damage. While machine learning has been effective in real-time detection of phishing attacks, progress is hindered by a lack of large, high-quality datasets and benchmarks. In addition to poor quality due to challenges in data collection, existing datasets suffer from leakage and unrealistic base rates, leading to overly optimistic performance results. In this paper, we introduce PhreshPhish, a large-scale, high-quality dataset of phishing websites that addresses these limitations. Compared to existing public datasets, PhreshPhish is substantially larger and provides significantly higher quality, as measured by the estimated rate of invalid or mislabeled data points. Additionally, we propose a comprehensive suite of benchmark datasets specifically designed for realistic model evaluation by…
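The abstract's point about unrealistic base rates can be illustrated with a short calculation. The sketch below is not from the paper: the function name and the true-positive rate, false-positive rate, and base rates are hypothetical values chosen only to show how the precision of a fixed detector falls as phishing pages become rarer in the evaluation set.

```python
def precision_at_base_rate(tpr: float, fpr: float, base_rate: float) -> float:
    """Precision (positive predictive value) from Bayes' rule.

    tpr: fraction of phishing pages flagged (hypothetical value)
    fpr: fraction of benign pages flagged (hypothetical value)
    base_rate: fraction of all pages that are phishing
    """
    true_pos = tpr * base_rate
    false_pos = fpr * (1.0 - base_rate)
    flagged = true_pos + false_pos
    # If nothing is flagged, precision is undefined; report 0.0 by convention.
    return true_pos / flagged if flagged > 0 else 0.0


if __name__ == "__main__":
    # Same detector, three illustrative base rates (assumed, not from the paper).
    for base_rate in (0.5, 0.05, 0.001):
        p = precision_at_base_rate(tpr=0.95, fpr=0.01, base_rate=base_rate)
        print(f"base rate {base_rate:>6}: precision {p:.3f}")
```

With these assumed rates, a balanced test set gives a precision near 0.99, while a base rate of one phishing page per thousand gives a precision below 0.1. This is why results on a balanced dataset can look far better than performance under realistic conditions.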

Taxonomy

Topics: Spam and Phishing Detection · Misinformation and Its Impacts · Blood donation and transfusion practices