To Ensemble or Not: Assessing Majority Voting Strategies for Phishing Detection with Large Language Models
Fouad Trad, Ali Chehab

TL;DR
This paper evaluates three ensemble voting strategies for improving phishing URL detection with large language models, and shows that their effectiveness depends on how consistently the individual models perform.
Contribution
It introduces and compares prompt-based, model-based, and hybrid ensemble strategies for LLMs in phishing detection, providing insights into their relative performance.
Findings
Ensemble strategies work best when individual models perform similarly.
When performance varies significantly, ensembles may not outperform the best single model.
Choosing the right ensemble depends on the consistency of individual model performance.
Abstract
The effectiveness of Large Language Models (LLMs) significantly relies on the quality of the prompts they receive. However, even when processing identical prompts, LLMs can yield varying outcomes due to differences in their training processes. To leverage the collective intelligence of multiple LLMs and enhance their performance, this study investigates three majority voting strategies for text classification, focusing on phishing URL detection. The strategies are: (1) a prompt-based ensemble, which utilizes majority voting across the responses generated by a single LLM to various prompts; (2) a model-based ensemble, which entails aggregating responses from multiple LLMs to a single prompt; and (3) a hybrid ensemble, which combines the two methods by sending different prompts to multiple LLMs and then aggregating their responses. Our analysis shows that ensemble strategies are most…
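The three voting strategies named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the label strings, and the stand-in classifiers are all hypothetical, and each "model" is simply a callable that takes a prompt and a URL and returns a label. The hybrid variant assumes every model is queried with every prompt; the abstract does not specify how prompts are paired with models.

```python
from collections import Counter
from typing import Callable, Sequence

# Hypothetical interface: a classifier takes (prompt, url) and returns a label.
# In practice this would wrap a call to an LLM; here it is any Python callable.
Classifier = Callable[[str, str], str]


def majority_vote(labels: Sequence[str]) -> str:
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def prompt_ensemble(model: Classifier, prompts: Sequence[str], url: str) -> str:
    """One model, several prompts: vote over that model's responses."""
    return majority_vote([model(p, url) for p in prompts])


def model_ensemble(models: Sequence[Classifier], prompt: str, url: str) -> str:
    """Several models, one prompt: vote over the models' responses."""
    return majority_vote([m(prompt, url) for m in models])


def hybrid_ensemble(
    models: Sequence[Classifier], prompts: Sequence[str], url: str
) -> str:
    """Several models, several prompts.

    Assumption: every model sees every prompt, and all responses are
    pooled into a single vote.
    """
    return majority_vote([m(p, url) for m in models for p in prompts])


# Stand-in classifiers for illustration only; they ignore the prompt.
def always_phishing(prompt: str, url: str) -> str:
    return "phishing"


def always_legitimate(prompt: str, url: str) -> str:
    return "legitimate"


def keyword_model(prompt: str, url: str) -> str:
    return "phishing" if "login" in url else "legitimate"


if __name__ == "__main__":
    models = [always_phishing, always_legitimate, keyword_model]
    prompts = ["zero-shot prompt", "role-playing prompt", "chain-of-thought prompt"]
    url = "http://example.test/login"
    print(prompt_ensemble(keyword_model, prompts, url))
    print(model_ensemble(models, prompts[0], url))
    print(hybrid_ensemble(models, prompts, url))
```

Using an odd number of voters avoids ties in a binary task. The sketch also makes the paper's finding easy to see: a simple majority gives each voter equal weight, so one strong model can be outvoted by weaker ones when individual performance differs widely.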
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Spam and Phishing Detection · Misinformation and Its Impacts
Methods: Umbrella Reinforcement Learning
