Sound-skwatter (Did You Mean: Sound-squatter?) AI-powered Generator for Phishing Prevention
Rodolfo Valentim, Idilio Drago, Marco Mellia, Federico Cerutti

TL;DR
Sound-skwatter is an AI-powered multi-language system that proactively generates sound-squatting candidates, including cross-language variants, to enhance phishing defense by identifying malicious domains and package names before they are exploited.
Contribution
It introduces a novel multi-modal Transformer-based approach for automatic sound-squatting candidate generation across multiple languages, including cross-language scenarios.
Findings
Approximately 10% of generated domains exist in the wild, mostly unknown to existing solutions.
Around 17% of popular PyPI packages have at least one sound-squatting candidate.
The system effectively covers known homophones and high-quality sound-squatting candidates across languages.
Abstract
Sound-squatting is a phishing attack that tricks users into accessing malicious resources by exploiting similarities in the pronunciation of words. Proactive defense against sound-squatting candidates is complex, and existing solutions rely on manually curated lists of homophones. We introduce Sound-skwatter, a multi-language AI-based system that generates sound-squatting candidates for proactive defense. Sound-skwatter relies on an innovative multi-modal combination of Transformer networks and acoustic models to learn sound similarities. We show that Sound-skwatter can automatically list known homophones and thousands of high-quality candidates. In addition, it covers cross-language sound-squatting, i.e., when the reader and the listener speak different languages, supporting any combination of languages. We apply Sound-skwatter to network-centric phishing via squatted domain names. We find…
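For contrast with the learned approach, the manually curated baseline mentioned in the abstract can be sketched as below. This is only an illustration, not the paper's method: the HOMOPHONES table, the candidates function, and the hyphen-separated naming convention are all assumptions made up for this example.

```python
from itertools import product

# Hypothetical, hand-written homophone table (illustrative only, not from the paper).
HOMOPHONES = {
    "right": ["write", "rite"],
    "mail": ["male"],
    "for": ["four", "4"],
    "to": ["two", "too", "2"],
}


def candidates(name: str, sep: str = "-") -> list[str]:
    """Return variants of `name` where tokens are swapped for listed homophones."""
    tokens = name.lower().split(sep)
    # Each token may stay as-is or be replaced by any of its listed homophones.
    options = [[tok] + HOMOPHONES.get(tok, []) for tok in tokens]
    variants = {sep.join(combo) for combo in product(*options)}
    # The unmodified name is not a squatting candidate.
    variants.discard(name.lower())
    return sorted(variants)


if __name__ == "__main__":
    for candidate in candidates("mail-for-you"):
        print(candidate)
```

A table like this only covers the words someone thought to list, and only within one language, which is the limitation that motivates generating candidates automatically.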
Taxonomy
Topics: Spam and Phishing Detection · Topic Modeling · Hate Speech and Cyberbullying Detection
