TypoSwype: An Imaging Approach to Detect Typo-Squatting

Joon Sern Lee; Yam Gui Peng David

arXiv:2209.00783·cs.CR·September 5, 2022

TL;DR

TypoSwype converts domain names into images that encode keyboard layout and uses CNNs to detect typo-squatting domains, outperforming traditional string comparison algorithms such as the Damerau-Levenshtein Distance (DLD).

Contribution

The paper presents a novel image-based approach with CNNs for typo-squatting detection that incorporates keyboard layout information, improving accuracy over existing string comparison methods.
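
To illustrate what "existing string comparison methods" cannot see, here is a minimal sketch of the optimal string alignment distance, a commonly used restricted form of the Damerau-Levenshtein Distance. Which DLD variant the paper benchmarks against is not stated in this summary, so the function name `osa_distance` and the choice of variant are assumptions made for illustration.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein).

    Counts insertions, deletions, substitutions and adjacent transpositions,
    each with cost 1. Illustrative only; not taken from the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "le" <-> "el"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


# 'p' sits next to 'o' on a QWERTY keyboard, 'q' is on the far side,
# yet both substitutions receive the same edit distance.
near = osa_distance("google", "gopgle")
far = osa_distance("google", "goqgle")
```

Both `near` and `far` come out as a single edit, so a pure edit-distance score cannot rank the more plausible slip of the finger above the less plausible one.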

Findings

01. Outperforms DLD in typo-squatting detection accuracy

02. Uses CNNs trained with Triplet Loss or NT-Xent Loss for better similarity mapping

03. Maintains domain classification accuracy while improving detection
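
The two training objectives named in the findings can be sketched in a few lines of plain Python. This follows the standard textbook definitions of Triplet Loss and NT-Xent Loss; the margin, temperature, distance function and function names below are assumptions, since the paper's actual hyperparameters are not given in this summary.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=1.0):
    """max(0, d(a, p) - d(a, n) + margin); margin value is an assumption."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def nt_xent_pair(embeddings, i, j, temperature=0.5):
    """NT-Xent loss for the positive pair (i, j) within a batch of embeddings.

    Every other embedding in the batch acts as a negative for anchor i.
    The temperature value is an assumption.
    """
    logits = [cosine(embeddings[i], e) / temperature for e in embeddings]
    denom = sum(math.exp(l) for k, l in enumerate(logits) if k != i)
    return -math.log(math.exp(logits[j]) / denom)


# Toy embeddings: index 1 is aligned with index 0, index 3 points the opposite way.
batch = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
loss_close = nt_xent_pair(batch, 0, 1)
loss_far = nt_xent_pair(batch, 0, 3)
```

Both losses shrink as the embedding of a typo moves closer to the embedding of its target domain than to unrelated domains, which is the "similarity mapping" the finding refers to.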

Abstract

Typo-squatting is a common cyber-attack technique. It involves using domain names that exploit likely typographical errors in commonly visited domains to carry out malicious activities such as phishing and malware installation. Current approaches typically revolve around string comparison algorithms such as the Damerau-Levenshtein Distance (DLD) algorithm. Such techniques do not take into account keyboard distance, which researchers have found to correlate strongly with typical typographical errors and are now trying to incorporate. In this paper, we present the TypoSwype framework, which converts strings to images that innately take keyboard location into account. We also show how modern state-of-the-art image recognition techniques involving Convolutional Neural Networks, trained via either Triplet Loss or NT-Xent Loss, can be applied to learn a mapping to a lower…
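
The idea of turning a string into an image that carries keyboard location can be sketched as follows. This is not the paper's rendering procedure, which the abstract does not detail: the key coordinates, the simplified row stagger, the image size, the straight-line "swipe" path, and the names `KEY_POS`, `swipe_image` and `jaccard` are all hypothetical. The pixel-overlap score at the end stands in for the learned CNN embedding purely for demonstration.

```python
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
ROW_OFFSET = [0.0, 0.5, 1.0]  # assumed, simplified horizontal stagger per row

# Hypothetical key centres in "key units": x = column + stagger, y = row index.
KEY_POS = {
    ch: (col + ROW_OFFSET[r], float(r))
    for r, row in enumerate(QWERTY_ROWS)
    for col, ch in enumerate(row)
}


def swipe_image(word, scale=4):
    """Rasterise the path traced between consecutive keys as a binary image."""
    width, height = 9 * scale + 1, 2 * scale + 1
    img = [[0] * width for _ in range(height)]
    # Characters without a key position (dots, digits, hyphens) are skipped here.
    pts = [KEY_POS[ch] for ch in word.lower() if ch in KEY_POS]
    for x, y in pts:
        img[round(y * scale)][round(x * scale)] = 1
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        steps = max(1, int(max(abs(x1 - x0), abs(y1 - y0)) * scale))
        for s in range(steps + 1):
            t = s / steps
            px = round((x0 + t * (x1 - x0)) * scale)
            py = round((y0 + t * (y1 - y0)) * scale)
            img[py][px] = 1
    return img


def jaccard(img_a, img_b):
    """Pixel-overlap similarity; a crude stand-in for a learned embedding."""
    inter = union = 0
    for row_a, row_b in zip(img_a, img_b):
        for a, b in zip(row_a, row_b):
            inter += a & b
            union += a | b
    return inter / union if union else 1.0


target = swipe_image("google")
near_typo = jaccard(target, swipe_image("gopgle"))  # 'p' is adjacent to 'o'
far_typo = jaccard(target, swipe_image("goqgle"))   # 'q' is far from 'o'
```

In this toy setup the adjacent-key typo produces an image that overlaps the target more than the distant-key typo does, even though the two strings are the same edit distance from "google". A trained CNN would replace the overlap score with a distance in a learned embedding space.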

Taxonomy

Methods: Triplet Loss · Normalized Temperature-scaled Cross Entropy Loss