Detecting Homoglyph Attacks with a Siamese Neural Network

Jonathan Woodbridge; Hyrum S. Anderson; Anjum Ahuja; Daniel Grant

arXiv:1805.09738 · cs.CR · May 25, 2018

TL;DR

This paper introduces a Siamese CNN that detects homoglyph attacks by learning the visual similarity of rendered strings, improving both detection accuracy and speed over traditional string-comparison methods.

Contribution

The study presents an image-based deep learning method for homoglyph detection that outperforms existing string-comparison algorithms, and it makes its datasets and code publicly available.

Findings

01. 13% to 45% improvement in ROC AUC over baselines
02. Fast similarity search using KD-Trees
03. Effective detection of visually similar homoglyphs
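
The KD-Tree finding above can be illustrated with a minimal sketch. It assumes each known name has already been mapped to a fixed-length feature vector. The 2-D vectors below are invented for illustration and are not outputs of the paper's network, and `build_kdtree` and `nearest` are hypothetical helper names, not the authors' code.

```python
import math

def build_kdtree(points, depth=0):
    """Build a KD-tree as nested dicts; points are (vector, label) pairs."""
    if not points:
        return None
    axis = depth % len(points[0][0])          # cycle through dimensions
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2                    # median becomes the split node
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, query, best=None):
    """Return (distance, label) of the stored vector closest to query."""
    if node is None:
        return best
    vector, label = node["point"]
    dist = math.dist(vector, query)           # Euclidean distance
    if best is None or dist < best[0]:
        best = (dist, label)
    diff = query[node["axis"]] - vector[node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best

# Hypothetical feature vectors for known-good names (illustrative values only).
known = [
    ((0.10, 0.90), "svchost.exe"),
    ((0.80, 0.20), "explorer.exe"),
    ((0.45, 0.55), "lsass.exe"),
    ((0.95, 0.85), "winlogon.exe"),
]
tree = build_kdtree(known)

# A suspicious name whose (assumed) vector lands close to svchost.exe.
dist, label = nearest(tree, (0.12, 0.88))
print(label, round(dist, 3))
```

The pruning test in `nearest` is what lets a lookup skip whole subtrees, so a query usually avoids comparing against every stored name.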

Abstract

A homoglyph (name spoofing) attack is a common technique used by adversaries to obfuscate file and domain names. This technique creates process or domain names that are visually similar to legitimate and recognized names. For instance, an attacker may create malware with the name svch0st.exe so that, in a visual inspection of running processes or a directory listing, the process or file name might be mistaken for the Windows system process svchost.exe. There has been limited published research on detecting homoglyph attacks. Current approaches rely on string comparison algorithms (such as Levenshtein distance) that result in computationally heavy solutions with a high number of false positives. In addition, few publicly available datasets exist for reproducible research, and most of them focus on phishing attacks, in which homoglyphs are not always used.…
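
As a minimal sketch of the string-comparison baseline the abstract mentions, the following computes Levenshtein (edit) distance for the svch0st.exe example. The function name `levenshtein` and the second test string `svchast.exe` are illustrative choices, not taken from the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))            # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                            # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,                  # delete ca
                curr[j - 1] + 1,              # insert cb
                prev[j - 1] + cost,           # substitute (or match)
            ))
        prev = curr
    return prev[-1]

# The spoofed name from the abstract differs by one substitution (o -> 0).
spoof_distance = levenshtein("svch0st.exe", "svchost.exe")

# A substitution that does not look alike (o -> a) gets the same score.
other_distance = levenshtein("svchast.exe", "svchost.exe")

print(spoof_distance, other_distance)
```

Both pairs score the same because edit distance counts character changes without any notion of how similar two characters look, so a distance threshold alone cannot separate the two cases.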
