TSM: Measuring the Enticement of Honeyfiles with Natural Language Processing
Roelien C. Timmer, David Liebowitz, Surya Nepal, Salil Kanhere

TL;DR
This paper introduces TSM, a novel NLP-based metric for measuring honeyfile enticement by comparing topical content and semantic similarity, aiding cyber deception strategies.
Contribution
The paper presents TSM, the first NLP-based metric for quantifying honeyfile enticement through topic modeling and semantic matching, validated with a new honeyfile corpus.
Findings
TSM effectively compares honeyfiles across different corpora.
TSM is robust to paraphrasing and captures topical similarity.
Experiments demonstrate TSM's potential in cyber deception.
Abstract
Honeyfile deployment is a useful breach detection method in cyber deception that can also inform defenders about the intent and interests of intruders and malicious insiders. A key property of a honeyfile, enticement, is the extent to which the file can attract an intruder to interact with it. We introduce a novel metric, Topic Semantic Matching (TSM), which uses topic modelling to represent files in the repository and semantic matching in an embedding vector space to compare honeyfile text and topic words robustly. We also present a honeyfile corpus created with different Natural Language Processing (NLP) methods. Experiments show that TSM is effective in inter-corpus comparisons and is a promising tool to measure the enticement of honeyfiles. TSM is the first measure to use NLP techniques to quantify the enticement of honeyfile content that compares the essential topical content of…
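The matching idea in the abstract can be illustrated with a small sketch. This is not the paper's formula: the scoring rule (average, over topic words, of the best cosine match among the honeyfile's words), the function name `topic_match_score`, and the three-dimensional toy embeddings are all assumptions made for illustration. A real setup would take topic words from a topic model fitted on the repository and vectors from a pretrained embedding model.

```python
import math

# Toy 3-dimensional word embeddings, invented for illustration only.
# A real setup would load pretrained word vectors instead.
EMBEDDINGS = {
    "invoice": (0.9, 0.1, 0.0),
    "payment": (0.8, 0.2, 0.1),
    "budget": (0.7, 0.3, 0.0),
    "server": (0.1, 0.9, 0.1),
    "password": (0.0, 0.8, 0.3),
    "holiday": (0.1, 0.1, 0.9),
    "beach": (0.0, 0.2, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def topic_match_score(topic_words, file_words, embeddings=EMBEDDINGS):
    """Hypothetical score: mean over topic words of the best match in the file.

    Words without an embedding are skipped; returns 0.0 if nothing is left.
    """
    topic_vecs = [embeddings[w] for w in topic_words if w in embeddings]
    file_vecs = [embeddings[w] for w in file_words if w in embeddings]
    if not topic_vecs or not file_vecs:
        return 0.0
    best_matches = [max(cosine(t, f) for f in file_vecs) for t in topic_vecs]
    return sum(best_matches) / len(best_matches)


# Topic words standing in for the output of a topic model on the repository.
repository_topic = ["invoice", "budget", "server"]

# Two candidate honeyfiles, reduced to a few content words each.
finance_honeyfile = ["payment", "invoice", "password"]
offtopic_honeyfile = ["holiday", "beach"]

on_topic = topic_match_score(repository_topic, finance_honeyfile)
off_topic = topic_match_score(repository_topic, offtopic_honeyfile)
print(f"on-topic score:  {on_topic:.3f}")
print(f"off-topic score: {off_topic:.3f}")
```

Because matching happens in the embedding space rather than on exact strings, a honeyfile that says "payment" where the topic says "budget" still scores highly in this toy setup, which is the kind of robustness to paraphrasing listed in the findings.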
Taxonomy
Topics: Cybercrime and Law Enforcement Studies · Deception detection and forensic psychology · Network Security and Intrusion Detection
