TL;DR
This paper introduces a semantic analysis framework that detects compromised social media accounts by identifying linguistic inconsistencies using language models, demonstrating effectiveness on Twitter data.
Contribution
The paper presents a novel semantic feature extraction method, based on the divergence between language models, that yields interpretable and effective features for identifying compromised accounts.
Findings
KL-divergence-based feature performs best
Framework achieves high detection accuracy
Semantic features effectively distinguish normal and compromised accounts
Abstract
Compromised accounts on social networks are regular user accounts that have been taken over by an entity with malicious intent. Since the adversary exploits the already established trust of a compromised account, it is crucial to detect these accounts to limit the damage they can cause. We propose a novel general framework for semantic analysis of text messages coming out from an account to detect compromised accounts. Our framework is built on the observation that normal users will use language that is measurably different from the language that an adversary would use when the account is compromised. We propose to use the difference of language models of users and adversaries to define novel interpretable semantic features for measuring semantic incoherence in a message stream. We study the effectiveness of the proposed semantic features using a Twitter data set. Evaluation results…
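To make the idea of comparing language models concrete, here is a minimal sketch of a KL-divergence-based incoherence score. It is an illustration under stated assumptions, not the paper's implementation: the unigram models, the additive smoothing constant `alpha`, the whitespace tokenization, and the function names `unigram_lm`, `kl_divergence`, and `incoherence_score` are all hypothetical choices made for this example, and the toy messages are invented.

```python
import math
from collections import Counter


def unigram_lm(tokens, vocab, alpha=1.0):
    """Additive-smoothed unigram distribution over a fixed vocabulary.

    Smoothing (alpha > 0) keeps every probability strictly positive,
    so the KL divergence below is always finite.
    """
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same keys."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def incoherence_score(history_tokens, segment_tokens, alpha=1.0):
    """Divergence of a message segment's model from the account's history model.

    Assumption for this sketch: a larger value means the segment's language
    is less consistent with what the account has posted before.
    """
    vocab = set(history_tokens) | set(segment_tokens)
    p_segment = unigram_lm(segment_tokens, vocab, alpha)
    p_history = unigram_lm(history_tokens, vocab, alpha)
    return kl_divergence(p_segment, p_history)


# Toy data, invented for illustration only.
history = "coffee with friends then gym then coffee again".split()
own_style = "gym then coffee with friends".split()
spam_like = "click here free prize win free prize".split()

print("own-style segment:", incoherence_score(history, own_style))
print("spam-like segment:", incoherence_score(history, spam_like))
```

In this toy setting the spam-like segment scores higher than the segment written in the account's usual vocabulary. A real system would need far more text per model and a considered choice of smoothing, and the source does not specify which model order or segmentation the authors use.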
