In a World That Counts: Clustering and Detecting Fake Social Engagement at Scale

Yixuan Li; Oscar Martinez; Xing Chen; Yi Li; John Hopcroft

arXiv:1512.05457 · cs.SI · January 21, 2016 · 39 cites

TL;DR

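This paper introduces Leas (Local Expansion at Scale), a scalable semi-supervised clustering method that detects fake social engagement on YouTube by analyzing the temporal graph of user-video engagement. It reports 98% manual review accuracy on the YouTube comments graph and runs 10 times faster than CopyCatch.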

Contribution

Leas is a graph diffusion approach, built on local spectral clustering and deployed on MapReduce, that expands a set of known spammer seeds to find users with similar engagement behavior at large scale.

Findings

01. Achieved 98% manual review accuracy on the YouTube comments graph

02. Leas runs 10 times faster than CopyCatch

03. Successfully deployed at Google for real-time fake engagement detection

Abstract

How can web services that depend on user-generated content discern fake social engagement activities by spammers from legitimate ones? In this paper, we focus on the social site of YouTube and the problem of identifying bad actors posting inorganic content and inflating the count of social engagement metrics. We propose an effective method, Leas (Local Expansion at Scale), and show how the fake engagement activities on YouTube can be tracked over time by analyzing the temporal graph based on the engagement behavior pattern between users and YouTube videos. With the domain knowledge of spammer seeds, we formulate and tackle the problem in a semi-supervised manner, with the objective of searching for individuals that have a similar pattern of behavior to the known seeds, based on a graph diffusion process via local spectral subspace. We offer a fast, scalable MapReduce deployment…
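
To make the idea of expanding from known spammer seeds concrete, here is a minimal Python sketch on a toy engagement log. It is not the Leas algorithm: the paper diffuses via a local spectral subspace and runs on MapReduce, whereas this sketch substitutes a plain random walk with restart on a user-user co-engagement graph and ignores the temporal dimension. The function names, the toy data, and the `steps` and `restart` parameters are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def build_user_graph(engagements):
    """Project (user, video) pairs onto a weighted user-user graph.

    Two users are linked with weight equal to the number of videos
    they both engaged with (an assumed similarity measure).
    """
    users_by_video = defaultdict(set)
    for user, video in engagements:
        users_by_video[video].add(user)
    weights = defaultdict(lambda: defaultdict(float))
    for users in users_by_video.values():
        for u in users:
            weights[u]  # make sure users with no co-engagers still appear
            for v in users:
                if u != v:
                    weights[u][v] += 1.0
    return weights


def seeded_diffusion(weights, seeds, steps=3, restart=0.3):
    """Spread probability mass from the seeds for a few random-walk steps."""
    start = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in weights}
    score = dict(start)
    for _ in range(steps):
        nxt = {u: restart * start[u] for u in weights}
        for u, mass in score.items():
            degree = sum(weights[u].values())
            if degree == 0:
                nxt[u] += (1 - restart) * mass  # isolated user keeps its mass
                continue
            for v, w in weights[u].items():
                nxt[v] += (1 - restart) * mass * w / degree
        score = nxt
    return score


def rank_candidates(score, seeds, k):
    """Return the k highest-scoring users that are not already seeds."""
    others = [u for u in score if u not in seeds]
    return sorted(others, key=lambda u: (-score[u], u))[:k]


# Toy log: s1, s2, s3 hit the same three videos; a, b, c behave organically.
engagements = [
    ("s1", "v1"), ("s1", "v2"), ("s1", "v3"),
    ("s2", "v1"), ("s2", "v2"), ("s2", "v3"),
    ("s3", "v1"), ("s3", "v2"), ("s3", "v3"),
    ("a", "v1"), ("a", "v4"),
    ("b", "v4"), ("b", "v5"),
    ("c", "v5"),
]
seeds = {"s1"}
graph = build_user_graph(engagements)
scores = seeded_diffusion(graph, seeds)
top = rank_candidates(scores, seeds, k=2)
print(top)
```

In this toy log the two accounts that mirror the seed's behavior rank above the organic users, which is the qualitative effect a seed-based diffusion is meant to produce. A production system would replace the in-memory dictionaries with a distributed computation and would need a principled cutoff for where the expanded cluster ends.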

Taxonomy

Topics: Spam and Phishing Detection · Network Security and Intrusion Detection · Complex Network Analysis Techniques