M3A: Model, MetaModel, and Anomaly Detection in Web Searches
Da-Cheng Juan, Neil Shah, Mingyu Tang, Zhiliang Qian, Diana Marculescu, Christos Faloutsos

TL;DR
This paper introduces M3A, a novel framework combining models at user and group levels to accurately analyze web search behaviors, identify patterns, and detect anomalies using a large-scale query log dataset.
Contribution
The paper presents Camel-Log, a user-level model of the inter-arrival time distribution of landed queries, and Meta-Click, a group-level metamodel of the correlations among Camel-Log parameters, and uses both to detect anomalous search behavior.
Findings
Discovery of a bi-modal pattern in the inter-arrival time (IAT) of landed queries (queries with user click-through).
Correlation among model parameters at the group level.
Effective anomaly detection in large-scale query logs.
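
As an illustration of the quantity behind the first finding, the following is a minimal sketch of how per-user inter-arrival times could be computed from a query log and binned on a logarithmic scale, where two well-separated groups of gaps would show up as two clusters of bins. It is not the authors' Camel-Log model or their code. The input layout (pairs of user id and timestamp in seconds), the function names, and the `bins_per_decade` parameter are all assumptions made for this example.

```python
import math
from collections import defaultdict


def inter_arrival_times(events):
    """Group (user_id, timestamp_seconds) pairs by user and return the
    positive gaps between consecutive timestamps for each user.

    The (user_id, timestamp) layout is a hypothetical input format.
    """
    by_user = defaultdict(list)
    for user, ts in events:
        by_user[user].append(ts)

    iats = {}
    for user, stamps in by_user.items():
        stamps.sort()
        # Skip zero-length gaps so that log10 is defined later on.
        iats[user] = [b - a for a, b in zip(stamps, stamps[1:]) if b > a]
    return iats


def log_histogram(gaps, bins_per_decade=2):
    """Count gaps in logarithmically spaced bins.

    Each key is floor(log10(gap) * bins_per_decade); the bin width is
    an arbitrary choice for this sketch.
    """
    counts = defaultdict(int)
    for g in gaps:
        counts[math.floor(math.log10(g) * bins_per_decade)] += 1
    return dict(sorted(counts.items()))


# Synthetic example: one-minute gaps within a session, one gap of a day.
events = [("alice", t) for t in (0, 60, 120, 180, 86580, 86640)]
gaps = inter_arrival_times(events)["alice"]
histogram = log_histogram(gaps)
```

In this synthetic example the short gaps and the one-day gap fall into bins far apart on the log scale. A real analysis would need far more data per user and a fitted distribution rather than a raw histogram.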
Abstract
'Alice' is submitting one web search per five minutes, for three hours in a row - is it normal? How to detect abnormal search behaviors, among Alice and other users? Is there any distinct pattern in Alice's (or other users') search behavior? We studied what is probably the largest, publicly available, query log that contains more than 30 million queries from 0.6 million users. In this paper, we present a novel, user- and group-level framework, M3A: Model, MetaModel and Anomaly detection. For each user, we discover and explain a surprising, bi-modal pattern of the inter-arrival time (IAT) of landed queries (queries with user click-through). Specifically, the model Camel-Log is proposed to describe such an IAT distribution; we then notice the correlations among its parameters at the group level. Thus, we further propose the metamodel Meta-Click, to capture and explain the two-dimensional,…
Taxonomy
Topics: Complex Network Analysis Techniques · Data Stream Mining Techniques · Data Management and Algorithms
