Anomaly Detection in Hierarchical Data Streams under Unknown Models
Sattar Vakili, Qing Zhao, Chang Liu, Chen-Nee Chuah

TL;DR
This paper introduces an active inference method for detecting a few targets among a large number of hierarchical data streams with unknown, potentially heavy-tailed distributions, minimizing sample complexity under a reliability constraint.
Contribution
It proposes an active inference strategy that induces a biased random walk on the tree-structured hierarchy, guided by confidence bounds of sample statistics, and establishes its order optimality in sample complexity under unknown distribution models.
Findings
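Strategy is order optimal in both the size of the search space (the number of data streams) and the reliability requirement
Applicable to hierarchical heavy hitter detection, noisy group testing, and adaptive sampling for active learning, classification, and stochastic root finding
Minimizes sample complexity subject to a reliability constraint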
Abstract
We consider the problem of detecting a few targets among a large number of hierarchical data streams. The data streams are modeled as random processes with unknown and potentially heavy-tailed distributions. The objective is an active inference strategy that determines, sequentially, which data stream to collect samples from in order to minimize the sample complexity under a reliability constraint. We propose an active inference strategy that induces a biased random walk on the tree-structured hierarchy based on confidence bounds of sample statistics. We then establish its order optimality in terms of both the size of the search space (i.e., the number of data streams) and the reliability requirement. The results find applications in hierarchical heavy hitter detection, noisy group testing, and adaptive sampling for active learning, classification, and stochastic root finding.
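The abstract's idea of a biased random walk driven by confidence bounds can be illustrated with a minimal Python sketch. This is not the authors' algorithm; it is a simplified illustration under assumptions that do not come from the paper: a binary tree over the leaves, a single target, unit-variance Gaussian noise (rather than heavy-tailed noise), node observations whose mean is +mu when the subtree contains the target and -mu otherwise, and a sub-Gaussian confidence radius with a union bound over sample counts. The names make_sampler, looks_anomalous, find_target, p_local, and delta are hypothetical.

```python
import math
import random


def make_sampler(target, mu=1.0, seed=0):
    """Hypothetical stream model: node (lo, hi) aggregates leaves lo..hi-1."""
    rng = random.Random(seed)

    def sample(lo, hi):
        # Assumption: mean is +mu if the subtree holds the target, else -mu.
        mean = mu if lo <= target < hi else -mu
        return rng.gauss(mean, 1.0)

    return sample


def looks_anomalous(sample, lo, hi, err, max_samples=10_000):
    """Sample node (lo, hi) until 0 leaves the confidence interval."""
    total = 0.0
    mean = 0.0
    for n in range(1, max_samples + 1):
        total += sample(lo, hi)
        mean = total / n
        # Radius valid for all n simultaneously with probability >= 1 - err
        # (unit-variance sub-Gaussian noise, union bound over n).
        radius = math.sqrt(2.0 * math.log(2.0 * n * (n + 1) / err) / n)
        if mean - radius > 0:
            return True
        if mean + radius < 0:
            return False
    return mean > 0  # fallback if the sample budget is exhausted


def find_target(sample, n_leaves, p_local=0.2, delta=0.01):
    """Walk the tree: descend toward a child that tests anomalous, else back up."""
    path = [(0, n_leaves)]  # nodes from the root to the current position
    while True:
        lo, hi = path[-1]
        if hi - lo == 1:
            # Leaf: declare only after a stricter test at error level delta.
            if looks_anomalous(sample, lo, hi, delta):
                return lo
            if len(path) > 1:
                path.pop()
            continue
        mid = (lo + hi) // 2
        # Cheap local tests at error level p_local bias the walk to the target.
        if looks_anomalous(sample, lo, mid, p_local):
            path.append((lo, mid))
        elif looks_anomalous(sample, mid, hi, p_local):
            path.append((mid, hi))
        elif len(path) > 1:
            path.pop()  # neither child looks anomalous: step back to the parent


if __name__ == "__main__":
    print(find_target(make_sampler(target=37), n_leaves=64))
```

In this sketch the local tests inside the tree are loose (error level p_local), so each step is cheap and a wrong move can be undone by stepping back to the parent. Only the final leaf test uses the strict level delta, which is one plausible way to keep the dependence on the number of streams separate from the dependence on the reliability requirement.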
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Data Stream Mining Techniques · Network Security and Intrusion Detection
