
TL;DR
This paper introduces an improved algorithm for Compressed Counting that significantly reduces estimation variance near a = 1, enhancing its practicality for Shannon entropy monitoring in network security.
Contribution
The paper proposes a new algorithm that improves Compressed Counting's accuracy, especially when estimating Shannon entropy, and achieves statistical optimality at a = 0.5.
Findings
Reduces estimation variance roughly 100-fold at a = 0.99.
Makes Compressed Counting more practical for entropy estimation.
Achieves statistical optimality at a = 0.5.
Abstract
Compressed Counting (CC) [22] was recently proposed for estimating the a-th frequency moments of data streams, where 0 < a <= 2. CC can be used for estimating Shannon entropy, which can be approximated by certain functions of the a-th frequency moments as a -> 1. Monitoring Shannon entropy for anomaly detection (e.g., DDoS attacks) in large networks is an important task. This paper presents a new algorithm for improving CC. The improvement is most substantial when a -> 1- (that is, as a approaches 1 from below). For example, when a = 0.99, the new algorithm reduces the estimation variance by roughly 100-fold. This new algorithm would make CC considerably more practical for estimating Shannon entropy. Furthermore, the new algorithm is statistically optimal when a = 0.5.
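The abstract does not say which "functions of the a-th frequency moments" approximate Shannon entropy. One standard choice is the Tsallis entropy, T_a = (1 - F_a / F_1^a) / (a - 1), which converges to the Shannon entropy as a -> 1. The sketch below is only an illustration of that limit: it computes the frequency moments exactly from a small made-up stream rather than with CC or the paper's new algorithm, and the function names and sample data are assumptions, not taken from the paper.

```python
import math
from collections import Counter


def frequency_moment(counts, a):
    """Exact a-th frequency moment F_a = sum_i f_i^a (no sketching)."""
    return sum(c ** a for c in counts if c > 0)


def shannon_entropy(counts):
    """Shannon entropy (in nats) of the empirical distribution f_i / F_1."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def tsallis_entropy(counts, a):
    """Tsallis entropy T_a = (1 - F_a / F_1^a) / (a - 1), defined for a != 1."""
    f1 = sum(counts)
    fa = frequency_moment(counts, a)
    return (1.0 - fa / f1 ** a) / (a - 1.0)


# Hypothetical stream: source addresses seen in a traffic window.
stream = (["10.0.0.1"] * 50 + ["10.0.0.2"] * 30
          + ["10.0.0.3"] * 15 + ["10.0.0.4"] * 5)
counts = list(Counter(stream).values())

h = shannon_entropy(counts)
for a in (0.5, 0.9, 0.99, 0.999):
    t = tsallis_entropy(counts, a)
    print(f"a = {a}: Tsallis = {t:.4f}, Shannon = {h:.4f}, gap = {abs(t - h):.4f}")
```

The gap shrinks as a moves toward 1, but the formula divides by a - 1, so any error in an estimate of F_a is amplified by a factor of 1 / |a - 1|. This is why estimation variance close to a = 1 matters so much for entropy monitoring.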
Taxonomy
Topics: Network Security and Intrusion Detection · Data Stream Mining Techniques · Anomaly Detection Techniques and Applications
