On the performance overhead tradeoff of distributed principal component analysis via data partitioning
Ni An, Steven Weber

TL;DR
This paper evaluates the tradeoff between communication overhead and accuracy in distributed PCA algorithms applied to network anomaly detection, demonstrating significant bandwidth savings with minimal loss in detection performance.
Contribution
The paper provides an empirical comparison of two distributed PCA algorithms on a real DNS query dataset from a large network, weighing communication cost against solution quality and anomaly detection accuracy.
Findings
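Both distributed PCA algorithms use far less communication bandwidth than centralized PCA, which must gather all distributed state at a fusion center.
Anomaly detection accuracy with distributed PCA shows minimal loss.
The measured tradeoff between communication cost and solution quality can guide practical deployment in large networks.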
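The bandwidth argument can be illustrated with simple arithmetic. The sketch below is not the paper's cost model. It assumes observations are partitioned by rows across nodes, that centralized PCA ships every raw row to the fusion center, and that a distributed scheme ships only `k` local principal components plus their eigenvalues per node. The node sizes, `dim`, and `k` are hypothetical values chosen for illustration.

```python
def centralized_cost(rows_per_node, dim):
    """Scalars sent when every node ships its raw (n_i x dim) data matrix."""
    return sum(n * dim for n in rows_per_node)


def distributed_cost(num_nodes, dim, k):
    """Scalars sent when every node ships k components (k * dim) plus k eigenvalues."""
    return num_nodes * (k * dim + k)


# Hypothetical deployment: three nodes, 50 features, 5 retained components.
rows_per_node = [10_000, 8_000, 12_000]
dim = 50
k = 5

central = centralized_cost(rows_per_node, dim)
distributed = distributed_cost(len(rows_per_node), dim, k)
savings = 1 - distributed / central  # fraction of scalars not transmitted

print(central, distributed, round(savings, 4))
```

Under these assumptions the distributed cost does not depend on the number of observations per node, which is why the gap widens as the data grows. The accuracy side of the tradeoff depends on how well `k` local components capture each node's variance.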
Abstract
Principal component analysis (PCA) is not only a fundamental dimension reduction method, but is also a widely used network anomaly detection technique. Traditionally, PCA is performed in a centralized manner, which has poor scalability for large distributed systems, on account of the large network bandwidth cost required to gather the distributed state at a fusion center. Consequently, several recent works have proposed various distributed PCA algorithms aiming to reduce the communication overhead incurred by PCA without losing its inferential power. This paper evaluates the tradeoff between communication cost and solution quality of two distributed PCA algorithms on a real domain name system (DNS) query dataset from a large network. We also apply the distributed PCA algorithm in the area of network anomaly detection and demonstrate that the detection accuracy of both distributed…
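As a rough illustration of the idea of distributed PCA via data partitioning, the sketch below compares a centralized leading component with one fused from per-node summaries. It is not either of the two algorithms the paper evaluates. It assumes row-partitioned data, retains a single component per node, ignores differences between node means, and uses synthetic data. All function names are hypothetical.

```python
import math
import random


def covariance(rows):
    """Sample covariance (d x d) of a list of d-dimensional rows, mean-centered."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for r in rows:
        c = [r[j] - mean[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                cov[a][b] += c[a] * c[b] / (n - 1)
    return cov


def top_eigenpair(mat, iters=200):
    """Leading eigenvalue and eigenvector of a symmetric PSD matrix (power iteration)."""
    d = len(mat)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(mat[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[a] * sum(mat[a][b] * v[b] for b in range(d)) for a in range(d))
    return lam, v


def make_node_data(rng, n):
    """Synthetic rows whose variance lies mostly along the (1, 1, 0) direction."""
    rows = []
    for _ in range(n):
        t = rng.gauss(0.0, 5.0)
        rows.append([t + rng.gauss(0, 0.1), t + rng.gauss(0, 0.1), rng.gauss(0, 0.1)])
    return rows


rng = random.Random(0)
nodes = [make_node_data(rng, 200) for _ in range(3)]
d = 3

# Centralized baseline: gather every row, then run PCA once.
all_rows = [r for node in nodes for r in node]
_, v_central = top_eigenpair(covariance(all_rows))

# Distributed sketch: each node sends only (eigenvalue, eigenvector); the
# fusion center sums the rank-1 covariance approximations, weighted by node size.
fused = [[0.0] * d for _ in range(d)]
for node in nodes:
    lam, v = top_eigenpair(covariance(node))
    for a in range(d):
        for b in range(d):
            fused[a][b] += len(node) * lam * v[a] * v[b]
_, v_distributed = top_eigenpair(fused)

# Absolute cosine between the two leading directions (1.0 means identical subspaces).
cosine = abs(sum(a * b for a, b in zip(v_central, v_distributed)))
print(cosine)
```

On data like this, where all nodes share one dominant direction, the fused component should be nearly identical to the centralized one. When node distributions differ, keeping only one component per node loses more information, which is the solution-quality side of the tradeoff the abstract describes.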
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection · Fault Detection and Control Systems
