Amplifying Inter-message Distance: On Information Divergence Measures in Big Data
Rui She, Shanyun Liu, and Pingyi Fan (Senior Member, IEEE)

TL;DR
This paper introduces a parametric message identification divergence measure that amplifies differences between distributions, with an efficient estimation algorithm, enhancing clustering, classification, and outlier detection in big data.
Contribution
It defines a new parametric divergence measure, analyzes its properties, and develops an improved estimation algorithm for better performance in big data analysis.
Findings
Enhanced divergence measure amplifies differences between adjacent distributions.
Proposed estimator improves convergence rate of mean squared error.
Demonstrated effectiveness in clustering, classification, and outlier detection.
Abstract
Message identification (M-I) divergence is an important measure of the information distance between probability distributions, similar to Kullback-Leibler (K-L) and Rényi divergence. In fact, the variable parameter of M-I divergence affects how the distinction between two distributions is characterized. Furthermore, by choosing an appropriate parameter of M-I divergence, it is possible to amplify the information distance between adjacent distributions while maintaining a sufficient gap between nonadjacent ones. Therefore, M-I divergence can play a vital role in distinguishing distributions more clearly. In this paper, we first define a parametric M-I divergence from the viewpoint of information theory and then present its major properties. In addition, we design an M-I divergence estimation algorithm by means of the ensemble estimator of the proposed weight kernel estimators, which can…
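The abstract does not give the formula for M-I divergence, so the sketch below does not implement it. As a stand-in, it uses the two measures the abstract compares against, K-L divergence and Rényi divergence of order alpha, to illustrate how a tunable parameter can change the measured distance between the same pair of distributions. The function names, the example distributions `p` and `q`, and the assumption that every entry of `q` is strictly positive are illustrative choices, not taken from the paper.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P||Q) in nats for discrete distributions.

    Assumes q[i] > 0 wherever p[i] > 0 (illustrative assumption).
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def renyi_divergence(p, q, alpha):
    """Renyi divergence of order alpha (alpha > 0, alpha != 1) in nats.

    D_alpha(P||Q) = log(sum_i p_i^alpha * q_i^(1 - alpha)) / (alpha - 1)
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    total = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log(total) / (alpha - 1)


# Two nearby ("adjacent") distributions, chosen only for illustration.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

for alpha in (0.5, 2.0, 5.0):
    print(f"Renyi(alpha={alpha}): {renyi_divergence(p, q, alpha):.5f}")
print(f"K-L: {kl_divergence(p, q):.5f}")
```

For this pair, the Rényi value grows with alpha and approaches the K-L value as alpha tends to 1, which is the general kind of parameter-controlled amplification the abstract describes; how the M-I parameter behaves specifically is defined in the paper itself.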
Taxonomy
Topics: Advanced Statistical Methods and Models · Statistical Mechanics and Entropy · Anomaly Detection Techniques and Applications
