Kullback-Leibler distance as a measure of the information filtered from multivariate data
Michele Tumminello, Fabrizio Lillo, Rosario Nunzio Mantegna

TL;DR
This paper evaluates the Kullback-Leibler distance as a tool for measuring the information content and stability of correlation matrices derived from multivariate Gaussian data, comparing filtering methods through simulations and empirical data.
Contribution
It introduces a method using the Kullback-Leibler distance to quantify information and stability in correlation matrix filtering procedures, with analytical and empirical validation.
Findings
Spectral analysis techniques are more informative about the correlation matrix than hierarchical clustering methods.
Hierarchical clustering methods are more stable under statistical uncertainty.
The proposed method effectively compares filtering procedures in terms of information and stability.
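The comparison summarized above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' procedure. The one-factor model, the loading of 0.6, the sizes `n` and `T`, and the eigenvalue-clipping filter `clip_eigenvalues` are all hypothetical choices made for illustration. Here the distance from a sample matrix to its filtered version is taken as a proxy for the information the filter keeps, and the distance between filtered matrices from two independent samples as a proxy for stability.

```python
import numpy as np

def kl_distance(sigma_1, sigma_2):
    """KL divergence between zero-mean Gaussians with covariances sigma_1 and sigma_2."""
    n = sigma_1.shape[0]
    _, logdet_1 = np.linalg.slogdet(sigma_1)
    _, logdet_2 = np.linalg.slogdet(sigma_2)
    trace_term = np.trace(np.linalg.solve(sigma_2, sigma_1))
    return 0.5 * (logdet_2 - logdet_1 + trace_term - n)

def clip_eigenvalues(corr, k):
    """Hypothetical spectral filter: keep the k largest eigenvalues,
    replace the remaining ones by their mean, then rescale to unit diagonal."""
    vals, vecs = np.linalg.eigh(corr)      # eigenvalues in ascending order
    vals = vals.copy()
    vals[:-k] = vals[:-k].mean()           # flatten the bulk of the spectrum
    filtered = (vecs * vals) @ vecs.T      # rebuild the matrix from its spectrum
    d = np.sqrt(np.diag(filtered))
    return filtered / np.outer(d, d)

def sample_correlation(rng, n, T, loading):
    """Sample correlation matrix of T observations from an assumed one-factor model."""
    factor = rng.standard_normal(T)
    noise = rng.standard_normal((T, n))
    x = loading * factor[:, None] + np.sqrt(1.0 - loading**2) * noise
    return np.corrcoef(x, rowvar=False)

rng = np.random.default_rng(0)
n, T = 10, 200                              # assumed sizes, with T > n
c_a = sample_correlation(rng, n, T, 0.6)    # two independent realizations
c_b = sample_correlation(rng, n, T, 0.6)
f_a, f_b = clip_eigenvalues(c_a, 1), clip_eigenvalues(c_b, 1)

info_proxy = kl_distance(c_a, f_a)          # smaller: filter keeps more of the sample matrix
stability_proxy = kl_distance(f_a, f_b)     # smaller: filter is more stable across samples
unfiltered_proxy = kl_distance(c_a, c_b)    # baseline without filtering
print(info_proxy, stability_proxy, unfiltered_proxy)
```

A hierarchical clustering filter could be substituted for `clip_eigenvalues` and scored with the same two quantities, which is the kind of side-by-side comparison the findings describe.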
Abstract
We show that the Kullback-Leibler distance is a good measure of the statistical uncertainty of correlation matrices estimated by using a finite set of data. For correlation matrices of multivariate Gaussian variables we analytically determine the expected values of the Kullback-Leibler distance of a sample correlation matrix from a reference model and we show that the expected values are known also when the specific model is unknown. We propose to make use of the Kullback-Leibler distance to estimate the information extracted from a correlation matrix by correlation filtering procedures. We also show how to use this distance to measure the stability of filtering procedures with respect to statistical uncertainty. We explain the effectiveness of our method by comparing four filtering procedures, two of them being based on spectral analysis and the other two on hierarchical clustering. We…
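For zero-mean multivariate Gaussian variables, the Kullback-Leibler distance between two distributions depends only on their covariance (here, correlation) matrices, through the standard closed form K(Σ1, Σ2) = ½ [ log(|Σ2| / |Σ1|) + tr(Σ2⁻¹ Σ1) − n ]. The sketch below computes it with NumPy. The function name `kl_distance`, the example model matrix, and the series length `T` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def kl_distance(sigma_1, sigma_2):
    """KL divergence K(P1 || P2) between zero-mean Gaussians
    with positive definite covariance matrices sigma_1 and sigma_2."""
    sigma_1 = np.asarray(sigma_1, dtype=float)
    sigma_2 = np.asarray(sigma_2, dtype=float)
    n = sigma_1.shape[0]
    sign_1, logdet_1 = np.linalg.slogdet(sigma_1)
    sign_2, logdet_2 = np.linalg.slogdet(sigma_2)
    if sign_1 <= 0 or sign_2 <= 0:
        raise ValueError("both matrices must be positive definite")
    # tr(sigma_2^{-1} sigma_1), computed without forming the inverse explicitly
    trace_term = np.trace(np.linalg.solve(sigma_2, sigma_1))
    return 0.5 * (logdet_2 - logdet_1 + trace_term - n)

# Illustrative use: distance of a sample correlation matrix from an assumed model.
rng = np.random.default_rng(42)
n, T = 5, 500                                        # assumed dimensions, T > n
model = np.full((n, n), 0.3) + 0.7 * np.eye(n)       # assumed model: uniform correlation 0.3
data = rng.multivariate_normal(np.zeros(n), model, size=T)
sample = np.corrcoef(data, rowvar=False)
print(kl_distance(model, sample))
```

The distance is zero when the two matrices coincide and is not symmetric in its arguments, so the order of `sigma_1` and `sigma_2` matters when comparing a sample matrix with a model or a filtered matrix.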
Taxonomy
Topics: Data Visualization and Analytics
