Measuring and Discovering Correlations in Large Data Sets
Lijue Liu, Ming Li, Sha Wen

TL;DR
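This paper introduces ART, a class of nonparametric statistics for efficiently measuring and discovering both linear and nonlinear correlations between two variables in large datasets. It addresses two limitations of Reshef's model: the lack of an exact polynomial-time algorithm and the inability to identify the "local random" phenomenon.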
Contribution
The paper presents ART (the alternant recursive topology statistics), a class of statistics that evaluates a wide range of bi-variable correlations efficiently and equitably, without prior knowledge of the specific types of those relationships.
Findings
Applied ART to a dataset of 10 American classical indexes
Discovered numerous bi-variable correlations
Demonstrated ART's efficiency and effectiveness
Abstract
In this paper, a class of statistics named ART (the alternant recursive topology statistics) is proposed to measure the properties of the correlation between two variables. ART can evaluate a wide range of bi-variable correlations, both linear and nonlinear, efficiently and equitably, even if nothing is known about the specific types of those relationships. ART compensates for two disadvantages of Reshef's model: no exact polynomial-time algorithm exists for it, and it cannot identify the "local random" phenomenon. As a class of nonparametric exploration statistics, ART is applied to a dataset of 10 American classical indexes; as a result, many bi-variable correlations are discovered.
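As a small illustration of why a measure that covers nonlinear relationships is useful, the sketch below computes the ordinary Pearson coefficient for a linear and a parabolic relationship. It is not ART, whose construction the abstract does not describe; the helper name `pearson` and the sample data are assumptions made only for this example.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient (illustrative helper, not ART)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical sample: a grid of integers symmetric around zero.
xs = list(range(-50, 51))
linear = [2 * x + 1 for x in xs]   # y is an exact linear function of x
parabola = [x * x for x in xs]     # y is an exact nonlinear function of x

r_linear = pearson(xs, linear)
r_parabola = pearson(xs, parabola)

print(f"linear:   r = {r_linear:.3f}")
print(f"parabola: r = {r_parabola:.3f}")
```

Both relationships are fully deterministic, yet the Pearson coefficient is close to 1 for the line and close to 0 for the parabola. A statistic intended to find correlations of unknown type needs to score both kinds of dependence, which is the gap the abstract says ART targets.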
Taxonomy
Topics: Topological and Geometric Data Analysis · Geochemistry and Geologic Mapping · Data Visualization and Analytics
