A Tidy Data Structure and Visualisations for Multiple Variable Correlations and Other Pairwise Scores
Amit Chinwan, Catherine B. Hurley

TL;DR
This paper introduces a pipeline and visualisation tools for calculating, managing and displaying multiple pairwise association scores for numerical and categorical data, giving richer displays than a traditional correlation heatmap.
Contribution
It presents a unified interface and a tidy data structure for calculating and visualising various pairwise scores, including novel visualisations for complex relationships.
Findings
Rich visualisations reveal non-linear and categorical relationships.
The pipeline handles multiple scores simultaneously.
The new visualisations convey more than a traditional correlation heatmap, for example by flagging variable pairs that exhibit Simpson's paradox.
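To make the idea of holding multiple scores in one tidy structure concrete, here is a minimal Python sketch. It is not the bullseye API (the package is written in R, and its functions are not shown here). The column names `x`, `y`, `score` and `value`, the helper `pairwise_scores`, and the toy data are all assumptions chosen for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical registry of scores; any function of two columns could be added.
SCORES = {"pearson": pearson, "spearman": spearman}


def pairwise_scores(data, scores=SCORES):
    """Return one row per (variable pair, score) in long, tidy form."""
    rows = []
    for x, y in combinations(sorted(data), 2):
        for name, fn in scores.items():
            rows.append({"x": x, "y": y, "score": name,
                         "value": fn(data[x], data[y])})
    return rows


# Made-up data: b is a monotone but non-linear function of a.
data = {
    "a": [1, 2, 3, 4, 5],
    "b": [1, 4, 9, 16, 25],
    "c": [5, 3, 4, 1, 2],
}
tidy = pairwise_scores(data)
for row in tidy:
    print(row)
```

Because every score for every pair is a separate row, the rows for one pair can be compared directly. For the pair (a, b), the Spearman score is 1 while the Pearson score is below 1, which hints at a monotone non-linear association.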
Abstract
We provide a pipeline for calculating, managing and visualising correlations and other pairwise association scores for numerical and categorical data. We present a uniform interface for calculating a plethora of pairwise scores and propose a tidy data structure for organising the results. We also provide new visualisations which simultaneously show multiple and/or grouped pairwise scores. The visualisations are far richer than a traditional heatmap of correlation scores, as they help identify relationships with categorical variables, numeric variable pairs with non-linear associations or those which exhibit Simpson's paradox. These methods are available in our R package bullseye.
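The mention of grouped scores and Simpson's paradox can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the authors' implementation. The helper `grouped_scores`, the `group` column with its `"all"` label, and the data values are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def grouped_scores(x, y, groups, x_name="x", y_name="y"):
    """Tidy rows holding the overall score plus one score per group level."""
    rows = [{"x": x_name, "y": y_name, "score": "pearson",
             "group": "all", "value": pearson(x, y)}]
    by_group = defaultdict(lambda: ([], []))
    for a, b, g in zip(x, y, groups):
        by_group[g][0].append(a)
        by_group[g][1].append(b)
    for g, (gx, gy) in sorted(by_group.items()):
        rows.append({"x": x_name, "y": y_name, "score": "pearson",
                     "group": g, "value": pearson(gx, gy)})
    return rows


# Made-up data: within each site the response falls as dose rises,
# but site B has both higher doses and higher responses overall.
dose = [1, 2, 3, 4, 6, 7, 8, 9]
response = [8, 7, 6, 5, 13, 12, 11, 10]
site = ["A"] * 4 + ["B"] * 4

rows = grouped_scores(dose, response, site, "dose", "response")
for row in rows:
    print(row)
```

In this toy data the overall correlation is positive while the correlation within each site is negative. A display that shows the grouped and ungrouped scores side by side would make that reversal visible.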
Taxonomy
Topics: Statistical Methods and Applications
