Discovering and Visualizing Hierarchy in Multivariate Data
Kun Yang, Wing Hung Wong

TL;DR
This paper introduces a non-parametric method to discover and visualize hierarchical structures in multidimensional data by approximating the joint density with a binary partition and constructing a tree representation, demonstrated on flow cytometry and social network data.
Contribution
It presents a novel non-parametric approach, making no assumptions about the form of the density, for uncovering and visualizing data hierarchies through piecewise constant density approximation and level-set tree construction, enabling multi-resolution data summaries.
Findings
Effective in revealing hierarchical structures in complex data
Provides multi-resolution insights into data organization
Applicable to diverse data types like flow cytometry and social networks
Abstract
Extracting useful insights from data is always a challenge, especially when the data is multidimensional. Often, the data can be organized according to a hierarchical structure that stems either from the data collection process or from the information and phenomena carried by the data itself. The current study attempts to discover and visualize these underlying hierarchies. By regarding each observation in the data as a draw from a (hypothetical) multidimensional joint density, our first goal is to approximate this unknown density with a piecewise constant function via binary partition; our non-parametric approach makes no assumptions about the form of the density. Given the piecewise constant density function and its corresponding binary partition, our second goal is to construct a connected graph and build up a tree representation of the data by level sets. To demonstrate that…
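The two steps described in the abstract can be illustrated with a rough sketch. This is not the authors' algorithm: the partition below simply halves each box at the midpoint of its longest side, which is an assumed stand-in for whatever partition-selection rule the paper uses, and all function names and parameters (`binary_partition`, `cell_densities`, `adjacent`, `level_set_merges`, `max_points`, `max_depth`) are hypothetical. It shows a piecewise constant density over a binary partition, a graph connecting cells that share a face, and the modes and merge events obtained by sweeping the density level downward, which are the ingredients of a level-set tree.

```python
import numpy as np

def binary_partition(points, lo, hi, max_points=20, max_depth=8, depth=0):
    """Recursively halve a box; return leaf cells as (lo, hi, count).

    Midpoint splitting of the longest side is an illustrative assumption,
    not the partition rule from the paper.
    """
    if len(points) <= max_points or depth >= max_depth:
        return [(lo.copy(), hi.copy(), len(points))]
    d = int(np.argmax(hi - lo))              # dimension with the longest side
    mid = 0.5 * (lo[d] + hi[d])
    left = points[:, d] < mid
    left_hi = hi.copy()
    left_hi[d] = mid
    right_lo = lo.copy()
    right_lo[d] = mid
    return (binary_partition(points[left], lo, left_hi, max_points, max_depth, depth + 1)
            + binary_partition(points[~left], right_lo, hi, max_points, max_depth, depth + 1))

def cell_densities(cells, n_total):
    """Piecewise constant density: count / (n * volume) on each cell."""
    vols = np.array([np.prod(hi - lo) for lo, hi, _ in cells])
    counts = np.array([c for _, _, c in cells])
    return counts / (n_total * vols)

def adjacent(a, b):
    """True if two cells of the partition share a face."""
    (alo, ahi, _), (blo, bhi, _) = a, b
    overlap = (alo < bhi) & (blo < ahi)      # open-interval overlap per dimension
    touch = (ahi == blo) | (bhi == alo)      # boundaries coincide per dimension
    return bool(overlap.sum() == len(alo) - 1 and touch[~overlap].all())

def level_set_merges(cells, dens):
    """Add cells in order of decreasing density, tracking components.

    A cell with no previously added neighbour starts a new component (a mode);
    a cell joining several components records a merge at its density level.
    """
    parent = {}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]    # path compression
            i = parent[i]
        return i

    modes, merges = [], []
    for i in np.argsort(-dens):
        i = int(i)
        parent[i] = i
        roots = {find(j) for j in parent if j != i and adjacent(cells[i], cells[j])}
        if not roots:
            modes.append(i)
        elif len(roots) > 1:
            merges.append((float(dens[i]), sorted(roots)))
        for r in roots:
            parent[r] = i                    # the new cell becomes the root
    return modes, merges

# Synthetic two-cluster data, used only to exercise the sketch.
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(-2, 0.5, (300, 2)), rng.normal(2, 0.5, (300, 2))])
cells = binary_partition(pts, pts.min(axis=0), pts.max(axis=0))
dens = cell_densities(cells, len(pts))
modes, merges = level_set_merges(cells, dens)
print(len(cells), "cells;", len(modes), "local modes;", len(merges), "merge events")
```

Because the raw cell densities are noisy, a sketch like this typically reports more local modes than there are true clusters; pruning or smoothing would be needed before reading the merge events as a meaningful hierarchy.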
Taxonomy
Topics: Complex Network Analysis Techniques · Data Visualization and Analytics · Bioinformatics and Genomic Networks
