Scaffoldings and Spines: Organizing High-Dimensional Data Using Cover Trees, Local Principal Component Analysis, and Persistent Homology
Paul Bendich, Ellen Gasparovic, Christopher J. Tralie, John Harer

TL;DR
The paper introduces a multi-scale method that combines cover trees, local PCA, and persistent homology to organize and visualize high-dimensional data sampled from or near stratified spaces, and to reveal their stratified structure.
Contribution
It presents a multi-scale approach that builds a cover tree, connects its nodes into a scaffolding graph, and collapses that graph into a simpler spine graph from which the stratified structure of a dataset can be read off.
Findings
Demonstrated on several synthetic point cloud examples
The spine graph makes the stratified structure of the data apparent
Used to understand song structure in musical audio data
Abstract
We propose a flexible and multi-scale method for organizing, visualizing, and understanding datasets sampled from or near stratified spaces. The first part of the algorithm produces a cover tree using adaptive thresholds based on a combination of multi-scale local principal component analysis and topological data analysis. The resulting cover tree nodes consist of points within or near the same stratum of the stratified space. They are then connected to form a "scaffolding" graph, which is then simplified and collapsed down into a "spine" graph. From this latter graph the stratified structure becomes apparent. We demonstrate our technique on several synthetic point cloud examples and we use it to understand song structure in musical audio data.
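Illustrative sketch
The abstract mentions multi-scale local principal component analysis as one ingredient of the adaptive thresholds. The Python sketch below is not the authors' algorithm; it only illustrates the general idea of estimating a local dimension around a point at several radii, using NumPy. The function names, the 95% variance cutoff, and the plane-plus-line test data are all assumptions made for this example.

```python
import numpy as np

def local_pca_eigenvalues(points, center, radius):
    """Covariance eigenvalues (descending) of the points within `radius` of `center`."""
    dists = np.linalg.norm(points - center, axis=1)
    neighbors = points[dists <= radius]
    if len(neighbors) < 2:
        return np.zeros(points.shape[1])
    centered = neighbors - neighbors.mean(axis=0)
    cov = centered.T @ centered / (len(neighbors) - 1)
    # eigvalsh returns ascending eigenvalues; clip tiny negative round-off.
    return np.clip(np.linalg.eigvalsh(cov)[::-1], 0.0, None)

def estimate_local_dimension(eigenvalues, variance_fraction=0.95):
    """Smallest number of components explaining `variance_fraction` of the variance.

    The 0.95 cutoff is an arbitrary choice for this sketch.
    """
    total = eigenvalues.sum()
    if total <= 0:
        return 0
    cumulative = np.cumsum(eigenvalues) / total
    return int(np.searchsorted(cumulative, variance_fraction) + 1)

def dimension_profile(points, center, radii):
    """Estimated local dimension around `center` at each radius in `radii`."""
    return [
        estimate_local_dimension(local_pca_eigenvalues(points, center, r))
        for r in radii
    ]

def make_plane_and_line():
    """Hypothetical stratified sample in R^3: the square [-1, 1]^2 in the plane z = 0
    together with a segment of the z-axis attached at the origin."""
    grid = np.linspace(-1.0, 1.0, 41)
    xx, yy = np.meshgrid(grid, grid)
    plane = np.column_stack([xx.ravel(), yy.ravel(), np.zeros(xx.size)])
    z = np.linspace(0.0, 2.0, 41)
    line = np.column_stack([np.zeros_like(z), np.zeros_like(z), z])
    return np.vstack([plane, line])

if __name__ == "__main__":
    pts = make_plane_and_line()
    # A point well inside the 2-dimensional stratum.
    print(dimension_profile(pts, np.array([0.8, 0.8, 0.0]), [0.1, 0.15]))
    # A point well inside the 1-dimensional stratum.
    print(dimension_profile(pts, np.array([0.0, 0.0, 1.5]), [0.2, 0.3]))
```

Away from the junction, the estimate is stable across radii and matches the dimension of the stratum that contains the point. Near the junction at the origin, the estimate depends on the radius and the cutoff, which is one reason a multi-scale view is useful for this kind of data.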
Taxonomy
Topics: Topological and Geometric Data Analysis · Data Visualization and Analytics · Image Retrieval and Classification Techniques
