G-Mapper: Learning a Cover in the Mapper Construction
Enrique Alvarado, Robin Belton, Emily Fischer, Kang-Ju Lee, Sourabh Palande, Sarah Percival, Emilie Purvine

TL;DR
This paper introduces G-Mapper, an algorithm that selects the cover in the Mapper construction using a statistical test for normality and clustering, producing Mapper graphs that better reflect the dataset and running faster than previous cover-selection methods.
Contribution
The paper proposes a novel cover selection algorithm for Mapper that employs G-means clustering and Gaussian mixture models, improving efficiency and quality of the resulting graphs.
Findings
Generates Mapper graphs that better reflect dataset structure.
Runs significantly faster than previous methods.
Maintains the essential features of datasets in the graphs.
Abstract
The Mapper algorithm is a visualization technique in topological data analysis (TDA) that outputs a graph reflecting the structure of a given dataset. However, the Mapper algorithm requires tuning several parameters in order to generate a "nice" Mapper graph. This paper focuses on selecting the cover parameter. We present an algorithm that optimizes the cover of a Mapper graph by splitting a cover repeatedly according to a statistical test for normality. Our algorithm is based on G-means clustering, which searches for the optimal number of clusters in k-means by iteratively applying the Anderson-Darling test. Our splitting procedure employs a Gaussian mixture model to carefully choose the cover according to the distribution of the given data. Experiments on synthetic and real-world datasets demonstrate that our algorithm generates covers so that the Mapper graphs retain the essence…
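The splitting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-dimensional lens, and the function names, the parameters `ad_threshold`, `g_overlap`, `min_points`, and `max_intervals`, and the exact rule for placing the two overlapping child intervals are all hypothetical choices made for illustration. An interval is split whenever the Anderson-Darling statistic of the lens values inside it exceeds the threshold, and a two-component Gaussian mixture (fitted here with a small hand-written EM loop) decides where the children end.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev, stdev


def anderson_darling(values):
    """Anderson-Darling normality statistic, with the usual correction
    for a mean and variance estimated from the sample."""
    n = len(values)
    mu, sigma = fmean(values), stdev(values)
    if sigma == 0:
        return 0.0  # degenerate sample: treat as "do not split"
    unit = NormalDist()
    # Clamp CDF values away from 0 and 1 so the logarithms stay finite.
    cdf = [min(max(unit.cdf((v - mu) / sigma), 1e-12), 1 - 1e-12)
           for v in sorted(values)]
    total = sum((2 * i + 1) * (math.log(cdf[i]) + math.log(1 - cdf[n - 1 - i]))
                for i in range(n))
    return (-n - total / n) * (1 + 4 / n - 25 / n ** 2)


def fit_two_gaussians(values, iterations=50):
    """EM for a 1D two-component Gaussian mixture.
    Returns [(mean, std), (mean, std)] ordered by mean."""
    ordered = sorted(values)
    half = len(ordered) // 2
    parts = [ordered[:half], ordered[half:]]  # initialise from the two halves
    means = [fmean(p) for p in parts]
    stds = [max(pstdev(p), 1e-6) for p in parts]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        comps = [NormalDist(means[k], stds[k]) for k in range(2)]
        resp = []  # E-step: responsibility of each component for each value
        for v in values:
            dens = [weights[k] * comps[k].pdf(v) for k in range(2)]
            norm = sum(dens) or 1e-300
            resp.append((dens[0] / norm, dens[1] / norm))
        for k in range(2):  # M-step: re-estimate weight, mean, std
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:
                continue
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2
                      for r, v in zip(resp, values)) / nk
            stds[k] = max(math.sqrt(var), 1e-6)
    return sorted(zip(means, stds))


def split_interval(interval, values, g_overlap):
    """Assumed split rule: cut between the two mixture means, weighted by
    the component spreads, and let the children overlap by g_overlap."""
    a, b = interval
    (m1, s1), (m2, s2) = fit_two_gaussians(values)
    gap = m2 - m1
    left_end = min(m1 + (1 + g_overlap) * s1 / (s1 + s2) * gap, b)
    right_start = max(m2 - (1 + g_overlap) * s2 / (s1 + s2) * gap, a)
    return (a, left_end), (right_start, b)


def learn_cover(lens_values, ad_threshold=10.0, g_overlap=0.1,
                min_points=20, max_intervals=16):
    """Split intervals until each looks Gaussian or a stopping rule fires."""
    cover, pending = [], [(min(lens_values), max(lens_values))]
    while pending:
        a, b = pending.pop()
        inside = [v for v in lens_values if a <= v <= b]
        at_cap = len(cover) + len(pending) + 1 >= max_intervals
        if (at_cap or len(inside) < min_points
                or anderson_darling(inside) <= ad_threshold):
            cover.append((a, b))
        else:
            pending.extend(split_interval((a, b), inside, g_overlap))
    return sorted(cover)


# Usage on a synthetic bimodal lens: two well-separated Gaussian clumps.
random.seed(0)
sample = ([random.gauss(0, 1) for _ in range(200)]
          + [random.gauss(10, 1) for _ in range(200)])
for lo, hi in learn_cover(sample):
    print(f"[{lo:.2f}, {hi:.2f}]")
```

The sketch uses only the standard library so that it stays self-contained; in practice one would more likely rely on `scipy.stats.anderson` and `sklearn.mixture.GaussianMixture`. The `max_intervals` cap is a safeguard added here to guarantee termination.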
Taxonomy
Topics: Topological and Geometric Data Analysis · Advanced Clustering Algorithms Research · Data Management and Algorithms
