Fixed and adaptive landmark sets for finite pseudometric spaces
Jason Cory Brunson, Yara Skaf

TL;DR
This paper introduces 'lastfirst', a landmark sampling procedure for topological data analysis that is based on ranked distances rather than raw distances. It is designed for data that vary in density or contain points with multiplicities, as are common in biomedicine.
Contribution
The paper defines and proves properties of 'lastfirst', a landmark sampling procedure based on ranked distances, and compares it to maxmin in biomedical data tasks.
Findings
Lastfirst outperforms maxmin in homology detection.
Lastfirst achieves comparable results in feature detection and class prediction.
Lastfirst is applicable to any data with arbitrary pairwise distances.
Abstract
Topological data analysis (TDA) is an expanding field that leverages principles and tools from algebraic topology to quantify structural features of data sets or transform them into more manageable forms. As its theoretical foundations have been developed, TDA has shown promise in extracting useful information from high-dimensional, noisy, and complex data such as those used in biomedicine. To improve efficiency, these techniques may employ landmark samplers. The heuristic maxmin procedure obtains a roughly even distribution of sample points by implicitly constructing a cover comprising sets of uniform radius. However, issues arise with data that vary in density or include points with multiplicities, as are common in biomedicine. We propose an analogous procedure, "lastfirst", based on ranked distances, which implies a cover comprising sets of uniform cardinality. We first rigorously…
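The contrast the abstract draws between distance-based and rank-based landmark selection can be illustrated with a minimal sketch. The first function is the standard maxmin (farthest-point) heuristic. The second swaps distances for ranks and is only a simplified analogue in the spirit of lastfirst: it is an assumption made for illustration, not the paper's definition, which treats ties and multiplicities more carefully. All function names here are hypothetical.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix for a list of scalars or vectors."""
    pts = np.asarray(points, dtype=float)
    if pts.ndim == 1:
        pts = pts[:, None]
    diff = pts[:, None, :] - pts[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def maxmin_landmarks(dist, k, seed=0):
    """Standard maxmin: repeatedly add the point farthest from all landmarks."""
    landmarks = [seed]
    min_dist = dist[seed].copy()
    while len(landmarks) < k:
        nxt = int(np.argmax(min_dist))
        if min_dist[nxt] == 0:
            break  # every remaining point coincides with a landmark
        landmarks.append(nxt)
        min_dist = np.minimum(min_dist, dist[nxt])
    return landmarks


def rank_from(dist, i):
    """Rank of each point as seen from point i: the number of points
    strictly closer to i. Tied points share the same rank."""
    row = dist[i]
    return np.searchsorted(np.sort(row), row, side="left")


def rank_based_landmarks(dist, k, seed=0):
    """Simplified rank-based analogue of maxmin (an assumption, NOT the
    paper's exact lastfirst procedure): maximize the minimum rank instead
    of the minimum distance."""
    landmarks = [seed]
    min_rank = rank_from(dist, seed)
    while len(landmarks) < k:
        nxt = int(np.argmax(min_rank))
        if min_rank[nxt] == 0:
            break  # every remaining point coincides with a landmark
        landmarks.append(nxt)
        min_rank = np.minimum(min_rank, rank_from(dist, nxt))
    return landmarks


if __name__ == "__main__":
    # Three nearby points and one distant point on a line.
    D = pairwise_distances([0, 1, 2, 10])
    print(maxmin_landmarks(D, 3))
    print(rank_based_landmarks(D, 3))
```

Both functions take any symmetric matrix of pairwise distances, so the sketch does not depend on the data being Euclidean. Because the rank-based variant only compares orderings of distances, each landmark's neighborhood is measured in numbers of points rather than in radius, which is the intuition behind a cover of uniform cardinality.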
Taxonomy
Topics: Topological and Geometric Data Analysis
