Sparse Bayesian multidimensional scaling(s)
Ami Sheth, Aaron Smith, Andrew J. Holbrook

TL;DR
This paper introduces sparse variants of Bayesian multidimensional scaling (BMDS) that significantly reduce computational complexity, enabling scalable analysis of large dissimilarity datasets while maintaining accuracy, with applications in phylogeography and document clustering.
Contribution
The authors develop and compare two sparse BMDS methods, landmark sBMDS (L-sBMDS) and banded sBMDS (B-sBMDS), which restrict likelihood and gradient calculations to subsets of the dissimilarity matrix to improve scalability and efficiency.
Findings
Achieve up to 40-fold speedup with negligible accuracy loss
Prove posterior consistency under simplified conditions
Demonstrate practical applications in influenza phylogeography and manuscript clustering
Abstract
Bayesian multidimensional scaling (BMDS) is a probabilistic dimension reduction tool that allows one to model and visualize data consisting of dissimilarities between pairs of objects. Although BMDS has proven useful within, e.g., Bayesian phylogenetic inference, its likelihood and gradient calculations require a burdensome order of $N^2$ floating-point operations, where $N$ is the number of data points. Thus, BMDS becomes impractical as $N$ grows large. We propose and compare two sparse versions of BMDS (sBMDS) that apply log-likelihood and gradient computations to subsets of the observed dissimilarity matrix data. Landmark sBMDS (L-sBMDS) extracts columns, while banded sBMDS (B-sBMDS) extracts diagonals of the data. These sparse variants let one specify a time complexity between $N^2$ and $N$. Under simplified settings, we prove posterior consistency for subsampled distance matrices.…
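The column and diagonal subsets described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`landmark_pairs`, `banded_pairs`, `sparse_loglik`) are hypothetical, the choice of the first columns as landmarks is arbitrary, and the truncated-normal likelihood is an assumption borrowed from the standard BMDS formulation rather than something the abstract states.

```python
import math


def landmark_pairs(n, n_landmarks):
    """Pairs (i, j) with i < j that lie in the first n_landmarks columns
    of a symmetric n x n dissimilarity matrix (assumed landmark choice)."""
    return [(i, j) for i in range(min(n_landmarks, n)) for j in range(i + 1, n)]


def banded_pairs(n, n_bands):
    """Pairs (i, j) with i < j on the first n_bands off-diagonals."""
    return [(i, j) for i in range(n) for j in range(i + 1, min(n, i + n_bands + 1))]


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sparse_loglik(delta, x, pairs, sigma):
    """Log-likelihood summed over the chosen pairs only.

    Assumes delta[i][j] ~ Normal(d_ij, sigma^2) truncated below at 0, where
    d_ij is the Euclidean distance between latent locations x[i] and x[j].
    """
    total = 0.0
    for i, j in pairs:
        d = math.dist(x[i], x[j])
        resid = delta[i][j] - d
        total += (
            -math.log(sigma)
            - 0.5 * math.log(2.0 * math.pi)
            - resid * resid / (2.0 * sigma * sigma)
            - math.log(normal_cdf(d / sigma))  # truncation normalizer
        )
    return total


if __name__ == "__main__":
    # Toy latent locations; dissimilarities set equal to the true distances.
    x = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [2.0, 2.0], [1.0, 3.0]]
    delta = [[math.dist(a, b) for b in x] for a in x]
    n = len(x)
    print(len(landmark_pairs(n, 2)), len(banded_pairs(n, 2)), n * (n - 1) // 2)
    print(sparse_loglik(delta, x, banded_pairs(n, 2), sigma=0.5))
```

With 6 objects and 2 landmarks or 2 bands, each subset covers 9 of the 15 unique pairs. Holding the number of landmarks or bands fixed as $N$ grows makes the pair count scale linearly in $N$, while using all columns or all diagonals recovers the full $N^2$-order computation.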
Taxonomy
Topics: Advanced Clustering Algorithms Research
