Sparse Bayesian multidimensional scaling(s)

Ami Sheth; Aaron Smith; Andrew J. Holbrook

arXiv:2406.15573 · stat.ME · May 23, 2025 · Comput. Stat.

TL;DR

This paper introduces sparse variants of Bayesian multidimensional scaling (BMDS) that significantly reduce computational complexity, enabling scalable analysis of large dissimilarity datasets while maintaining accuracy, with applications in phylogeography and document clustering.

Contribution

The authors develop and compare two sparse BMDS methods, L-sBMDS and B-sBMDS, that apply subset-based likelihood calculations to improve scalability and efficiency.

Findings

01. Achieve up to 40-fold speedup with negligible accuracy loss

02. Prove posterior consistency under simplified conditions

03. Demonstrate practical applications in influenza phylogeography and manuscript clustering

Abstract

Bayesian multidimensional scaling (BMDS) is a probabilistic dimension reduction tool that allows one to model and visualize data consisting of dissimilarities between pairs of objects. Although BMDS has proven useful within, e.g., Bayesian phylogenetic inference, its likelihood and gradient calculations require a burdensome order of $N^{2}$ floating-point operations, where $N$ is the number of data points. Thus, BMDS becomes impractical as $N$ grows large. We propose and compare two sparse versions of BMDS (sBMDS) that apply log-likelihood and gradient computations to subsets of the observed dissimilarity matrix data. Landmark sBMDS (L-sBMDS) extracts columns, while banded sBMDS (B-sBMDS) extracts diagonals of the data. These sparse variants let one specify a time complexity between $N^{2}$ and $N$. Under simplified settings, we prove posterior consistency for subsampled distance matrices.…
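
To make the column-versus-diagonal distinction in the abstract concrete, here is a minimal sketch in plain Python of which entries of an $N \times N$ dissimilarity matrix each scheme might keep. It is an illustration under stated assumptions, not the authors' implementation: the function names `landmark_pairs` and `banded_pairs`, the parameters `n_landmarks` and `n_bands`, and the choice of the first `n_landmarks` objects as landmarks are all hypothetical, and the paper's exact definitions may differ.

```python
def landmark_pairs(n, n_landmarks):
    """Unordered pairs (i, j), i < j, that involve a landmark object.

    Assumption: the landmarks are simply objects 0 .. n_landmarks - 1,
    i.e. the first n_landmarks columns of the dissimilarity matrix.
    """
    # With i < j, a pair touches the landmark set exactly when i is a landmark.
    return {(i, j) for j in range(n) for i in range(j) if i < n_landmarks}


def banded_pairs(n, n_bands):
    """Unordered pairs (i, j), i < j, lying on the first n_bands off-diagonals."""
    return {(i, j) for j in range(n) for i in range(j) if j - i <= n_bands}


def all_pairs(n):
    """Every unordered pair, i.e. what a dense likelihood would touch."""
    return {(i, j) for j in range(n) for i in range(j)}


if __name__ == "__main__":
    n = 6
    print(len(all_pairs(n)))          # 15 pairs in the dense case
    print(len(landmark_pairs(n, 2)))  # 9 pairs with 2 landmark columns
    print(len(banded_pairs(n, 2)))    # 9 pairs with 2 bands
```

With the number of landmarks or bands held fixed, the number of retained pairs grows roughly linearly in $N$; letting it approach $N$ recovers all $N(N-1)/2$ pairs. That is one way to read the abstract's claim that the cost can be set anywhere between $N$ and $N^{2}$.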

Code & Models

Repositories

andrewjholbrook/sparseBMDS (Official)

Taxonomy

Topics: Advanced Clustering Algorithms Research