DiRe-RAPIDS: Topology-faithful dimensionality reduction at scale

Alexander Kolpakov; Igor Rivin

arXiv:2604.25209·cs.LG·April 30, 2026

TL;DR

DiRe-RAPIDS is a scalable dimensionality reduction method that preserves the global topology of high-dimensional data more faithfully than UMAP, especially on large datasets.

Contribution

The paper introduces a topology-faithfulness benchmark built on noisy manifolds with known homology, tunes DiRe against it, and shows that the tuned configurations preserve topological structure better than UMAP at scale.

Findings

01

Pareto-optimal DiRe configurations match or beat GPU-accelerated UMAP on classification.

02

DiRe recovers exact first Betti numbers on stress tests.

03

On 723K arXiv paper embeddings, DiRe preserves 3-4 times more topological structure than UMAP at comparable wall-clock time.

Abstract

Dimensionality reduction methods such as UMAP and t-SNE are central tools for visualising high-dimensional data, but their local-neighborhood objectives can preserve sampling noise while distorting global topology. We show that standard local metrics reward this noise memorisation: top-performing embeddings invent cycles and disconnected islands absent from the data. We introduce a topology-faithfulness benchmark based on noisy manifolds with known homology, tune DiRe against it, and find Pareto-optimal configurations that match or beat GPU-accelerated UMAP on classification while recovering exact first Betti numbers on stress tests. On 723K arXiv paper embeddings, DiRe preserves 3-4 times more topological structure than UMAP at comparable wall-clock.
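To make "known homology" and "first Betti number" concrete, here is a minimal sketch of the kind of check such a benchmark relies on. It is not the authors' benchmark code and does not use the DiRe or UMAP APIs. The function names (noisy_circle, gf2_rank, rips_betti), the bounded uniform jitter, and the fixed Vietoris-Rips scale are all assumptions made for illustration; a real benchmark would more likely use persistent homology across many scales. The sketch samples a noisy circle, whose first Betti number should be 1, and computes Betti numbers over GF(2) from the boundary-matrix ranks.

```python
import itertools
import math
import random


def noisy_circle(n=24, jitter=0.02, center=(0.0, 0.0), seed=0):
    """Evenly spaced points on a unit circle with bounded uniform jitter (hypothetical helper)."""
    rng = random.Random(seed)
    points = []
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        x = center[0] + math.cos(theta) + rng.uniform(-jitter, jitter)
        y = center[1] + math.sin(theta) + rng.uniform(-jitter, jitter)
        points.append((x, y))
    return points


def gf2_rank(vectors):
    """Rank over GF(2) of vectors encoded as integer bitmasks (XOR basis)."""
    basis = {}  # highest set bit -> basis vector with that leading bit
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in basis:
                basis[top] = v
                break
            v ^= basis[top]  # eliminate the leading bit and keep reducing
    return len(basis)


def rips_betti(points, eps):
    """Return (b0, b1) over GF(2) for the Vietoris-Rips complex at a single scale eps."""
    n = len(points)
    # 1-simplices: pairs of points within distance eps.
    edges = [(i, j) for i, j in itertools.combinations(range(n), 2)
             if math.dist(points[i], points[j]) <= eps]
    index = {e: k for k, e in enumerate(edges)}
    # 2-simplices: triples whose three edges are all present.
    triangles = [(i, j, k) for i, j, k in itertools.combinations(range(n), 3)
                 if (i, j) in index and (i, k) in index and (j, k) in index]
    # Boundary maps as bitmasks: an edge maps to its two vertices,
    # a triangle maps to its three edges.
    d1 = [(1 << i) | (1 << j) for i, j in edges]
    d2 = [(1 << index[(i, j)]) | (1 << index[(i, k)]) | (1 << index[(j, k)])
          for i, j, k in triangles]
    r1, r2 = gf2_rank(d1), gf2_rank(d2)
    # b0 = V - rank(d1);  b1 = dim ker(d1) - rank(d2) = E - rank(d1) - rank(d2)
    return n - r1, len(edges) - r1 - r2


circle = noisy_circle()
print(rips_betti(circle, 0.6))  # (1, 1): one component, one loop
print(rips_betti(circle, 3.0))  # (1, 0): at a very coarse scale the loop is filled in
```

The same function can be applied to a low-dimensional embedding of the points: under these assumptions, an embedding that tears the loop or invents extra cycles would report a different first Betti number than the source data. The second call shows why the scale matters, since a single poorly chosen eps can hide a loop that is really there.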
