Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein

Hugues Van Assel; Cédric Vincent-Cuaz; Nicolas Courty; Rémi Flamary; Pascal Frossard; Titouan Vayer

arXiv:2402.02239 · cs.LG · June 30, 2025 · 1 citation

TL;DR

This paper introduces distributional reduction, a framework based on Gromov-Wasserstein optimal transport that casts dimensionality reduction and clustering as a single optimization problem, demonstrated on image and genomic datasets.

Contribution

The paper relates dimensionality reduction and clustering to the Gromov-Wasserstein optimal transport problem and recovers both as special cases of one framework, so they can be addressed jointly within a single optimization problem.

Findings

01. Identifies low-dimensional prototypes that represent data at different scales

02. Unifies DR and clustering within a single optimization problem

03. Demonstrates applicability across multiple image and genomic datasets

Abstract

Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets. Traditionally, this involves using dimensionality reduction (DR) methods to project data onto lower-dimensional spaces or organizing points into meaningful clusters (clustering). In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem. This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem. We empirically demonstrate its relevance to the identification of low-dimensional prototypes representing data at different scales, across multiple image and genomic datasets.
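
To make the Gromov-Wasserstein connection concrete, here is a minimal sketch of how a Gromov-Wasserstein objective scores a coupling between input points and a smaller set of prototypes. This is an illustration, not the paper's formulation: the square loss, the use of plain distance matrices, and the names `gw_cost`, `C_x`, `C_z`, and `T` are all assumptions, since the abstract does not specify the loss, similarity functions, or constraints.

```python
import numpy as np

def gw_cost(C_x, C_z, T):
    """Square-loss Gromov-Wasserstein objective for a fixed coupling T.

    Hypothetical helper for illustration only.
    C_x: (n, n) pairwise matrix of the input data
    C_z: (m, m) pairwise matrix of the prototypes
    T:   (n, m) coupling between inputs and prototypes
    Computes sum_{i,j,k,l} (C_x[i,k] - C_z[j,l])**2 * T[i,j] * T[k,l].
    """
    p = T.sum(axis=1)  # mass carried by each input point
    q = T.sum(axis=0)  # mass received by each prototype
    # Expand the square so the 4-index sum reduces to matrix products.
    term_x = p @ (C_x ** 2) @ p
    term_z = q @ (C_z ** 2) @ q
    cross = np.sum((C_x @ T @ C_z.T) * T)
    return term_x + term_z - 2.0 * cross

# Toy data (assumed): four 1-D points forming two groups, two prototypes.
x = np.array([0.0, 0.1, 5.0, 5.1])
z = np.array([0.0, 5.0])
C_x = np.abs(x[:, None] - x[None, :])
C_z = np.abs(z[:, None] - z[None, :])

# Coupling that sends each group to its own prototype.
T_grouped = 0.25 * np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
# Coupling that mixes the two groups across prototypes.
T_mixed = 0.25 * np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)

print(gw_cost(C_x, C_z, T_grouped), gw_cost(C_x, C_z, T_mixed))
```

In this toy setting the coupling that respects the two groups has a much lower cost than the one that mixes them. A coupling into a few prototypes acts as a soft cluster assignment, and the prototype positions give a reduced representation. This conveys the intuition behind treating clustering and DR together, under the assumptions stated above.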

Taxonomy

Topics: Bayesian Methods and Mixture Models