Learning conditional distributions on continuous spaces

Cyril Bénézet; Ziteng Cheng; Sebastian Jaimungal

arXiv:2406.09375·stat.ML·June 14, 2024


Open Access · 1 Repo

TL;DR

This paper studies sample-based learning of conditional distributions on multi-dimensional unit boxes. It clusters data near query points in the feature space, derives upper bounds on the convergence rates of the resulting estimators, and incorporates the nearest neighbors method into neural network training.

Contribution

It proposes two clustering-based approaches for learning conditional distributions, establishes their convergence rates, and integrates the nearest neighbors method into neural network training with efficiency enhancements.
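
The two clustering schemes can be illustrated with a minimal Python sketch. It assumes the samples are stored as lists of coordinate tuples in the unit box and that closeness is measured with the Euclidean distance. The function names `ball_cluster` and `knn_cluster` are hypothetical and are not taken from the paper or its repository. In both cases, the uniform distribution over the returned target points plays the role of the empirical estimate of the conditional distribution at the query.

```python
import math


def ball_cluster(xs, ys, query, radius):
    """Targets of all samples whose features lie within `radius` of `query`.

    Assumption: Euclidean distance; the result may be empty.
    """
    return [y for x, y in zip(xs, ys) if math.dist(x, query) <= radius]


def knn_cluster(xs, ys, query, k):
    """Targets of the k samples whose features are nearest to `query`.

    Assumption: Euclidean distance; ties keep the original sample order.
    """
    order = sorted(range(len(xs)), key=lambda i: math.dist(xs[i], query))
    return [ys[i] for i in order[:k]]


# Toy data (illustrative only): 1-D features paired with 1-D targets.
xs = [(0.0,), (0.1,), (0.5,), (0.9,)]
ys = [(0.2,), (0.3,), (0.6,), (0.8,)]
print(ball_cluster(xs, ys, query=(0.45,), radius=0.1))
print(knn_cluster(xs, ys, query=(0.45,), k=2))
```

One practical difference shows up even in this toy setting: a fixed-radius ball can contain no samples at all, whereas the nearest neighbors cluster always holds exactly k points as long as at least k samples exist.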

Findings

01. The nearest neighbors method outperforms the fixed-radius method in practice

02. Neural networks can adapt to local levels of Lipschitz continuity

03. Training is made efficient with approximate nearest neighbors search and the Sinkhorn algorithm
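
For readers unfamiliar with the Sinkhorn algorithm, the following is a minimal sketch of the standard Sinkhorn iterations for entropy-regularized optimal transport between two uniform empirical measures. How the paper uses the algorithm during training is not stated in this summary, so the setup here is an assumption: the ground cost is the Euclidean distance, the values of `eps` and `n_iter` are illustrative, and the function name `sinkhorn_cost` is hypothetical.

```python
import math


def sinkhorn_cost(a_pts, b_pts, eps=0.05, n_iter=200):
    """Transport cost of the entropic-regularized plan between two uniform
    empirical measures supported on a_pts and b_pts.

    Assumptions: Euclidean ground cost, uniform weights, fixed iteration count.
    """
    n, m = len(a_pts), len(b_pts)
    cost = [[math.dist(p, q) for q in b_pts] for p in a_pts]
    # Gibbs kernel K = exp(-C / eps).
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the uniform marginals.
        u = [(1.0 / n) / sum(kernel[i][j] * v[j] for j in range(m))
             for i in range(n)]
        v = [(1.0 / m) / sum(kernel[i][j] * u[i] for i in range(n))
             for j in range(m)]
    # The plan is diag(u) K diag(v); return its total transport cost.
    return sum(u[i] * kernel[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))


print(sinkhorn_cost([(0.0,), (1.0,)], [(0.0,), (1.0,)]))
print(sinkhorn_cost([(0.0,)], [(1.0,)]))
```

This plain-kernel form is adequate for points in a low-dimensional unit box with a moderate `eps`. For much smaller `eps` or larger costs the kernel entries can underflow, in which case a log-domain formulation is the usual remedy.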

Abstract

We investigate sample-based learning of conditional distributions on multi-dimensional unit boxes, allowing for different dimensions of the feature and target spaces. Our approach involves clustering data near varying query points in the feature space to create empirical measures in the target space. We employ two distinct clustering schemes: one based on a fixed-radius ball and the other on nearest neighbors. We establish upper bounds for the convergence rates of both methods and, from these bounds, deduce optimal configurations for the radius and the number of neighbors. We propose to incorporate the nearest neighbors method into neural network training, as our empirical analysis indicates it has better performance in practice. For efficiency, our training process utilizes approximate nearest neighbors search with random binary space partitioning. Additionally, we employ the Sinkhorn…
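
The idea of approximate nearest neighbors search with random binary space partitioning can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes each split is made at the median along a randomly chosen coordinate axis, and the names `build_rbsp_tree`, `approx_knn`, and `leaf_size` are hypothetical. A query is routed to a single leaf, and only the samples in that leaf are ranked by Euclidean distance.

```python
import math
import random


def build_rbsp_tree(points, indices, leaf_size, rng):
    """Recursively partition `indices` with median splits on random axes.

    Assumption: leaf_size >= 1 and axis-aligned splits.
    """
    if len(indices) <= leaf_size:
        return {"leaf": indices}
    axis = rng.randrange(len(points[0]))
    ordered = sorted(indices, key=lambda i: points[i][axis])
    mid = len(ordered) // 2
    return {
        "axis": axis,
        "threshold": points[ordered[mid]][axis],
        "left": build_rbsp_tree(points, ordered[:mid], leaf_size, rng),
        "right": build_rbsp_tree(points, ordered[mid:], leaf_size, rng),
    }


def approx_knn(tree, points, query, k):
    """Indices of up to k nearest samples within the leaf containing `query`."""
    node = tree
    while "leaf" not in node:
        go_left = query[node["axis"]] < node["threshold"]
        node = node["left"] if go_left else node["right"]
    ranked = sorted(node["leaf"], key=lambda i: math.dist(points[i], query))
    return ranked[:k]


# Toy usage (illustrative only): 200 random points in the unit square.
rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(200)]
tree = build_rbsp_tree(points, list(range(len(points))), leaf_size=20, rng=rng)
print(approx_knn(tree, points, query=(0.5, 0.5), k=5))
```

The search is approximate because true neighbors that fall on the other side of a split are never examined. Building several trees with different random splits and merging their candidates is a common way to reduce that error, at the cost of more work per query.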


Code & Models

Repositories

zcheng-a/lcd_knn (PyTorch, official)


Taxonomy

Topics: Machine Learning and Data Classification · Bayesian Methods and Mixture Models · Advanced Clustering Algorithms Research