Deep Kuratowski Embedding Neural Networks for Wasserstein Metric Learning
Andrew Qing He

TL;DR
This paper introduces two neural network architectures, DeepKENN and ODE-KENN, to efficiently approximate Wasserstein-2 distances, leveraging Kuratowski embedding principles and Neural ODEs for improved accuracy and regularization.
Contribution
The paper presents novel neural architectures inspired by Kuratowski embedding and Neural ODEs for fast Wasserstein distance approximation, outperforming baseline models on MNIST.
Findings
ODE-KENN achieves a 28% lower test MSE than the single-layer baseline.
ODE-KENN achieves an 18% lower test MSE than DeepKENN under matched parameter counts, with a smaller generalization gap.
The surrogate model effectively replaces expensive Wasserstein computations.
Abstract
Computing pairwise Wasserstein distances is a fundamental bottleneck in data analysis pipelines. Motivated by the classical Kuratowski embedding theorem, we propose two neural architectures for learning to approximate the Wasserstein-2 distance ($W_2$) from data. The first, DeepKENN, aggregates distances across all intermediate feature maps of a CNN using learnable positive weights. The second, ODE-KENN, replaces the discrete layer stack with a Neural ODE, embedding each input into the infinite-dimensional Banach space and providing implicit regularization via trajectory smoothness. Experiments on MNIST with exact precomputed distances show that ODE-KENN achieves a 28% lower test MSE than the single-layer baseline and 18% lower than DeepKENN under matched parameter counts, while exhibiting a smaller generalization gap. The resulting fast surrogate can…
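As background for the Kuratowski embedding that the abstract cites as motivation, the following minimal sketch illustrates the classical construction, not the paper's architectures. Each point is mapped to the vector of its distances to a set of anchors, and the sup-norm distance between two such vectors never exceeds the true distance, matching it exactly when the anchors include the points themselves. The toy data (1-D empirical measures with equal sample counts, for which the Wasserstein-2 distance reduces to sorting and matching) and the helper names `w2_1d`, `kuratowski_embed`, and `sup_distance` are assumptions chosen for illustration.

```python
import math
import random


def w2_1d(xs, ys):
    """Exact Wasserstein-2 distance between two 1-D empirical measures
    with the same number of equally weighted samples (sort and match)."""
    assert len(xs) == len(ys)
    paired = zip(sorted(xs), sorted(ys))
    return math.sqrt(sum((a - b) ** 2 for a, b in paired) / len(xs))


def kuratowski_embed(x, anchors, dist):
    """Map x to the vector of its distances to each anchor."""
    return [dist(x, a) for a in anchors]


def sup_distance(u, v):
    """Sup-norm distance between two embedding vectors."""
    return max(abs(a - b) for a, b in zip(u, v))


random.seed(0)
# Hypothetical toy data: 6 empirical measures with 50 samples each.
measures = [[random.gauss(mu, 1.0) for _ in range(50)] for mu in range(6)]

# Every data point is an anchor: the embedding is an isometry on the data set.
full = [kuratowski_embed(m, measures, w2_1d) for m in measures]
# Only two anchors: the sup-norm distance becomes a lower bound.
partial = [kuratowski_embed(m, measures[:2], w2_1d) for m in measures]

for i in range(len(measures)):
    for j in range(len(measures)):
        true_d = w2_1d(measures[i], measures[j])
        assert abs(sup_distance(full[i], full[j]) - true_d) < 1e-9
        assert sup_distance(partial[i], partial[j]) <= true_d + 1e-9
```

The lower-bound property follows from the triangle inequality, which is why a small anchor set can only underestimate the true distance. A learned model in this spirit would replace the fixed anchor distances with trainable features.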
