Magnitude Distance: A Geometric Measure of Dataset Similarity
Sahel Torkamani, Henry Gouk, Rik Sarkar

TL;DR
This paper introduces magnitude distance, a new geometric metric for dataset similarity that adapts to different scales and remains effective in high-dimensional spaces, with applications in training generative models.
Contribution
The paper proposes magnitude distance, a novel dataset distance metric based on the magnitude of metric spaces, with theoretical properties and practical use in generative model training.
Findings
Magnitude distance is discriminative in high-dimensional settings.
It has desirable theoretical properties across scales.
It performs comparably to established distances in generative tasks.
Abstract
Quantifying the distance between datasets is a fundamental question in mathematics and machine learning. We propose magnitude distance, a novel distance metric defined on finite datasets using the notion of the magnitude of a metric space. The proposed distance incorporates a tunable scaling parameter that controls the sensitivity to global structure (small values) and finer details (large values). We prove several theoretical properties of magnitude distance, including its limiting behavior across scales and conditions under which it satisfies key metric properties. In contrast to classical distances, we show that magnitude distance remains discriminative in high-dimensional settings when the scale is appropriately tuned. We further demonstrate how magnitude distance can be used as a training objective for push-forward generative models. Our experimental results support…
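The abstract builds on the magnitude of a metric space without stating it. As a rough illustration only, the sketch below computes the standard magnitude of a finite point set: form the similarity matrix with entries exp(-t · d(x_i, x_j)), solve for a weight vector w with Z w = 1, and sum the weights. The function name `magnitude`, the symbol `t` for the scaling parameter, and the choice of Euclidean distance are assumptions made for this example. The paper's own magnitude distance between two datasets is not given in this summary and is not reproduced here.

```python
import math


def magnitude(points, t=1.0):
    """Magnitude of a finite point set under Euclidean distance, at scale t.

    Illustrative sketch: `magnitude` and `t` are names chosen for this
    example, not identifiers taken from the paper.
    """
    n = len(points)
    # Similarity matrix Z[i][j] = exp(-t * d(x_i, x_j)), augmented with a
    # right-hand-side column of ones so we can solve Z w = 1 in place.
    aug = [
        [math.exp(-t * math.dist(points[i], points[j])) for j in range(n)] + [1.0]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("similarity matrix is numerically singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    # The matrix is now diagonal: w_i = rhs_i / diagonal_i, magnitude = sum(w).
    return sum(aug[i][n] / aug[i][i] for i in range(n))


if __name__ == "__main__":
    square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    for scale in (0.01, 1.0, 100.0):
        print(scale, magnitude(square, scale))
```

For a small scale the points look like a single cluster and the value approaches 1; for a large scale every point is effectively isolated and the value approaches the number of points. This is one way to read the abstract's contrast between global structure and finer details.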
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Machine Learning and Data Classification · Stochastic Gradient Optimization Techniques
