Minimax Rates for Hyperbolic Hierarchical Learning

Divit Rawal; Sriram Vishwanath

arXiv:2601.20047·stat.ML·January 29, 2026

Open Access

TL;DR

This paper shows that, under standard Lipschitz regularization, hyperbolic embeddings need exponentially fewer samples than Euclidean embeddings to learn hierarchical data, and that the hyperbolic rate is optimal up to matching lower bounds.

Contribution

The paper proves an exponential separation in sample complexity between Euclidean and hyperbolic representations of hierarchical data, and establishes hyperbolic space as optimal for such tasks.

Findings

01. Constant-distortion hyperbolic embeddings admit $O(1)$-Lipschitz realizability for hierarchical data.

02. Bounded-radius Euclidean embeddings require Lipschitz constants that grow exponentially in the hierarchy depth $R$, which leads to exponential sample complexity under capacity control.

03. Matching lower bounds confirm that hyperbolic space is optimal for hierarchical learning.

Abstract

We prove an exponential separation in sample complexity between Euclidean and hyperbolic representations for learning on hierarchical data under standard Lipschitz regularization. For depth-$R$ hierarchies with branching factor $m$, we first establish a geometric obstruction for Euclidean space: any bounded-radius embedding forces volumetric collapse, mapping exponentially many tree-distant points to nearby locations. This necessitates Lipschitz constants scaling as $\exp(\Omega(R))$ to realize even simple hierarchical targets, yielding exponential sample complexity under capacity control. We then show this obstruction vanishes in hyperbolic space: constant-distortion hyperbolic embeddings admit $O(1)$-Lipschitz realizability, enabling learning with $n = O(mR \log m)$ samples. A matching $\Omega(mR \log m)$ lower bound via Fano's inequality establishes that hyperbolic representations…
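
The scaling claims in the abstract can be illustrated numerically. The sketch below is not the paper's proof; it uses a standard packing (pigeonhole) argument as a stand-in for the volumetric-collapse step. With $N = m^R$ leaves embedded in a Euclidean ball of radius $B$ in $\mathbb{R}^d$, some pair of leaves lies within distance $2B / (N^{1/d} - 1)$, so a target that separates that pair by a fixed gap needs a Lipschitz constant of at least $(N^{1/d} - 1) / (2B)$, which grows exponentially in $R$ for fixed $d$ and $B$. All function names, the embedding dimension `dim`, the radius `radius`, and the unit `gap` are assumptions made for illustration, and the hyperbolic quantity is only the $mR \log m$ scale with the unspecified constant omitted.

```python
import math


def log_num_leaves(m: int, R: int) -> float:
    """Natural log of the leaf count m**R for depth R and branching factor m."""
    return R * math.log(m)


def min_separation_upper_bound(m: int, R: int, dim: int, radius: float) -> float:
    """Packing bound (illustrative assumption, not the paper's argument):
    N points in a Euclidean ball of the given radius in R^dim contain a pair
    at distance at most 2 * radius / (N**(1/dim) - 1)."""
    root = math.exp(log_num_leaves(m, R) / dim)  # N**(1/dim), computed in log space
    if root <= 1.0:
        raise ValueError("need more than one leaf (m >= 2 and R >= 1)")
    return 2.0 * radius / (root - 1.0)


def lipschitz_lower_bound(
    m: int, R: int, dim: int, radius: float, gap: float = 1.0
) -> float:
    """Lipschitz constant needed if the target differs by `gap` on the closest
    pair of embedded leaves (the unit gap is a hypothetical choice)."""
    return gap / min_separation_upper_bound(m, R, dim, radius)


def hyperbolic_sample_scale(m: int, R: int) -> float:
    """The m * R * log(m) scale from the abstract, with the constant omitted."""
    return m * R * math.log(m)


if __name__ == "__main__":
    m, dim, radius = 2, 2, 1.0  # hypothetical settings for the comparison
    for R in (5, 10, 20, 40):
        euclid = lipschitz_lower_bound(m, R, dim, radius)
        hyper = hyperbolic_sample_scale(m, R)
        print(f"R={R:>2}  Euclidean Lipschitz >= {euclid:.3e}  m R log m = {hyper:.1f}")
```

Under these assumptions the Euclidean bound roughly squares each time $R$ doubles, while the hyperbolic sample scale only doubles, which is the qualitative gap the abstract describes.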

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms