TL;DR
This paper introduces a geometry-aware, adaptive distance measure for hyperbolic space learning that dynamically adjusts to diverse hierarchical data structures, improving performance across various classification tasks.
Contribution
It proposes a novel adaptive distance measure with tailored projections and curvatures, along with a low-rank decomposition scheme and theoretical error bounds, enhancing hyperbolic learning for diverse hierarchies.
Findings
Outperforms fixed-distance hyperbolic methods on standard datasets
Achieves over 5% accuracy gains in few-shot learning tasks
Provides clearer class boundaries and better prototype separation
Abstract
Learning in hyperbolic spaces has attracted increasing attention due to its superior ability to model hierarchical structures of data. Most existing hyperbolic learning methods use fixed distance measures for all data, assuming a uniform hierarchy across all data points. However, real-world hierarchical structures exhibit significant diversity, making this assumption overly restrictive. In this paper, we propose a geometry-aware distance measure in hyperbolic spaces, which dynamically adapts to varying hierarchical structures. Our approach derives the distance measure by generating tailored projections and curvatures for each pair of data points, effectively mapping them to an appropriate hyperbolic space. We introduce a revised low-rank decomposition scheme and a hard-pair mining mechanism to mitigate the computational cost of pair-wise distance computation without compromising…
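To make the idea of a pair-specific distance concrete, here is a minimal sketch under stated assumptions. It uses the standard Poincaré-ball distance with curvature -c and assumes that each pair of points is first projected by a matrix `M` and then mapped onto the ball. The function names (`mobius_add`, `expmap0`, `poincare_dist`, `adaptive_dist`) are illustrative, and `M` and `c` are passed in by hand because the abstract does not specify how the projection and curvature are generated. This is not the authors' implementation.

```python
import numpy as np

def mobius_add(x, y, c):
    """Mobius addition on the Poincare ball of curvature -c (c > 0)."""
    xy, x2, y2 = np.dot(x, y), np.dot(x, x), np.dot(y, y)
    num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
    den = 1 + 2 * c * xy + c**2 * x2 * y2
    return num / den

def expmap0(v, c):
    """Exponential map at the origin: Euclidean vector -> point on the ball."""
    v = np.asarray(v, dtype=float)
    norm = np.linalg.norm(v)
    if norm < 1e-12:
        return np.zeros_like(v)
    return np.tanh(np.sqrt(c) * norm) * v / (np.sqrt(c) * norm)

def poincare_dist(x, y, c):
    """Geodesic distance between two points on the ball of curvature -c."""
    diff = mobius_add(-x, y, c)
    # Clip to keep arctanh finite for points numerically on the boundary.
    arg = min(np.sqrt(c) * np.linalg.norm(diff), 1 - 1e-7)
    return 2.0 / np.sqrt(c) * np.arctanh(arg)

def adaptive_dist(u, v, M, c):
    """Hypothetical pair-specific distance: project both points with M,
    map them onto the ball of curvature -c, then measure the geodesic.
    M and c are assumed to be supplied per pair by some learned generator."""
    x = expmap0(M @ np.asarray(u, dtype=float), c)
    y = expmap0(M @ np.asarray(v, dtype=float), c)
    return poincare_dist(x, y, c)

# Usage: the same pair of features under two assumed (M, c) settings.
u = np.array([0.2, -0.1, 0.4])
v = np.array([-0.3, 0.5, 0.1])
M_a = np.eye(3)                      # identity projection
M_b = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0]])    # projection onto the first two coordinates
print(adaptive_dist(u, v, M_a, c=1.0))
print(adaptive_dist(u, v, M_b, c=0.1))
```

A fixed-distance method corresponds to using the same `M` and `c` for every pair; the adaptive variant described in the abstract would instead vary them from pair to pair.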
Peer Reviews
Decision: ICLR 2025 Conference Withdrawn Submission
## Strengths
1. The key motivation of the paper to adaptively learn distance metrics for different hierarchies is an interesting and relevant problem for the hyperbolic learning community.
2. The proposed solution is novel and intuitive.
3. Theoretical analysis for the method is provided using the Talagrand concentration inequality which is helpful for further analysis in the field.
4. Visualizations on different datasets are helpful in understanding the effectiveness of the learning al
## Weaknesses
1. The authors mention in Section 2.3 (L141-147) that their work is inspired by the Gu et al., 2019 paper (Learning mixed curvature representations in Products of Model Spaces). The nature of the work is similar in terms of learning adaptive curvatures; however, the authors do not provide a comparison with this method in their experiments.
2. There are several details about the experimental setup and comparisons which are unclear from the main paper and the Appendix. The class
1. The motivation of the paper, considering the diversity of the hierarchical structures of data, is nice.
2. As far as I know, the proposed model is novel.
Overall, despite the interesting starting motivations, the current manuscript fails to include the necessary definitions for readers to understand the Authors' idea. Each of the following points is, regrettably, fatal and also prevents us from judging the validity of other parts, such as the experiments, since experiments are generally conducted to verify whether the Authors' idea is correct.
1. The specific forms of the matrix generator $g_{t}$ and curvature generator $g_{c}$ are not
1. Dynamic learning of various hierarchical structures in hyperbolic space is the core innovation of this work.
2. The article is logically coherent, with rigorous expression and solid theoretical derivation.
1. There are several inaccuracies in the mathematical expressions throughout the article that require careful review and correction by the authors.
2. How was the hierarchical structure of the dataset used in the experiments generated?
3. In Table 3, the absence of the latest comparison algorithms from this year diminishes the persuasive power of the experiments.
Overall quality is good, and the writing is clear. I appreciate the authors providing extensive experiments to demonstrate the effectiveness of their methodology, which adds to the soundness of their approach. The novelty is fine for the topic of learning hyperbolic embeddings. The theoretical analysis of the low-rank approximation will be a novel framework in this field.
- The motivation of some operations is questionable:
  - Why is learning projection matrix M_ij necessary when from Fig. 1, the main difference between (c) and (d) is the curvature? Could you provide a more detailed justification?
  - How to compare with other methods with curvature c as a single learnable parameter? How to compare with other methods with curvature c as a tunable hyperparameter? I suggest authors include these comparisons in the experimental section (or feel free to point these
