Hyperbolic Fine-Tuning for Large Language Models

Menglin Yang; Ram Samarth B B; Aosong Feng; Bo Xiong; Jihong Liu; Irwin King; Rex Ying

arXiv:2410.04010 · cs.LG · February 9, 2026

TL;DR

This paper explores the hyperbolic geometry of language model embeddings, revealing hierarchical structures, and introduces HypLoRA, a hyperbolic fine-tuning method that enhances large language model performance on reasoning tasks.

Contribution

The paper uncovers hyperbolic properties in LLM embeddings and proposes HypLoRA, a novel hyperbolic fine-tuning approach that leverages these structures for improved performance.

Findings

01. Token embeddings exhibit hyperbolic characteristics.

02. HypLoRA outperforms traditional fine-tuning methods.

03. HypLoRA improves performance on reasoning benchmarks.

Abstract

Large language models (LLMs) have demonstrated remarkable performance across various tasks. However, it remains an open question whether the default Euclidean space is the most suitable choice for LLMs. In this study, we investigate the geometric characteristics of LLMs, focusing specifically on tokens and their embeddings. Our findings reveal that token frequency follows a power-law distribution, where high-frequency tokens (e.g., "the", "that") constitute the minority, while low-frequency tokens (e.g., "apple", "dog") constitute the majority. Furthermore, high-frequency tokens cluster near the origin, whereas low-frequency tokens are positioned farther away in the embedding space. Additionally, token embeddings exhibit hyperbolic characteristics, indicating a latent tree-like structure within the embedding space. Motivated by these observations, we propose HypLoRA, an efficient fine-tuning…
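The abstract excerpt does not say how "hyperbolic characteristics" are measured. One common way to quantify how tree-like a set of points is uses Gromov δ-hyperbolicity via the four-point condition, where values near zero indicate a tree-like metric. The sketch below is an illustration under that assumption, not the authors' procedure; the function names `pairwise_distances`, `gromov_delta`, and `relative_delta`, the Euclidean distance, and the random sampling of quadruples are all choices made here for the example.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix for an (n, d) array of points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def gromov_delta(dist, n_samples=2000, seed=0):
    """Estimate Gromov delta from sampled quadruples (four-point condition).

    For points x, y, z, w, form the three pairwise-sum pairings and take
    half the gap between the largest and second-largest sum. The estimate
    is the maximum of that gap over the sampled quadruples, so it is a
    lower bound on the true delta of the point set.
    """
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    worst = 0.0
    for _ in range(n_samples):
        x, y, z, w = rng.choice(n, size=4, replace=False)
        sums = sorted([
            dist[x, y] + dist[z, w],
            dist[x, z] + dist[y, w],
            dist[x, w] + dist[y, z],
        ])
        worst = max(worst, (sums[2] - sums[1]) / 2.0)
    return worst


def relative_delta(dist, n_samples=2000, seed=0):
    """Scale-free variant: 2 * delta / diameter (assumes diameter > 0)."""
    return 2.0 * gromov_delta(dist, n_samples, seed) / dist.max()


if __name__ == "__main__":
    # Stand-in for an embedding matrix; real token embeddings would be
    # loaded from a model instead of sampled at random.
    fake_embeddings = np.random.default_rng(1).normal(size=(200, 16))
    print(relative_delta(pairwise_distances(fake_embeddings)))
```

Points on a line form a tree metric, so the estimate is zero for them, while the four corners of a unit square give a strictly positive value. Applied to real token embeddings, a small relative value would be consistent with the tree-like structure the abstract describes.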

Code & Models

Repositories

marlin-codes/HypLLM (PyTorch, official)

Taxonomy

Topics: Topic Modeling