Semantic Alignment in Hyperbolic Space for Open-Vocabulary Semantic Segmentation
Hoang M. Truong, Hai Nguyen-Truong, Dang Huynh

TL;DR
This paper introduces HyRo, a hyperbolic fine-tuning framework that improves open-vocabulary semantic segmentation by separately aligning hierarchical and semantic relationships in hyperbolic space, achieving state-of-the-art results.
Contribution
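HyRo decouples hierarchical alignment from semantic alignment in hyperbolic space: it aligns hierarchical levels by adjusting the hyperbolic radius and refines semantic relationships with a radius-preserving orthogonal transformation, improving dense prediction in open-vocabulary segmentation.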
Findings
HyRo outperforms previous methods on standard open-vocabulary semantic segmentation benchmarks.
Hyperbolic radius adjustment improves hierarchical level alignment.
Angular semantic refinement enhances semantic relationship modeling.
Abstract
Open-vocabulary semantic segmentation requires adapting image-level vision-language models such as CLIP to dense pixel-level prediction, which is challenging due to the mismatch between hierarchical structure and semantic alignment in the embedding space. While recent works leverage hyperbolic geometry to model hierarchical relationships, they align embeddings across hierarchical levels but overlook semantic misalignment among embeddings within the same level. In this work, we propose HyRo, a hyperbolic fine-tuning framework that decouples hierarchical and semantic alignment in the Poincaré ball model. HyRo aligns hierarchical levels by adjusting the hyperbolic radius and refines semantic relationships through angular alignment using an orthogonal transformation that theoretically preserves the hyperbolic radius. Experiments on standard open-vocabulary semantic segmentation benchmarks…
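The decoupling described in the abstract rests on a standard property of the Poincaré ball: a point's distance from the origin depends only on its Euclidean norm, so an orthogonal transformation changes its direction but not its hyperbolic radius. The NumPy sketch below illustrates that property and is not the authors' implementation. The curvature parameter `c`, the helper names (`hyperbolic_radius`, `set_hyperbolic_radius`, `random_orthogonal`), and the sample embeddings are all assumptions made for illustration.

```python
import numpy as np


def hyperbolic_radius(x, c=1.0):
    """Distance from the origin in a Poincare ball of curvature -c (c is an assumed parameter)."""
    norm = np.linalg.norm(x, axis=-1)
    return 2.0 / np.sqrt(c) * np.arctanh(np.sqrt(c) * norm)


def set_hyperbolic_radius(x, radius, c=1.0):
    """Rescale x along its own direction so that its hyperbolic radius equals `radius`."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    target_norm = np.tanh(np.sqrt(c) * radius / 2.0) / np.sqrt(c)
    return x / norm * target_norm


def random_orthogonal(dim, seed=0):
    """Hypothetical stand-in for a learned orthogonal map: the Q factor of a random matrix."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q


# Two made-up embeddings inside the unit ball (Euclidean norm < 1).
x = np.array([[0.3, 0.1, -0.2], [0.05, -0.4, 0.25]])

# Angular step: an orthogonal map changes directions but keeps the radius.
q = random_orthogonal(3)
rotated = x @ q.T

# Radial step: moving to a new radius keeps each direction unchanged.
moved = set_hyperbolic_radius(x, radius=1.5)

print(hyperbolic_radius(x))
print(hyperbolic_radius(rotated))
print(hyperbolic_radius(moved))
```

Because the two operations act on independent quantities (norm versus direction), they can be applied in either order without interfering, which is the sense in which radial and angular alignment can be treated separately.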
