Hyperbolic Distillation: Geometry-Guided Cross-Modal Transfer for Robust 3D Object Detection

Kanglin Ning; Wenrui Li; Houde Quan; Qifan Li; Xingtao Wang; Xiaopeng Fan

arXiv:2605.09899 · cs.CV · May 12, 2026

TL;DR

This paper introduces HGC-Det, a hyperbolic geometry-guided cross-modal distillation framework that enhances 3D object detection by effectively fusing point cloud and image features.

Contribution

The paper proposes a novel hyperbolic constrained distillation method with components for semantic-guided voxel optimization, hyperbolic feature transfer, and geometry-based feature aggregation.

Findings

01  Achieves improved detection accuracy on the SUN RGB-D, ARKitScenes, KITTI, and nuScenes datasets.

02  Effectively mitigates semantic loss during feature fusion.

03  Balances detection performance with computational efficiency.

Abstract

Cross-modal knowledge distillation has emerged as an effective strategy for integrating point cloud and image features in 3D perception tasks. However, modality heterogeneity, spatial misalignment, and the representation crisis of multiple modalities often limit the effectiveness of these cross-modal distillation methods. To address these limitations, we propose a hyperbolic constrained cross-modal distillation method for multimodal 3D object detection (HGC-Det). The HGC-Det framework includes an image branch and a point cloud branch that extract semantic features from the two modalities. The point cloud branch comprises three core components: a 2D semantic-guided voxel optimization component (SGVO), a hyperbolic geometry constrained cross-modal feature transfer component (HFT), and a feature aggregation-based geometry optimization component (FAGO).…
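To make the idea of a hyperbolic constraint on feature transfer more concrete, here is a minimal NumPy sketch of one common way to compare two feature sets in hyperbolic space: project both onto the Poincaré ball with the exponential map at the origin, then penalize their geodesic distance. This is an illustration only and is not taken from the paper. The abstract does not specify how HFT is formulated, so the choice of the Poincaré ball model, the curvature parameter `c`, and the function names `expmap0`, `poincare_distance`, and `hyperbolic_distill_loss` are all assumptions.

```python
import numpy as np


def expmap0(v, c=1.0, eps=1e-9):
    """Map Euclidean vectors (tangent space at the origin) onto the
    Poincare ball of curvature -c. Assumed model, not the paper's."""
    sqrt_c = np.sqrt(c)
    norm = np.maximum(np.linalg.norm(v, axis=-1, keepdims=True), eps)
    return np.tanh(sqrt_c * norm) * v / (sqrt_c * norm)


def poincare_distance(x, y, c=1.0, eps=1e-9):
    """Geodesic distance between points x and y inside the Poincare ball."""
    sq_diff = np.sum((x - y) ** 2, axis=-1)
    denom = (1.0 - c * np.sum(x ** 2, axis=-1)) * (1.0 - c * np.sum(y ** 2, axis=-1))
    # Clamp the denominator so points near the boundary stay finite.
    arg = 1.0 + 2.0 * c * sq_diff / np.maximum(denom, eps)
    return np.arccosh(arg) / np.sqrt(c)


def hyperbolic_distill_loss(student_feats, teacher_feats, c=1.0):
    """Hypothetical distillation loss: mean hyperbolic distance between
    paired student (e.g. point cloud) and teacher (e.g. image) features.
    Both inputs have shape (num_samples, feature_dim)."""
    s = expmap0(student_feats, c)
    t = expmap0(teacher_feats, c)
    return float(np.mean(poincare_distance(s, t, c)))
```

In a setup like this, `hyperbolic_distill_loss(point_feats, image_feats)` would be added to the detection loss so that the point cloud branch is pulled toward the image branch in hyperbolic rather than Euclidean space. A training implementation would use an autodiff framework instead of NumPy; the sketch only shows the geometry.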
