CLIDD: Cross-Layer Independent Deformable Description for Efficient and Discriminative Local Feature Representation

Haodi Yao; Fenghua He; Ning Hao; Yao Su

arXiv:2601.09230 · cs.CV · January 15, 2026

TL;DR

CLIDD introduces a novel cross-layer deformable descriptor that enhances local feature discriminativeness and efficiency, enabling real-time spatial intelligence applications with minimal computational resources.

Contribution

The paper presents CLIDD, a new method combining independent feature hierarchies and hardware-aware optimization for scalable, high-performance local feature descriptors.

Findings

01

Ultra-compact model matches SuperPoint accuracy with 0.004M parameters.

02

High-performance variant exceeds 200 FPS on edge devices.

03

Achieves superior matching accuracy and efficiency compared to state-of-the-art methods.

Abstract

Robust local feature representations are essential for spatial intelligence tasks such as robot navigation and augmented reality. Establishing reliable correspondences requires descriptors that provide both high discriminative power and computational efficiency. To address this, we introduce Cross-Layer Independent Deformable Description (CLIDD), a method that achieves superior distinctiveness by sampling directly from independent feature hierarchies. This approach utilizes learnable offsets to capture fine-grained structural details across scales while bypassing the computational burden of unified dense representations. To ensure real-time performance, we implement a hardware-aware kernel fusion strategy that maximizes inference throughput. Furthermore, we develop a scalable framework that integrates lightweight architectures with a training protocol leveraging both metric learning and…
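The sampling idea in the abstract can be illustrated with a minimal sketch: instead of first fusing all layers into one dense descriptor map, each keypoint reads a few bilinearly interpolated samples directly from every feature layer at offset positions. Everything below is an assumption made for illustration and not the authors' implementation: the function names `bilinear_sample` and `cross_layer_descriptor`, the fixed offsets (the paper describes them as learnable), and the use of plain concatenation followed by L2 normalization to combine the samples.

```python
import math

def bilinear_sample(fmap, x, y):
    """Bilinearly interpolate a C-channel vector from fmap[row][col] at (x, y)."""
    h, w = len(fmap), len(fmap[0])
    # Clamp so an offset can never sample outside the feature map.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    channels = len(fmap[0][0])
    return [
        (1 - ay) * ((1 - ax) * fmap[y0][x0][c] + ax * fmap[y0][x1][c])
        + ay * ((1 - ax) * fmap[y1][x0][c] + ax * fmap[y1][x1][c])
        for c in range(channels)
    ]

def cross_layer_descriptor(layers, strides, keypoint, offsets):
    """Build one descriptor by sampling each layer independently.

    layers   : list of feature maps, each indexed as fmap[row][col][channel]
    strides  : downsampling factor of each layer relative to the image
    keypoint : (x, y) in image coordinates
    offsets  : per-layer list of (dx, dy) in that layer's own pixel units;
               fixed here, whereas the paper describes learnable offsets
    """
    x, y = keypoint
    desc = []
    for fmap, stride, layer_offsets in zip(layers, strides, offsets):
        cx, cy = x / stride, y / stride  # keypoint in this layer's coordinates
        for dx, dy in layer_offsets:
            desc.extend(bilinear_sample(fmap, cx + dx, cy + dy))
    # Concatenation plus L2 normalization is an assumed way to combine samples.
    norm = math.sqrt(sum(v * v for v in desc))
    return [v / norm for v in desc] if norm > 0 else desc

def make_map(h, w, scale):
    """Toy 2-channel map whose channels encode (scale * col, scale * row)."""
    return [[[scale * col, scale * row] for col in range(w)] for row in range(h)]

# Two toy layers: full resolution (stride 1) and half resolution (stride 2).
layers = [make_map(8, 8, 1.0), make_map(4, 4, 2.0)]
strides = [1, 2]
offsets = [[(0.0, 0.0), (0.5, 0.0)], [(0.0, 0.0)]]
desc = cross_layer_descriptor(layers, strides, (2.0, 4.0), offsets)
```

Because each keypoint touches only a handful of positions per layer, the cost scales with the number of keypoints and samples rather than with the size of a fused dense map, which is the efficiency argument the abstract makes. The hardware-aware kernel fusion mentioned there is not represented in this sketch.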

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Multimodal Machine Learning Applications