CLIDD: Cross-Layer Independent Deformable Description for Efficient and Discriminative Local Feature Representation
Haodi Yao, Fenghua He, Ning Hao, Yao Su

TL;DR
CLIDD is a cross-layer deformable descriptor that improves both the discriminative power and the efficiency of local features, enabling real-time spatial intelligence applications with minimal computational resources.
Contribution
The paper presents CLIDD, a descriptor method that samples directly from independent feature hierarchies using learnable offsets and pairs this with hardware-aware kernel fusion, yielding scalable, high-performance local feature descriptors.
Findings
The ultra-compact model matches SuperPoint accuracy with 0.004M parameters.
The high-performance variant exceeds 200 FPS on edge devices.
The method is reported to achieve better matching accuracy and efficiency than state-of-the-art methods.
Abstract
Robust local feature representations are essential for spatial intelligence tasks such as robot navigation and augmented reality. Establishing reliable correspondences requires descriptors that provide both high discriminative power and computational efficiency. To address this, we introduce Cross-Layer Independent Deformable Description (CLIDD), a method that achieves superior distinctiveness by sampling directly from independent feature hierarchies. This approach utilizes learnable offsets to capture fine-grained structural details across scales while bypassing the computational burden of unified dense representations. To ensure real-time performance, we implement a hardware-aware kernel fusion strategy that maximizes inference throughput. Furthermore, we develop a scalable framework that integrates lightweight architectures with a training protocol leveraging both metric learning and…
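The core idea in the abstract, sampling a descriptor directly from several independent feature maps at learned offsets rather than first building one unified dense map, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`bilinear_sample`, `cross_layer_descriptor`), the fixed offset arrays standing in for learned offsets, and the concatenate-then-normalize aggregation are all assumptions made for illustration.

```python
import numpy as np

def bilinear_sample(fmap, x, y):
    """Bilinearly interpolate a (C, H, W) feature map at continuous coords (x, y)."""
    _, H, W = fmap.shape
    # Clamp to the valid range so offsets pointing outside the map stay defined.
    x = float(np.clip(x, 0, W - 1))
    y = float(np.clip(y, 0, H - 1))
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    top = (1 - wx) * fmap[:, y0, x0] + wx * fmap[:, y0, x1]
    bot = (1 - wx) * fmap[:, y1, x0] + wx * fmap[:, y1, x1]
    return (1 - wy) * top + wy * bot

def cross_layer_descriptor(fmaps, strides, offsets, keypoint):
    """Build one descriptor by sampling each layer independently.

    fmaps:    list of (C_l, H_l, W_l) arrays, one per layer
    strides:  downsampling factor of each layer relative to the image
    offsets:  list of (K, 2) arrays of (dx, dy) in layer pixels; in a trained
              model these would be predicted, here they are plain inputs (assumption)
    keypoint: (x, y) in image coordinates
    """
    kx, ky = keypoint
    parts = []
    for fmap, stride, offs in zip(fmaps, strides, offsets):
        cx, cy = kx / stride, ky / stride  # keypoint in this layer's coordinates
        for dx, dy in offs:
            parts.append(bilinear_sample(fmap, cx + dx, cy + dy))
    desc = np.concatenate(parts)  # assumed aggregation: simple concatenation
    return desc / (np.linalg.norm(desc) + 1e-8)  # L2-normalize for matching

# Toy usage with random features from two layers of different resolution.
rng = np.random.default_rng(0)
fmaps = [rng.standard_normal((8, 32, 32)), rng.standard_normal((16, 16, 16))]
strides = [1, 2]
offsets = [rng.uniform(-1, 1, (4, 2)), rng.uniform(-1, 1, (4, 2))]
desc = cross_layer_descriptor(fmaps, strides, offsets, keypoint=(10.5, 20.25))
print(desc.shape)  # (96,) = 4 samples * 8 channels + 4 samples * 16 channels
```

Only a handful of points per keypoint are read from each layer, so the cost scales with the number of keypoints and offsets rather than with the size of a fused dense feature map, which is the efficiency argument the abstract makes.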
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Multimodal Machine Learning Applications
