TL;DR
PySIFT is a fully GPU-resident, deterministic implementation of classical SIFT that outperforms learned descriptors in accuracy and speed, enabling better integration with deep learning vision pipelines.
Contribution
The paper introduces PySIFT, presented as the first fully GPU-resident, modular, deterministic SIFT implementation, which enables controlled classical-vs-learned feature ablations.
Findings
PySIFT achieves higher accuracy than OpenCV SIFT on HPatches.
PySIFT is 383 ms faster per pair on MegaDepth.
PySIFT provides bitwise deterministic outputs across runs and architectures.
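The bitwise-determinism finding can be checked in principle by hashing the raw bytes of the extractor's output across repeated runs and comparing digests. The sketch below is illustrative only and uses the Python standard library. `fingerprint`, `is_bitwise_deterministic`, and `toy_extract` are hypothetical names, not part of PySIFT, and the toy extractor stands in for a real feature extractor.

```python
import hashlib
import struct


def fingerprint(values):
    """SHA-256 digest of the exact little-endian float32 bit patterns of `values`."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return hashlib.sha256(raw).hexdigest()


def is_bitwise_deterministic(extract, image, runs=3):
    """Run `extract` on the same input `runs` times and compare output digests."""
    digests = {fingerprint(extract(image)) for _ in range(runs)}
    return len(digests) == 1


def toy_extract(image):
    # Hypothetical stand-in for a feature extractor: one mean value per image row.
    return [float(sum(row)) / len(row) for row in image]


image = [[1, 2, 3], [4, 5, 6]]
print(is_bitwise_deterministic(toy_extract, image))
```

Comparing digests rather than using a tolerance-based comparison is deliberate: any single differing bit changes the hash, which is the property "bitwise deterministic" refers to. Checking determinism across architectures would require running the same comparison on each device and comparing the stored digests.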
Abstract
A widespread assumption in local feature research holds that classical handcrafted descriptors are accuracy-limited relics best replaced by learned alternatives. We show this is wrong. Through an 8-configuration ablation spanning four benchmarks (HPatches, ROxford5K, IMC Phototourism, MegaDepth), we demonstrate that classical SIFT with DSP multi-scale pooling outperforms neural descriptor and orientation replacements (HardNet, OriNet) on every accuracy metric while running 2–18× faster, and that learned matchers (LightGlue) complement rather than supersede classical features. The conclusion reframes a decade of work: not "replace SIFT" but "compose with SIFT," that is, classical extraction paired with learned matching only where geometric context demands it. This finding was invisible because no prior GPU SIFT kept the complete pipeline in VRAM or offered modularity for controlled…
