DPVO-QAT++: Heterogeneous QAT and CUDA Kernel Fusion for High-Performance Deep Patch Visual Odometry

Cheng Liao

arXiv:2511.12653·cs.CV·November 18, 2025

Open Access

TL;DR

This paper introduces DPVO-QAT++, a framework that combines heterogeneous quantization and CUDA kernel fusion to significantly accelerate deep visual odometry while maintaining accuracy, enabling deployment on resource-limited platforms.

Contribution

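The paper proposes a hierarchical quantization framework that pairs a heterogeneous precision design (fake-quantized front-end, full-precision back-end) with fused CUDA kernels for fake quantization. The stated goal is to reduce memory footprint and increase processing speed in deep patch visual odometry while preserving the trajectory accuracy of the original model.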

Findings

01

52.1% FPS increase on the TartanAir dataset

02

29.1% median latency reduction on TartanAir

03

64.9% GPU memory reduction on TartanAir
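
The three percentages above measure different things, so they are not expected to match one another. As a rough check, the sketch below converts a throughput gain into the average per-frame time reduction it implies, and a memory reduction into a footprint ratio. It assumes FPS is computed as frames divided by total time, which this page does not state; the function names are illustrative only.

```python
def implied_time_reduction(fps_gain_pct: float) -> float:
    """Average per-frame time reduction (%) implied by an FPS gain (%).

    Assumes FPS = frames / total time, so time per frame scales as 1 / FPS.
    """
    speedup = 1.0 + fps_gain_pct / 100.0
    return (1.0 - 1.0 / speedup) * 100.0


def footprint_ratio(memory_reduction_pct: float) -> float:
    """How many times smaller the footprint is after a given reduction (%)."""
    remaining = 1.0 - memory_reduction_pct / 100.0
    return 1.0 / remaining


# Figures reported on this page for TartanAir.
mean_time_cut = implied_time_reduction(52.1)   # roughly 34%
memory_factor = footprint_ratio(64.9)          # roughly 2.8x smaller
print(f"implied mean time reduction: {mean_time_cut:.1f}%")
print(f"memory footprint ratio: {memory_factor:.2f}x")
```

Under that assumption, a 52.1% FPS gain implies a mean per-frame time reduction of roughly 34%, which need not equal the reported 29.1% median latency reduction, because a median is insensitive to the slow outlier frames that pull down average throughput.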

Abstract

Deep learning-based Visual SLAM (vSLAM) systems exhibit exceptional geometric reasoning capabilities, yet their prohibitive computational overhead severely restricts deployment on resource-constrained autonomous platforms. This paper presents DPVO-QAT++, a hierarchical quantization optimization framework for Deep Patch Visual Odometry. The framework integrates three components: learnable scale parameterization; a heterogeneous precision design for the Visual Odometry (VO) front-end and back-end (front-end floating-point fake quantization with FP16/FP32; back-end full precision); and GPU-native kernel fusion for fake quantization (custom CUDA kernels). Together these significantly reduce memory footprint and increase processing speed while preserving the trajectory accuracy of the original model. On the TartanAir…
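
The abstract names fake quantization with a learnable scale but this excerpt gives no formulas. The sketch below shows one common scalar formulation: quantize, clamp, and dequantize, with a straight-through estimator for the input gradient and an LSQ-style gradient for the scale. It is not the paper's implementation. The class name, the 8-bit signed range, and the choice of LSQ-style gradients are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FakeQuantizer:
    """Illustrative scalar fake quantizer; every name here is hypothetical."""

    scale: float        # step size; the parameter that would be learned in QAT
    num_bits: int = 8   # assumed bit width, not stated in the excerpt

    def bounds(self) -> tuple[int, int]:
        # Signed integer grid, e.g. [-128, 127] for 8 bits.
        qmax = 2 ** (self.num_bits - 1) - 1
        return -qmax - 1, qmax

    def forward(self, x: float) -> float:
        # Quantize, clamp, then dequantize, so the result is still a float.
        # Python's round() rounds exact halves to the nearest even integer.
        qmin, qmax = self.bounds()
        q = max(qmin, min(qmax, round(x / self.scale)))
        return q * self.scale

    def grad_input(self, x: float) -> float:
        # Straight-through estimator: pass the gradient inside the clamp
        # range and block it outside.
        qmin, qmax = self.bounds()
        return 1.0 if qmin <= x / self.scale <= qmax else 0.0

    def grad_scale(self, x: float) -> float:
        # LSQ-style derivative of the output with respect to the scale
        # (an assumed choice; the excerpt does not specify one).
        qmin, qmax = self.bounds()
        v = x / self.scale
        if v < qmin:
            return float(qmin)
        if v > qmax:
            return float(qmax)
        return round(v) - v


fq = FakeQuantizer(scale=0.5)
print(fq.forward(1.3))      # 1.5  (1.3 / 0.5 = 2.6 rounds to 3, then 3 * 0.5)
print(fq.forward(1000.0))   # 63.5 (clamped to 127, then 127 * 0.5)
```

In a training framework each of these steps would typically be a separate elementwise GPU operation. Fusing them into a single kernel, as the abstract describes, avoids writing intermediate tensors to memory between steps.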

Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques