Pip-Stereo: Progressive Iterations Pruner for Iterative Optimization based Stereo Matching
Jintu Zheng, Qizhe Liu, HuangXin Xu, Zhuojie Chen

TL;DR
Pip-Stereo introduces a novel, efficient stereo matching framework that reduces redundant computations and leverages hardware-aware RNNs, enabling real-time high-fidelity depth estimation on edge devices with minimal accuracy loss.
Contribution
The paper proposes a progressive iteration pruning strategy, a monocular prior transfer framework, and FlashGRU, a hardware-aware RNN operator, to significantly improve efficiency and real-time performance of iterative stereo matching.
Findings
Achieves 75ms processing on NVIDIA Jetson Orin NX at 640x320 resolution.
Reduces memory peak by 76.6% and memory requests by 80.9% compared to native ConvGRUs.
Maintains high accuracy comparable to large iterative models.
Abstract
While iterative stereo matching achieves high accuracy, its dependence on Recurrent Neural Networks (RNN) hinders edge deployment, a challenge underexplored in existing research. We analyze iterative refinement and reveal that disparity updates are spatially sparse and temporally redundant. First, we introduce a progressive iteration pruning strategy that suppresses redundant update steps, effectively collapsing the recursive computation into a near-single-pass inference. Second, we propose a collaborative monocular prior transfer framework that implicitly embeds depth priors without requiring a dedicated monocular encoder, thereby eliminating its associated computational burden. Third, we develop FlashGRU, a hardware-aware RNN operator leveraging structured sparsity and I/O-conscious design, achieving a 7.28× speedup, 76.6% memory peak reduction and 80.9% global memory…
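To make the "spatially sparse and temporally redundant" observation concrete, the sketch below measures both properties on a toy sequence of per-iteration disparity updates. It is an illustration only and not the paper's pruning strategy: the function names (`active_fraction`, `redundant_from`, `toy_deltas`), the thresholds `eps` and `max_active`, and the toy data are all assumptions introduced here.

```python
from typing import List

Grid = List[List[float]]  # one disparity-update map, row-major


def active_fraction(delta: Grid, eps: float = 0.1) -> float:
    """Spatial sparsity: fraction of pixels whose update magnitude exceeds eps."""
    total = sum(len(row) for row in delta)
    active = sum(1 for row in delta for v in row if abs(v) > eps)
    return active / total


def redundant_from(deltas: List[Grid], eps: float = 0.1,
                   max_active: float = 0.1) -> int:
    """Temporal redundancy: first iteration index from which every remaining
    update touches at most max_active of the pixels. Returns len(deltas)
    when no such trailing run exists."""
    cut = len(deltas)
    for i in range(len(deltas) - 1, -1, -1):
        if active_fraction(deltas[i], eps) <= max_active:
            cut = i
        else:
            break
    return cut


def toy_deltas(size: int = 4) -> List[Grid]:
    """Hypothetical 4x4 update maps that shrink over five iterations."""
    def full(value: float) -> Grid:
        return [[value] * size for _ in range(size)]

    first_row_only = full(0.0)
    first_row_only[0] = [0.5] * size      # only one row still changes
    single_pixel = full(0.0)
    single_pixel[0][0] = 0.2              # only one pixel still changes
    return [full(1.0), first_row_only, single_pixel, full(0.01), full(0.0)]


if __name__ == "__main__":
    deltas = toy_deltas()
    for t, d in enumerate(deltas):
        print(f"iteration {t}: active fraction = {active_fraction(d):.4f}")
    print("iterations from index", redundant_from(deltas), "onward are near-empty")
```

In this toy data the updates from iteration 2 onward change at most one pixel in sixteen, which is the kind of trailing, mostly idle computation that an iteration-pruning scheme would aim to skip.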
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques
