SpatioTemporal Difference Network for Video Depth Super-Resolution
Zhengxue Wang, Yuan Wu, Xiang Li, Zhiqiang Yan, Jian Yang

TL;DR
This paper introduces STDNet, a network for video depth super-resolution that targets the long-tailed errors concentrated in spatially non-smooth regions and temporal variation zones. It does so with spatial and temporal difference mechanisms, improving reconstruction quality.
Contribution
The paper proposes a SpatioTemporal Difference Network (STDNet) whose spatial difference branch and temporal difference branch mitigate long-tailed effects in video depth super-resolution.
Findings
Outperforms existing methods on multiple datasets.
Effectively mitigates long-tailed spatial and temporal effects.
Enhances depth reconstruction accuracy.
Abstract
Depth super-resolution has achieved impressive performance, and the incorporation of multi-frame information further enhances reconstruction quality. Nevertheless, statistical analyses reveal that video depth super-resolution remains affected by pronounced long-tailed distributions, with the long-tailed effects primarily manifesting in spatial non-smooth regions and temporal variation zones. To address these challenges, we propose a novel SpatioTemporal Difference Network (STDNet) comprising two core branches: a spatial difference branch and a temporal difference branch. In the spatial difference branch, we introduce a spatial difference mechanism to mitigate the long-tailed issues in spatial non-smooth regions. This mechanism dynamically aligns RGB features with learned spatial difference representations, enabling intra-frame RGB-D aggregation for depth calibration. In the temporal…
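To make the two regions named in the abstract concrete, here is a minimal sketch that computes hand-crafted spatial and temporal difference maps on a toy depth video. It is only an illustration of where "spatial non-smooth regions" and "temporal variation zones" lie; it is not STDNet's learned difference mechanism, and the function names (`spatial_difference`, `temporal_difference`, `toy_depth_video`) and the forward-difference formulation are assumptions made for this example.

```python
import math


def spatial_difference(frame):
    """Gradient magnitude of one depth frame using forward differences.

    Illustrative assumption: a fixed finite-difference operator stands in
    for the learned spatial difference representation described in the abstract.
    `frame` is a list of rows of floats; border pixels use a zero difference.
    """
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = frame[y][x + 1] - frame[y][x] if x + 1 < w else 0.0
            dy = frame[y + 1][x] - frame[y][x] if y + 1 < h else 0.0
            out[y][x] = math.hypot(dx, dy)
    return out


def temporal_difference(prev_frame, frame):
    """Absolute per-pixel change between two consecutive depth frames."""
    return [
        [abs(cur - prev) for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(prev_frame, frame)
    ]


def toy_depth_video():
    """Two 4x4 frames: a vertical depth edge, plus one pixel that changes over time."""
    frame0 = [[1.0, 1.0, 3.0, 3.0] for _ in range(4)]
    frame1 = [row[:] for row in frame0]
    frame1[0][0] = 2.0  # the only temporal change
    return [frame0, frame1]


if __name__ == "__main__":
    video = toy_depth_video()
    print(spatial_difference(video[0]))
    print(temporal_difference(video[0], video[1]))
```

In this toy input the spatial map is nonzero only along the depth edge and the temporal map is nonzero only at the pixel that changed, which mirrors the small fraction of pixels where the abstract says long-tailed errors concentrate.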
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Video Coding and Compression Technologies
