SpatioTemporal Difference Network for Video Depth Super-Resolution

Zhengxue Wang; Yuan Wu; Xiang Li; Zhiqiang Yan; Jian Yang

arXiv:2508.01259 · cs.CV · November 12, 2025

TL;DR

This paper introduces STDNet, a network for video depth super-resolution. It targets the long-tailed effects that concentrate in spatially non-smooth regions and temporal variation zones, using spatial and temporal difference mechanisms to improve reconstruction quality.

Contribution

The paper proposes a SpatioTemporal Difference Network (STDNet) with two branches, a spatial difference branch and a temporal difference branch, to mitigate long-tailed effects in video depth super-resolution.

Findings

01. Outperforms existing methods on multiple datasets.

02. Effectively mitigates long-tailed spatial and temporal effects.

03. Enhances depth reconstruction accuracy.

Abstract

Depth super-resolution has achieved impressive performance, and the incorporation of multi-frame information further enhances reconstruction quality. Nevertheless, statistical analyses reveal that video depth super-resolution remains affected by pronounced long-tailed distributions, with the long-tailed effects primarily manifesting in spatial non-smooth regions and temporal variation zones. To address these challenges, we propose a novel SpatioTemporal Difference Network (STDNet) comprising two core branches: a spatial difference branch and a temporal difference branch. In the spatial difference branch, we introduce a spatial difference mechanism to mitigate the long-tailed issues in spatial non-smooth regions. This mechanism dynamically aligns RGB features with learned spatial difference representations, enabling intra-frame RGB-D aggregation for depth calibration. In the temporal…
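
The abstract locates the long-tailed effects in spatial non-smooth regions and temporal variation zones. As a rough illustration of what those two kinds of regions look like in a depth video, the sketch below computes hand-crafted finite differences with NumPy. This is an assumption made for illustration only: STDNet uses learned difference representations, and the function names `spatial_difference` and `temporal_difference` are hypothetical, not taken from the paper.

```python
import numpy as np


def spatial_difference(depth):
    """Per-frame gradient magnitude from forward finite differences.

    depth: array of shape (T, H, W). Large values mark spatially
    non-smooth regions such as depth edges. The last row and column
    have no forward neighbour, so their difference stays zero.
    """
    dy = np.zeros_like(depth)
    dx = np.zeros_like(depth)
    dy[:, :-1, :] = depth[:, 1:, :] - depth[:, :-1, :]
    dx[:, :, :-1] = depth[:, :, 1:] - depth[:, :, :-1]
    return np.sqrt(dx ** 2 + dy ** 2)


def temporal_difference(depth):
    """Absolute change between consecutive frames.

    The first frame has no predecessor, so its difference is zero.
    Large values mark temporal variation zones.
    """
    dt = np.zeros_like(depth)
    dt[1:] = np.abs(depth[1:] - depth[:-1])
    return dt


# Toy two-frame depth video (hypothetical data, not from the paper).
video = np.ones((2, 4, 4), dtype=np.float32)
video[:, :, 2:] = 3.0   # vertical depth edge between columns 1 and 2
video[1, 0, 0] = 2.0    # one pixel changes between frame 0 and frame 1

spatial = spatial_difference(video)
temporal = temporal_difference(video)
```

In this toy input the spatial map responds along the depth edge and the temporal map responds only at the pixel that changed, which are the two kinds of regions the abstract singles out.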

Taxonomy

Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Video Coding and Compression Technologies