Weighted Reverse Convolution for Feature Upsampling
Wentong Li, Zhiyuan Qi, Zichen Zhao, Kai Zhang, Lei Zhang

TL;DR
This paper introduces Weighted Reverse Convolution (WRC), a spatially adaptive inverse operator for feature upsampling in vision models that improves dense feature quality across multiple tasks while remaining computationally efficient.
Contribution
The paper proposes WRC, an upsampling method that treats feature densification in vision models as an inverse problem and solves it in closed form via the FFT.
Findings
WRC improves dense feature quality in segmentation, depth estimation, and keypoint correspondence.
WRC maintains high computational efficiency and is a practical drop-in operator.
WRC consistently outperforms existing upsampling methods across various benchmarks.
Abstract
Pre-trained vision foundation models (VFMs) provide strong semantic representations, yet their patch-level features are inherently coarse, limiting their effectiveness on tasks requiring fine-grained localization, dense prediction, and point-wise correspondence. In this work, we revisit feature upsampling for VFMs from the perspective of an inverse problem and propose Weighted Reverse Convolution (WRC), a spatially adaptive inverse operator for densifying high-level visual descriptors. Specifically, we formulate feature upsampling as a weighted Tikhonov-regularized least-squares problem, where spatially varying weights modulate both data fidelity and prior strength at each spatial location. This allows WRC to adapt the reconstruction to spatially varying feature characteristics, thereby preserving critical structures while mitigating over-smoothing. Moreover, WRC retains…
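To make the weighted Tikhonov formulation concrete, here is a minimal 1-D sketch of a generic textbook weighted Tikhonov-regularized least-squares upsampler. It is not the paper's operator: the degradation model (average pooling), the linear-interpolation prior, the dense linear solve (in place of the FFT solution), and all names (`avg_pool_matrix`, `linear_prior`, `weighted_tikhonov_upsample`, `w_data`, `w_prior`, `lam`) are assumptions chosen for illustration. It minimizes sum_i w_data[i] * (A x - y)_i^2 + lam * sum_j w_prior[j] * (x - x0)_j^2, whose minimizer satisfies the normal equations (A^T W A + lam M) x = A^T W y + lam M x0.

```python
import numpy as np


def avg_pool_matrix(m, scale):
    """Assumed degradation A: average-pool n = m * scale samples down to m."""
    return np.kron(np.eye(m), np.full((1, scale), 1.0 / scale))


def linear_prior(y, scale):
    """Assumed prior x0: linear interpolation of y onto the fine grid."""
    m = y.shape[0]
    # Centre of each coarse sample expressed in fine-grid coordinates.
    centers = (np.arange(m) + 0.5) * scale - 0.5
    return np.interp(np.arange(m * scale), centers, y)


def weighted_tikhonov_upsample(y, scale, w_data, w_prior, lam=0.1):
    """Solve the weighted Tikhonov problem with a dense linear solve.

    w_data  : per-coarse-sample data-fidelity weights, shape (m,)
    w_prior : per-fine-sample prior weights, shape (m * scale,), all > 0
    """
    m = y.shape[0]
    A = avg_pool_matrix(m, scale)
    x0 = linear_prior(y, scale)
    W = np.diag(w_data)
    M = np.diag(w_prior)
    lhs = A.T @ W @ A + lam * M        # positive definite when lam, w_prior > 0
    rhs = A.T @ W @ y + lam * (M @ x0)
    return np.linalg.solve(lhs, rhs)


if __name__ == "__main__":
    y = np.array([0.0, 1.0, 4.0])      # toy coarse "feature channel"
    x = weighted_tikhonov_upsample(y, 2, np.ones(3), np.ones(6), lam=0.1)
    print(x)
```

Raising an entry of `w_data` pulls the pooled reconstruction toward the corresponding coarse sample, while raising an entry of `w_prior` keeps that fine-grid location closer to the interpolated prior; this is the sense in which per-location weights trade off data fidelity against prior strength.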
