Upsample Anything: A Simple and Hard to Beat Baseline for Feature Upsampling

Minseok Seo; Mark Hamilton; Changick Kim

arXiv:2511.16301 · cs.CV · November 25, 2025

Open Access

TL;DR

Upsample Anything introduces a simple, universal, and highly effective test-time optimization method that restores low-resolution features to high-resolution outputs without retraining, improving pixel-level tasks across various models and modalities.

Contribution

The paper proposes a lightweight, per-image optimization framework that learns an anisotropic Gaussian kernel for feature upsampling and outperforms existing methods without dataset-specific retraining.

Findings

1. Achieves state-of-the-art results in semantic segmentation and depth estimation
2. Runs in approximately 0.419 seconds per 224x224 image
3. Transfers effectively across architectures and modalities

Abstract

We present Upsample Anything, a lightweight test-time optimization (TTO) framework that restores low-resolution features to high-resolution, pixel-wise outputs without any training. Although Vision Foundation Models demonstrate strong generalization across diverse downstream tasks, their representations are typically downsampled by 14x/16x (e.g., ViT), which limits their direct use in pixel-level applications. Existing feature upsampling approaches depend on dataset-specific retraining or heavy implicit optimization, restricting scalability and generalization. Upsample Anything addresses these issues through a simple per-image optimization that learns an anisotropic Gaussian kernel combining spatial and range cues, effectively bridging Gaussian Splatting and Joint Bilateral Upsampling. The learned kernel acts as a universal, edge-aware operator that transfers seamlessly across…
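
To make the idea of a kernel "combining spatial and range cues" concrete, the following is a minimal NumPy sketch of joint bilateral upsampling with a rotated anisotropic Gaussian as the spatial term. It is not the authors' implementation: the function name `anisotropic_jbu` and the parameters `sigma_x`, `sigma_y`, `theta`, `sigma_r`, and `radius` are hypothetical, and the kernel parameters are fixed by hand here, whereas the abstract describes learning them per image at test time. For scale, a 224x224 image downsampled by 14x yields a 16x16 feature grid.

```python
import numpy as np

def anisotropic_jbu(feat_lr, guide_hr, sigma_x=1.0, sigma_y=1.0,
                    theta=0.0, sigma_r=0.1, radius=1):
    """Upsample feat_lr (h, w, c) to the size of guide_hr (H, W, g).

    Illustrative sketch only: one global kernel with hand-set parameters.
    """
    h, w, c = feat_lr.shape
    H, W, _ = guide_hr.shape
    sy, sx = H / h, W / w

    # Low-resolution guidance: sample the guide at each LR cell centre.
    ys = np.clip(((np.arange(h) + 0.5) * sy).astype(int), 0, H - 1)
    xs = np.clip(((np.arange(w) + 0.5) * sx).astype(int), 0, W - 1)
    guide_lr = guide_hr[ys][:, xs]

    # Inverse covariance of a Gaussian with axes (sigma_x, sigma_y)
    # rotated by theta radians.
    ct, st = np.cos(theta), np.sin(theta)
    rot = np.array([[ct, -st], [st, ct]])
    cov_inv = rot @ np.diag([1.0 / sigma_x**2, 1.0 / sigma_y**2]) @ rot.T

    out = np.zeros((H, W, c))
    for Y in range(H):
        for X in range(W):
            # Position of the HR pixel in LR grid coordinates.
            py = (Y + 0.5) / sy - 0.5
            px = (X + 0.5) / sx - 0.5
            cy, cx = int(round(py)), int(round(px))
            qy = np.arange(max(cy - radius, 0), min(cy + radius, h - 1) + 1)
            qx = np.arange(max(cx - radius, 0), min(cx + radius, w - 1) + 1)
            gy, gx = np.meshgrid(qy, qx, indexing="ij")

            # Spatial cue: squared Mahalanobis distance to each LR neighbour.
            d = np.stack([gx - px, gy - py], axis=-1)
            maha = np.einsum("...i,ij,...j->...", d, cov_inv, d)

            # Range cue: squared guidance difference (edge awareness).
            diff = guide_lr[gy, gx] - guide_hr[Y, X]
            rng_d = (diff ** 2).sum(axis=-1) / sigma_r**2

            # Normalise in the log domain so tiny weights cannot underflow.
            logw = -0.5 * (maha + rng_d)
            wgt = np.exp(logw - logw.max())
            wgt /= wgt.sum()
            out[Y, X] = np.einsum("yx,yxc->c", wgt, feat_lr[gy, gx])
    return out
```

Because the weights are non-negative and sum to one, every output feature is a convex combination of nearby low-resolution features; the range term suppresses neighbours whose guidance colour differs from the target pixel, which is what keeps edges sharp. A per-image optimization in the spirit of the abstract would treat the kernel parameters as variables to fit rather than constants.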

Taxonomy

Topics: Advanced Vision and Imaging · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications