Multi-Scale Visual Prompting for Lightweight Small-Image Classification

Salim Khazem

arXiv:2512.03663 · cs.CV · December 4, 2025

Open Access

TL;DR

This paper introduces Multi-Scale Visual Prompting (MSVP), a lightweight, backbone-agnostic module that enhances small-image classification performance by learning multi-scale prompts with minimal computational cost.

Contribution

The paper proposes MSVP, a simple, generic multi-scale prompting module that significantly improves small-image classification across various backbones with negligible overhead.

Findings

01

MSVP improves accuracy on MNIST, Fashion-MNIST, and CIFAR-10.

02

MSVP increases a model's parameter count by less than 0.02%.

03

Multi-scale prompting provides effective inductive bias for low-resolution images.

Abstract

Visual prompting has recently emerged as an efficient strategy to adapt vision models using lightweight, learnable parameters injected into the input space. However, prior work mainly targets large Vision Transformers and high-resolution datasets such as ImageNet. In contrast, small-image benchmarks like MNIST, Fashion-MNIST, and CIFAR-10 remain widely used in education, prototyping, and research, yet have received little attention in the context of prompting. In this paper, we introduce Multi-Scale Visual Prompting (MSVP), a simple and generic module that learns a set of global, mid-scale, and local prompt maps fused with the input image via a lightweight 1×1 convolution. MSVP is backbone-agnostic, adds less than 0.02% parameters, and significantly improves performance across CNN and Vision Transformer backbones. We provide a unified benchmark on MNIST,…
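
The fusion step described in the abstract can be illustrated with a small forward-pass sketch. This is not the author's implementation: the class name `MSVPSketch`, the prompt resolutions (1×1 for global, 4×4 for mid-scale, full resolution for local), nearest-neighbour upsampling, concatenation along the channel axis, and the identity-style initialisation are all assumptions made for illustration. NumPy is used so that the shapes are easy to follow; in practice the prompts and the 1×1 convolution weights would be trained together with, or in front of, the backbone.

```python
import numpy as np


def upsample_nearest(prompt, size):
    """Nearest-neighbour upsampling of a (C, h, w) map to (C, size, size)."""
    _, h, w = prompt.shape
    assert size % h == 0 and size % w == 0, "size must be a multiple of h and w"
    return prompt.repeat(size // h, axis=1).repeat(size // w, axis=2)


class MSVPSketch:
    """Hypothetical multi-scale prompt module (forward pass only)."""

    def __init__(self, channels=3, size=32, mid=4, seed=0):
        rng = np.random.default_rng(seed)
        self.size = size
        # Assumed prompt resolutions: one value per channel (global),
        # a coarse mid x mid grid (mid-scale), and a full-resolution map (local).
        self.global_prompt = 0.01 * rng.standard_normal((channels, 1, 1))
        self.mid_prompt = 0.01 * rng.standard_normal((channels, mid, mid))
        self.local_prompt = 0.01 * rng.standard_normal((channels, size, size))
        # 1x1 convolution weights of shape (out_channels, in_channels) where
        # in_channels = image channels + three prompt maps. Initialised so the
        # module starts as the identity on the image (an assumption).
        self.weight = np.zeros((channels, 4 * channels))
        self.weight[:, :channels] = np.eye(channels)
        self.bias = np.zeros(channels)

    def __call__(self, x):
        """x has shape (N, C, H, W) with H == W == self.size."""
        n = x.shape[0]
        prompts = [
            upsample_nearest(p, self.size)
            for p in (self.global_prompt, self.mid_prompt, self.local_prompt)
        ]
        stacked = np.concatenate(prompts, axis=0)                 # (3C, H, W)
        stacked = np.broadcast_to(stacked, (n,) + stacked.shape)  # (N, 3C, H, W)
        fused_in = np.concatenate([x, stacked], axis=1)           # (N, 4C, H, W)
        # A 1x1 convolution is a per-pixel linear map over the channel axis.
        out = np.einsum("oc,nchw->nohw", self.weight, fused_in)
        return out + self.bias[None, :, None, None]


if __name__ == "__main__":
    images = np.random.default_rng(1).standard_normal((2, 3, 32, 32))
    module = MSVPSketch()
    prompted = module(images)
    print(prompted.shape)  # same shape as the input, ready for any backbone
```

Because the output has the same shape as the input image, a module of this kind can sit in front of a CNN or Vision Transformer without changing the backbone, which is one way to read the "backbone-agnostic" claim. The parameter count of this sketch depends entirely on the assumed prompt sizes and is not meant to reproduce the 0.02% figure.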

Taxonomy

Topics: Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning