Self-Supervised Weight Templates for Scalable Vision Model Initialization

Yucheng Xie; Fu Feng; Ruixiao Shi; Jing Wang; Yong Rui; Xin Geng

arXiv:2601.19694·cs.CV·January 28, 2026

TL;DR

SWEET introduces a self-supervised, modular pre-training framework that learns a shared weight template and size-specific scalers, enabling scalable and flexible initialization of vision models across various architectures and tasks.

Contribution

The paper proposes a Tucker-based factorization that learns a shared weight template together with size-specific weight scalers, supporting flexible model adaptation and width-invariant representations.
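
For orientation, the standard Tucker decomposition of a three-way tensor is written out below. This is the textbook form, not a formula taken from the paper. Reading the core tensor as the shared template and the factor matrices as the size-specific scalers is an assumption made here for illustration, and the symbols are not the authors' notation.

```latex
% Standard Tucker decomposition of a three-way tensor (textbook form).
% Assumption: \mathcal{G} plays the role of the shared template, and
% A, B, C play the role of size-specific scalers.
\mathcal{W} \approx \mathcal{G} \times_1 A \times_2 B \times_3 C,
\qquad
w_{ijk} = \sum_{p=1}^{P} \sum_{q=1}^{Q} \sum_{r=1}^{R} g_{pqr}\, a_{ip}\, b_{jq}\, c_{kr}
```

Under this reading, the core shape $(P, Q, R)$ is fixed, while the row counts of $A$, $B$, and $C$ set the size of the composed tensor.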

Findings

01. Achieves state-of-the-art results on classification, detection, segmentation, and generation tasks.

02. Supports efficient initialization for variable-sized models with minimal data.

03. Enhances cross-width generalization through width-wise stochastic scaling.

Abstract

The increasing scale and complexity of modern model parameters underscore the importance of pre-trained models. However, deployment often demands architectures of varying sizes, exposing limitations of conventional pre-training and fine-tuning. To address this, we propose SWEET, a self-supervised framework that performs constraint-based pre-training to enable scalable initialization in vision tasks. Instead of pre-training a fixed-size model, we learn a shared weight template and size-specific weight scalers under Tucker-based factorization, which promotes modularity and supports flexible adaptation to architectures with varying depths and widths. Target models are subsequently initialized by composing and reweighting the template through lightweight weight scalers, whose parameters can be efficiently learned from minimal training data. To further enhance flexibility in width expansion,…
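
The idea of composing differently sized models from one template can be illustrated with a minimal sketch. It assumes a generic three-way Tucker composition in which a fixed core acts as the template and three small factor matrices (depth, output width, input width) act as the scalers. The function names, the core shape, and the random values are all hypothetical. The abstract does not specify the actual SWEET formulation, so this is not the authors' implementation.

```python
import random


def compose_weights(template, depth_scaler, out_scaler, in_scaler):
    """Tucker-style composition (illustrative assumption, not the paper's code).

    W[l][i][j] = sum over p, q, r of
                 template[p][q][r] * depth_scaler[l][p] * out_scaler[i][q] * in_scaler[j][r]
    """
    P, Q, R = len(template), len(template[0]), len(template[0][0])
    weights = []
    for d_row in depth_scaler:          # one weight matrix per target layer
        layer = []
        for a_row in out_scaler:        # one row per output unit
            row = []
            for b_row in in_scaler:     # one column per input unit
                total = 0.0
                for p in range(P):
                    for q in range(Q):
                        for r in range(R):
                            total += template[p][q][r] * d_row[p] * a_row[q] * b_row[r]
                row.append(total)
            layer.append(row)
        weights.append(layer)
    return weights


def random_matrix(rows, cols, rng):
    """Random stand-in for learned parameters."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def make_scalers(depth, d_out, d_in, core_shape, rng):
    """Hypothetical size-specific scalers: their row counts set the target size."""
    P, Q, R = core_shape
    return (
        random_matrix(depth, P, rng),
        random_matrix(d_out, Q, rng),
        random_matrix(d_in, R, rng),
    )


def shape(w):
    return (len(w), len(w[0]), len(w[0][0]))


rng = random.Random(0)
CORE_SHAPE = (2, 3, 3)  # hypothetical template size, shared by every target model
template = [random_matrix(CORE_SHAPE[1], CORE_SHAPE[2], rng) for _ in range(CORE_SHAPE[0])]

# The same template initializes a small and a large model; only the scalers differ.
small = compose_weights(template, *make_scalers(2, 4, 4, CORE_SHAPE, rng))
large = compose_weights(template, *make_scalers(4, 8, 8, CORE_SHAPE, rng))

print(shape(small), shape(large))  # (2, 4, 4) (4, 8, 8)
```

In this sketch the scalers hold far fewer parameters than the composed weights once the target widths exceed the core dimensions, which is one plausible reading of why the abstract calls them lightweight.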

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning