Projectable Models: One-Shot Generation of Small Specialized Transformers from Large Ones

Andrey Zhmoginov; Jihwan Lee; Mark Sandler

arXiv:2506.05641 · cs.LG · June 9, 2025

Open Access

TL;DR

This paper introduces a method for generating small, task-specific Transformer models from a large foundation model by mapping the large model's parameters to those of a smaller one. The method is evaluated on image modeling tasks.

Contribution

The paper proposes a task-specific parameter transformation that maps the weights of a large Transformer to those of a small, specialized model, with the aim of lowering computational cost and keeping only the knowledge relevant to the task.

Findings

01

Generated models outperform universal conditional models on image modeling tasks.

02

The approach reduces model size while maintaining or improving performance.

03

Making the transformation task-specific lets the small model capture the narrower scope of knowledge that its task requires.

Abstract

Modern Foundation Models (FMs) are typically trained on corpora spanning a wide range of different data modalities, topics and downstream tasks. Utilizing these models can be very computationally expensive and is out of reach for most consumer devices. Furthermore, most of the broad FM knowledge may actually be irrelevant for a specific task at hand. Here we explore a technique for mapping parameters of a large Transformer to parameters of a smaller specialized model. By making this transformation task-specific, we aim to capture a narrower scope of the knowledge needed for performing a specific task by a smaller model. We study our method on image modeling tasks, showing that performance of generated models exceeds that of universal conditional models.
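The idea of mapping a large model's parameters to a smaller, task-specific set can be illustrated with a toy example. The sketch below is not the authors' method: the abstract does not specify the form of the transformation, so the gated linear projection, all names (`base_in`, `base_out`, `gate_weights`, `task_gate`, `project_weight`), and all sizes are assumptions chosen only to show what a one-shot, task-conditioned weight mapping could look like.

```python
import numpy as np

rng = np.random.default_rng(0)
LARGE_DIM, SMALL_DIM, TASK_DIM = 64, 16, 4  # illustrative sizes, not from the paper

# Stand-in for a single weight matrix of the large Transformer.
large_weight = rng.standard_normal((LARGE_DIM, LARGE_DIM))

# Hypothetical parameters of the projection. In practice these would be
# learned; here they are random so the example stays self-contained.
base_in = rng.standard_normal((LARGE_DIM, SMALL_DIM)) / np.sqrt(LARGE_DIM)
base_out = rng.standard_normal((LARGE_DIM, SMALL_DIM)) / np.sqrt(LARGE_DIM)
gate_weights = rng.standard_normal((TASK_DIM, LARGE_DIM))


def task_gate(task_embedding):
    """Map a task embedding to a soft mask over the large model's channels."""
    return 1.0 / (1.0 + np.exp(-(task_embedding @ gate_weights)))


def project_weight(weight, task_embedding):
    """Return a (SMALL_DIM, SMALL_DIM) matrix derived from `weight` for one task."""
    gate = task_gate(task_embedding)        # shape (LARGE_DIM,)
    proj_in = base_in * gate[:, None]       # task-specific input projection
    proj_out = base_out * gate[:, None]     # task-specific output projection
    # A single chain of matrix products: no per-task training loop.
    return proj_in.T @ weight @ proj_out


task_a = np.eye(TASK_DIM)[0]  # one-hot embedding for a hypothetical task A
task_b = np.eye(TASK_DIM)[1]  # one-hot embedding for a hypothetical task B
small_weight_a = project_weight(large_weight, task_a)
small_weight_b = project_weight(large_weight, task_b)
print(small_weight_a.shape)
```

Under these assumptions, the same large weight matrix yields a different small matrix for each task embedding, which is the sense in which the mapping is task-specific.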

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning