Projectable Models: One-Shot Generation of Small Specialized Transformers from Large Ones
Andrey Zhmoginov, Jihwan Lee, Mark Sandler

TL;DR
This paper introduces a method for mapping the parameters of a large foundation Transformer to those of a small, task-specific model in one shot. On image modeling tasks, the generated models outperform universal conditional models.
Contribution
The paper proposes a task-specific parameter transformation that maps the weights of a large Transformer to those of a smaller specialized model, reducing computational cost and retaining only the knowledge relevant to the target task.
Findings
Generated models outperform universal conditional models on image modeling tasks.
The approach reduces model size while maintaining or improving performance.
Task-specific models capture relevant knowledge more effectively.
Abstract
Modern Foundation Models (FMs) are typically trained on corpora spanning a wide range of different data modalities, topics and downstream tasks. Utilizing these models can be very computationally expensive and is out of reach for most consumer devices. Furthermore, most of the broad FM knowledge may actually be irrelevant for a specific task at hand. Here we explore a technique for mapping parameters of a large Transformer to parameters of a smaller specialized model. By making this transformation task-specific, we aim to capture a narrower scope of the knowledge needed for performing a specific task by a smaller model. We study our method on image modeling tasks, showing that performance of generated models exceeds that of universal conditional models.
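The abstract does not spell out the form of the parameter mapping, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes, purely for illustration, that a large weight matrix is shrunk by a task-dependent linear projection, W_small = P(t) W_large P(t)^T, where P(t) is mixed from a small bank of basis projections using a task embedding t. The names `task_projection`, `project_weight`, `bases`, and all dimensions are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def task_projection(task_emb, bases):
    """Hypothetical: mix a bank of (d x D) basis projections with task coefficients."""
    d, D = len(bases[0]), len(bases[0][0])
    return [
        [sum(c * basis[i][j] for c, basis in zip(task_emb, bases)) for j in range(D)]
        for i in range(d)
    ]


def project_weight(w_large, p):
    """Assumed mapping: shrink a (D x D) weight to (d x d) via P W P^T."""
    return matmul(matmul(p, w_large), transpose(p))


rng = random.Random(0)
D, d, n_basis = 8, 3, 2  # illustrative sizes only

# Stand-ins for one layer of the large model and the learned projection bank.
W_large = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(D)]
bases = [[[rng.gauss(0, 1) for _ in range(D)] for _ in range(d)] for _ in range(n_basis)]

# Two different task embeddings yield two different small weight matrices.
W_task_a = project_weight(W_large, task_projection([1.0, 0.0], bases))
W_task_b = project_weight(W_large, task_projection([0.0, 1.0], bases))
```

In this sketch the small model for a new task is produced by a single forward computation over the large weights, which is one way to read "one-shot generation"; in practice the projection would be learned rather than sampled at random as it is here.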
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
