Learngene Search Across Multiple Datasets for Building Variable-Sized Models

Boyu Shi; Junbo Zhou; Chang Liu; Xu Yang; Qiufeng Wang; Xin Geng

arXiv:2605.08209·cs.LG·May 12, 2026

TL;DR

This paper introduces LSAMD, a method that searches across multiple datasets to extract transferable learngenes from a super model; the learngenes are then used to initialize variable-sized models at reduced storage and training cost.

Contribution

The paper proposes LSAMD, which extends learngene extraction from a single dataset to a search across multiple datasets, improving the performance and efficiency of variable-sized models.

Findings

01

LSAMD achieves performance comparable to traditional pretrain-finetune methods.

02

It significantly reduces storage and training costs.

03

The method effectively extracts dataset-specific learngenes from a super model.

Abstract

Deep learning methods are widely used under diverse resource constraints, resulting in models of varying sizes, such as the Vision Transformer (ViT) series. Deploying these models typically requires costly pretraining and finetuning. The Learngene paradigm addresses this issue by extracting transferable components, called learngenes, from a pretrained ancestry model (Ans-Net) to initialize variable-sized descendant models (Des-Nets). Existing learngene extraction methods rely on a single dataset, which limits downstream performance. To address this limitation, we propose Learngene Search Across Multiple Datasets for Building Variable-Sized Models (LSAMD). LSAMD expands the Ans-Net into a searchable super Ans-Net with dataset-specific blocks and dataset adapters (DADs). During training, LSAMD searches for an optimal architecture path for each dataset. The base blocks most frequently selected…
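
The abstract is cut off before it says what is done with the most frequently selected base blocks, so the following is only a minimal sketch of the tallying step it mentions: given one searched path per dataset, count how many datasets selected each block. The dataset names, block identifiers, and the helper `block_selection_counts` are hypothetical and do not come from the paper; the actual LSAMD search and selection rule may differ.

```python
from collections import Counter


def block_selection_counts(paths):
    """Count how many dataset paths include each block id.

    `paths` maps a dataset name to the list of block ids on the
    path searched for that dataset (a hypothetical representation).
    """
    counts = Counter()
    for chosen in paths.values():
        # Count a block at most once per dataset, even if it repeats in a path.
        counts.update(set(chosen))
    return counts


# Hypothetical searched paths: dataset name -> block ids chosen for it.
paths = {
    "dataset_a": ["base_0", "base_1", "a_specific_0"],
    "dataset_b": ["base_0", "b_specific_0", "base_2"],
    "dataset_c": ["base_0", "base_1", "c_specific_0"],
}

counts = block_selection_counts(paths)

# Rank blocks by selection frequency, breaking ties by block id.
ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

for block_id, n in ranked:
    print(f"{block_id}: selected by {n} of {len(paths)} datasets")
```

In this toy input, a block shared by every dataset path rises to the top of the ranking, while dataset-specific blocks are each selected only once.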
