Learngene: Inheriting Condensed Knowledge from the Ancestry Model to Descendant Models
Qiufeng Wang, Xu Yang, Shuxia Lin, Jing Wang, Xin Geng

TL;DR
Learngene is a machine learning paradigm inspired by biological genes: knowledge accumulated by an ancestry model is condensed into a compact piece that descendant models inherit, improving their learning efficiency and performance.
Contribution
The paper introduces the concept of Learngene, a method for condensing and inheriting knowledge from an ancestor model to enhance descendant models' adaptation and learning speed.
Findings
Descendant models converge faster with Learngene.
Learngene reduces sensitivity to hyperparameters.
Models with Learngene perform better and need fewer training samples.
Abstract
During the continuous evolution of one organism's ancestry, its genes accumulate extensive experiences and knowledge, enabling newborn descendants to rapidly adapt to their specific environments. Motivated by this observation, we propose a novel machine learning paradigm Learngene to enable learning models to incorporate three key characteristics of genes. (i) Accumulating: the knowledge is accumulated during the continuous learning of an ancestry model. (ii) Condensing: the extensive accumulated knowledge is condensed into a much more compact information piece, i.e., learngene. (iii) Inheriting: the condensed learngene is inherited to make it easier for descendant models to adapt to new environments. Since accumulating has been studied in well-established paradigms like large-scale pre-training and lifelong learning, we focus on condensing and inheriting, which induces three key issues…
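The condensing and inheriting steps can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: the learngene is treated here as a chosen subset of the ancestry model's layers, and the parts a descendant does not inherit are randomly initialized. The function names `condense` and `inherit`, the layer names, and all values are hypothetical, and the sketch does not show how the layers to keep are chosen.

```python
import random

def condense(ancestor, keep_layers):
    """Copy only the chosen layers out of the ancestry model's parameters.

    Assumption: the learngene is a subset of layers. The selection rule
    for keep_layers is not modeled here.
    """
    return {name: list(ancestor[name]) for name in keep_layers}

def inherit(learngene, layer_sizes, seed=0):
    """Build a descendant: inherited layers are copied, the rest are random."""
    rng = random.Random(seed)
    descendant = {}
    for name, size in layer_sizes.items():
        if name in learngene:
            # Inherited: start from the condensed ancestral parameters.
            descendant[name] = list(learngene[name])
        else:
            # Not inherited: small random initialization (an assumption).
            descendant[name] = [rng.uniform(-0.1, 0.1) for _ in range(size)]
    return descendant

# Hypothetical ancestry model: layer name -> flat list of parameters.
ancestor = {
    "conv1": [0.5, -0.2],
    "conv2": [0.1, 0.3],
    "head": [0.9, -0.7, 0.4],
}

learngene = condense(ancestor, ["conv2"])
# The descendant targets a new task, so its head has a different size.
descendant = inherit(learngene, {"conv1": 2, "conv2": 2, "head": 5})
```

In this sketch the descendant would then be trained on its own task, starting from the inherited `conv2` parameters rather than from a fully random initialization.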
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Neural Networks and Applications · Machine Learning and Data Classification
Methods: Attention Is All You Need · Adam · Linear Layer · Label Smoothing · Dropout · Byte Pair Encoding · Multi-Head Attention · Position-Wise Feed-Forward Layer · Residual Connection · Dense Connections
