Learngene: Inheriting Condensed Knowledge from the Ancestry Model to Descendant Models

Qiufeng Wang; Xu Yang; Shuxia Lin; Jing Wang; Xin Geng

arXiv:2305.02279 · cs.LG · June 30, 2023 · 2 cites

TL;DR

Learngene is a machine learning paradigm inspired by biological genes: knowledge accumulated by an ancestry model is condensed into a compact piece, the learngene, which descendant models inherit to learn more efficiently and perform better.

Contribution

The paper introduces the concept of Learngene, a method for condensing knowledge from an ancestry model and passing it on to descendant models so that they adapt and learn faster.

Findings

01 Descendant models converge faster with Learngene.

02 Learngene reduces sensitivity to hyperparameters.

03 Models with Learngene perform better and need fewer training samples.

Abstract

During the continuous evolution of one organism's ancestry, its genes accumulate extensive experiences and knowledge, enabling newborn descendants to rapidly adapt to their specific environments. Motivated by this observation, we propose a novel machine learning paradigm Learngene to enable learning models to incorporate three key characteristics of genes. (i) Accumulating: the knowledge is accumulated during the continuous learning of an ancestry model. (ii) Condensing: the extensive accumulated knowledge is condensed into a much more compact information piece, i.e., learngene. (iii) Inheriting: the condensed learngene is inherited to make it easier for descendant models to adapt to new environments. Since accumulating has been studied in well-established paradigms like large-scale pre-training and lifelong learning, we focus on condensing and inheriting, which induces three key issues…
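The condensing and inheriting steps can be pictured with a toy sketch. This is not the paper's method: the abstract is truncated before it explains how the learngene is chosen, so the selection rule here (keep a hand-picked subset of layers) is purely an assumption, as are the helper names `make_model`, `condense`, and `inherit` and the representation of a model as a dict of weight lists.

```python
import random


def make_model(layer_names, width, seed):
    """Build a toy 'model': a dict mapping layer name -> list of random weights."""
    rng = random.Random(seed)
    return {name: [rng.gauss(0.0, 1.0) for _ in range(width)] for name in layer_names}


def condense(ancestor, keep):
    """Hypothetical condensing step: copy only the named layers as the learngene.

    Which layers to keep is an assumption; the source does not specify it.
    """
    return {name: list(ancestor[name]) for name in keep}


def inherit(learngene, layer_names, width, seed):
    """Build a descendant: random init everywhere, then overwrite inherited layers."""
    descendant = make_model(layer_names, width, seed)
    for name, weights in learngene.items():
        descendant[name] = list(weights)
    return descendant


layers = ["layer1", "layer2", "layer3", "layer4"]

# Stands in for an ancestry model after the accumulating phase.
ancestor = make_model(layers, width=8, seed=0)

# Arbitrary illustrative choice of layers to condense into the learngene.
learngene = condense(ancestor, keep=["layer3", "layer4"])

# The descendant starts from a different random seed but inherits the learngene.
descendant = inherit(learngene, layers, width=8, seed=1)

ancestor_size = sum(len(w) for w in ancestor.values())
learngene_size = sum(len(w) for w in learngene.values())
print(f"learngene holds {learngene_size} of {ancestor_size} ancestor weights")
```

In this sketch the descendant would then be trained on its own task as usual; only its starting point differs from a fully random initialization.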

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Neural Networks and Applications · Machine Learning and Data Classification

Methods: Attention Is All You Need · Adam · Linear Layer · Label Smoothing · Dropout · Byte Pair Encoding · Multi-Head Attention · Position-Wise Feed-Forward Layer · Residual Connection · Dense Connections